Python Performance#
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
Yet we should not pass up our opportunities in that critical 3%.
Timeit#
Before optimizing code for performance, one must measure the performance. Generally
this should be done with a complete example of performance-critical code so that the
real bottlenecks can be found. However, sometimes, it is useful to see how long various
tasks take – especially ones that might be used frequently. IPython provides a nice
little tool for this spelled %timeit.
Here is a little example comparing the cost of different ways to check if a class has a particular attribute.
class A:
"""This class does not have the attributes, or they are False."""
a = False
def __init__(self):
self.b = False
@property
def c(self):
return False
class B:
"""This class has the attributes, or they are True."""
a = True
def __init__(self):
self.b = True
@property
def c(self):
return True
@property
def d(self):
pass
a, b = A(), B()
print("Instance variable")
%timeit bool(a.b)
%timeit bool(b.b)
print("\nClass variable")
%timeit bool(a.a)
%timeit bool(b.a)
print("\nhasattr")
%timeit hasattr(a, 'd')
%timeit hasattr(b, 'd')
print("\nproperty")
%timeit bool(a.c)
%timeit bool(b.c)
Instance variable
22.8 ns ± 0.0798 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
22.5 ns ± 0.186 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
Class variable
22.7 ns ± 0.0376 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
22.4 ns ± 0.0374 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
hasattr
29 ns ± 0.152 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
49.7 ns ± 0.15 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
property
32.7 ns ± 0.0965 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
32.7 ns ± 0.221 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
At the time of running, this indicates that property access is the slowest (50ns /
access) and is about 2.7 times slower than instance variable lookup. Class-variable
access is slower because it first has to fail to find the attribute in the instance.
I am not sure why hasattr access is significantly slower when the attribute exists!
In any case, all of these happen fast enough not to worry unless it is in the core of a
highly repeated loop. Don’t bother optimizing your code for this unless it is really
part of that 3% bottleneck.
Project: estimate the errors in timeit.
A fun problem is to determine how many times to repeat the measurement of code performance – i.e. to determine the uncertainty in the measurement. The %timeit magic simply uses the standard deviation of a certain number of repeated measurements. Can you do better? Some things to consider:
What is the source of the noise? Usually this is the computer doing other things while measuring the performance. If this is the case, then you might want to know the minimum of the measurements rather than the mean. An interesting related problem is to characterize the error of the minimum of set of \(N\) measurements. I have tried playing with this, but without a huge amount of success.
On the other hand, you will typically be executing your code while other processes are running. Depending on how your code interacts with these other processes, perhaps an average is more appropriate.
A great solution would measure as many times as needed to realize a certain tolerance in the error estimate (bailing out if such a tolerance goal could not be achieved.)