---
jupytext:
  formats: ipynb,md:myst
  text_representation:
    extension: .md
    format_name: myst
    format_version: 0.13
    jupytext_version: 1.13.6
kernelspec:
  display_name: Python 3 (ipykernel)
  language: python
  name: python3
---

```{code-cell}
:tags: [hide-cell]

import mmf_setup
mmf_setup.nbinit()
import logging
logging.getLogger('matplotlib').setLevel(logging.CRITICAL)
%matplotlib inline
import numpy as np, matplotlib.pyplot as plt
```

(sec:PythonPerformance)=
Python Performance
==================


> We *should* forget about small efficiencies, say about 97% of the time: premature
> optimization is the root of all evil.
>
> Yet we should not pass up our opportunities in that critical 3%.
>
> -- [Donald Knuth](https://doi.org/10.1145/356635.356640)

## Timeit

Before optimizing code for performance, one must measure the performance.  Generally
this should be done with a complete example of performance-critical code so that the
real bottlenecks can be found.  Sometimes, however, it is useful to see how long
individual tasks take -- especially ones that might be used frequently.  IPython
provides a nice little tool for this: the `%timeit` magic.
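
Outside of IPython, the standard-library `timeit` module can be used for the same kind
of measurement.  The following is a minimal sketch, not a replacement for `%timeit`:
the helper name `time_stmt` and its default arguments are assumptions made for
illustration only.

```python
import timeit


def time_stmt(stmt, setup="pass", repeat=5, number=100_000):
    """Return the best per-call time in seconds for `stmt`.

    `time_stmt` is a hypothetical helper, not part of IPython or `timeit`.
    """
    # timeit.repeat returns one total time per repetition; each total
    # covers `number` executions of `stmt`.
    totals = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    return min(totals) / number


# Time a class-attribute lookup on a throwaway class defined in `setup`.
t = time_stmt("x.b", setup="class X:\n    b = False\nx = X()")
print(f"{t * 1e9:.1f} ns per attribute access")
```

Taking the minimum of the repetitions is one common convention; `%timeit` reports a
mean and standard deviation instead.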

Here is a little example comparing the cost of different ways to check if a class has a
particular attribute.

```{code-cell}

class A:
    """This class does not have the attributes, or they are False."""
    a = False
    
    def __init__(self):
        self.b = False
    
    @property
    def c(self):
        return False

class B:
    """This class has the attributes, or they are True."""
    a = True
    
    def __init__(self):
        self.b = True
        
    @property
    def c(self):
        return True

    @property
    def d(self):
        pass

a, b = A(), B()
print("Instance variable")
%timeit bool(a.b)
%timeit bool(b.b)

print("\nClass variable")
%timeit bool(a.a)
%timeit bool(b.a)

print("\nhasattr")
%timeit hasattr(a, 'd')
%timeit hasattr(b, 'd')

print("\nproperty")
%timeit bool(a.c)
%timeit bool(b.c)
```

At the time of running, this indicates that property access is the slowest (about 50 ns
per access), roughly 2.7 times slower than instance-variable lookup.  Class-variable
access is slower than instance-variable access because the lookup first has to fail to
find the attribute in the instance.
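The `hasattr` check is significantly slower when the attribute exists, most likely
because `d` is a property: `hasattr` works by calling `getattr`, so the getter function
runs on every successful check.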
In any case, all of these are fast enough not to worry about unless they sit in the core
of a highly repeated loop.  Don't bother optimizing your code for this unless it is
really part of that 3% bottleneck.
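
One likely contributor to the `hasattr` timing is that `hasattr` is implemented by
calling `getattr` and checking for an `AttributeError`, so a successful check on a
property executes the getter.  The sketch below illustrates this with a hypothetical
class `Counted` (not part of the example above) that counts getter calls.

```python
class Counted:
    """Hypothetical class that counts how often its property getter runs."""

    calls = 0

    @property
    def d(self):
        # Record each invocation on the class so it is easy to inspect.
        type(self).calls += 1
        return None


obj = Counted()
hasattr(obj, "d")        # attribute exists, so the getter is executed
hasattr(obj, "missing")  # lookup fails; no getter is involved
print(Counted.calls)     # number of getter invocations so far
```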

:::{admonition} Project: estimate the errors in `timeit`.
A fun problem is to determine how many times to repeat the measurement of code
performance -- i.e. to determine the uncertainty in the measurement.  The
[%timeit][] magic simply uses the standard deviation of a certain number
of repeated measurements.  Can you do better?  Some things to consider:
* What is the source of the noise?  Usually this is the computer doing other things
  while measuring the performance.  If this is the case, then you might want to know the
  minimum of the measurements rather than the mean.  An interesting related problem is
  to characterize the error of the minimum of a set of $N$ measurements.  I have tried
  playing with this, but without a huge amount of success.
* On the other hand, you will typically be executing your code while other processes
  are running.  Depending on how your code interacts with these other processes, perhaps
  an average is more appropriate.

A great solution would measure as many times as needed to realize a certain tolerance in
the error estimate (bailing out if such a tolerance goal could not be achieved).

[%timeit]: <https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-timeit>
:::
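
As a possible starting point for this project, here is a rough sketch of a timer that
keeps sampling until the standard error of the mean falls below a relative tolerance,
and gives up after a maximum number of samples.  The function `adaptive_timeit`, its
parameters, and its defaults are all assumptions made for illustration.  It also
assumes independent, roughly symmetric noise, which may well be the wrong model here.

```python
import statistics
import timeit


def adaptive_timeit(stmt, setup="pass", number=1000, rel_tol=0.05,
                    min_repeats=5, max_repeats=200):
    """Sample the run time of `stmt` until the error estimate is small enough.

    Hypothetical helper.  Returns (minimum, mean, standard_error, converged),
    with all times in seconds per execution of `stmt`.
    """
    # At least two samples are needed to compute a standard deviation.
    min_repeats = max(min_repeats, 2)
    max_repeats = max(max_repeats, min_repeats)

    timer = timeit.Timer(stmt, setup=setup)
    samples = []
    while len(samples) < max_repeats:
        # Each sample is the average time over `number` executions.
        samples.append(timer.timeit(number) / number)
        if len(samples) >= min_repeats:
            mean = statistics.mean(samples)
            sem = statistics.stdev(samples) / len(samples) ** 0.5
            if sem <= rel_tol * mean:
                return min(samples), mean, sem, True

    # Tolerance goal not reached: bail out with the best available estimate.
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / len(samples) ** 0.5
    return min(samples), mean, sem, False


t_min, t_mean, t_sem, converged = adaptive_timeit("sum(range(100))")
print(f"min={t_min:.3g}s mean={t_mean:.3g}s +/- {t_sem:.3g}s converged={converged}")
```

Returning both the minimum and the mean leaves open the question raised above of which
statistic better characterizes the code being measured.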
