Project Ideas#

Here are some ideas for using the material taught in this course to achieve interesting results.

Improve solve_ivp#

Difficulty: Medium

There are issues with solve_ivp, especially with the adaptive refinement (see the issues below). One could try to improve this in several ways. The low-hanging fruit would be to develop some tests that help to understand these issues.
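As a starting point for such tests, here is a minimal sketch that records how the global error and the number of function evaluations respond to the requested tolerance. The test problem \(y' = -y\), the helper name decay_errors, and the choice of atol are assumptions made for illustration; they are not taken from the issues referred to above.

```python
import numpy as np
from scipy.integrate import solve_ivp


def decay_errors(rtols=(1e-3, 1e-6, 1e-9), t_final=10.0):
    """Solve y' = -y, y(0) = 1 and report (rtol, max error, nfev) for each rtol.

    Hypothetical helper: the test problem and atol = 1e-3 * rtol are
    illustrative choices, not part of any existing test suite.
    """
    results = []
    for rtol in rtols:
        sol = solve_ivp(
            lambda t, y: -y,      # right-hand side of the ODE
            (0.0, t_final),       # integration interval
            [1.0],                # initial condition
            rtol=rtol,
            atol=1e-3 * rtol,
        )
        exact = np.exp(-sol.t)    # exact solution at the solver's own time steps
        max_err = float(np.max(np.abs(sol.y[0] - exact)))
        results.append((rtol, max_err, sol.nfev))
    return results


for rtol, max_err, nfev in decay_errors():
    print(f"rtol={rtol:.0e}  max error={max_err:.2e}  nfev={nfev}")
```

Comparing the reported error with the requested rtol is one simple way to see whether the adaptive stepping delivers the accuracy one might naively expect.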

import math
import numpy as np
%timeit math.sin(1.2)
%timeit np.sin(1.2)

20.7 ns ± 0.0976 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
75.2 ns ± 0.27 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

xs = np.linspace(0, 1, 1000)
%timeit [math.sin(x) for x in xs]
%timeit np.sin(xs)

44 μs ± 151 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
4.77 μs ± 43.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Improved timeit#

Difficulty: Medium

The timeit module provides code for measuring the performance of a chunk of code. To improve the reliability of the results, it runs your code a specified number of times. The IPython %timeit magic improves on this by providing some simple machinery to estimate how large the number parameter should be, balancing accuracy against runtime. It also reports the mean and standard deviation of repeated measurements to give an idea of the variability.
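The behaviour described above can be approximated with the standard-library timeit module alone. The following is a minimal sketch, not IPython's implementation: the helper name measure and the dictionary it returns are assumptions for illustration, while Timer.autorange and Timer.repeat are part of the standard library.

```python
import statistics
import timeit


def measure(stmt, setup="pass", repeat=7):
    """Time stmt and return per-loop statistics in seconds.

    Hypothetical helper: autorange() picks a loop count so that one
    measurement takes at least 0.2 s; repeat() then collects `repeat`
    such measurements.
    """
    timer = timeit.Timer(stmt, setup=setup)
    number, _ = timer.autorange()
    totals = timer.repeat(repeat=repeat, number=number)
    per_loop = [total / number for total in totals]
    return {
        "number": number,
        "min": min(per_loop),
        "mean": statistics.mean(per_loop),
        "stdev": statistics.stdev(per_loop),
    }


if __name__ == "__main__":
    stats = measure("math.sin(1.2)", setup="import math")
    print(stats)
```

Returning both the minimum and the mean makes it easy to compare the two summaries on the same set of measurements.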

The project is to write an improved version that determines how many repeats are needed to attain a certain uncertainty with a desired confidence level. Additionally, consider whether the mean is the appropriate measure: in many contexts one would prefer the minimum.

To make this problem definite: suppose that the measured runtime is a random variable \(T\) drawn from some distribution \(\rho(T)\).

  1. If this supposition is true, then how will the minimum of \(N\) samples (repeated measurements) be distributed?

  2. As you collect data, you can construct the empirical distribution function (EDF). How can you use this to determine when you have obtained enough samples to provide a reliable estimate of the minimum within a specified confidence interval?

  3. Use MCMC to validate your analytic results, then test your procedure. Is the assumption that \(T\) can be described by a fixed distribution valid?

  4. If not, can you make an improved model?
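As a hint for the first two questions: if the \(N\) samples are independent draws with cumulative distribution \(F(t) = \int_{-\infty}^{t} \rho(T)\,dT\), then the minimum satisfies \(P(T_{\min} \le t) = 1 - (1 - F(t))^N\). The sketch below checks this against a simple Monte Carlo simulation and shows one possible stopping rule. The shifted-exponential runtime model, all function names, and the quantile-based criterion are assumptions made for illustration, not a prescribed solution.

```python
import math
import random


def min_cdf(single_cdf_value, n):
    """CDF of the minimum of n i.i.d. samples, given the one-sample CDF F(t)."""
    return 1.0 - (1.0 - single_cdf_value) ** n


def samples_needed(p, alpha):
    """Smallest N for which the minimum of N i.i.d. samples lies below the
    p-quantile of the distribution with probability at least 1 - alpha."""
    return math.ceil(math.log(alpha) / math.log(1.0 - p))


def shifted_exp_cdf(t, t0=1.0, scale=0.1):
    """Assumed toy model: runtime = t0 + exponential noise with mean `scale`."""
    return 0.0 if t < t0 else 1.0 - math.exp(-(t - t0) / scale)


def simulate_min_cdf(t, n, t0=1.0, scale=0.1, trials=20000, seed=0):
    """Monte Carlo estimate of P(min of n samples <= t) under the toy model."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        smallest = min(t0 + rng.expovariate(1.0 / scale) for _ in range(n))
        hits += smallest <= t
    return hits / trials


t, n = 1.02, 5
print("analytic :", min_cdf(shifted_exp_cdf(t), n))
print("simulated:", simulate_min_cdf(t, n))
print("N for p=0.05, alpha=0.05:", samples_needed(0.05, 0.05))
```

The stopping rule here is distribution-free but relies on independence, which is exactly the assumption that question 3 asks you to scrutinize.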

Support Multiple Kernels in MyST-NB or MyST-MD#

Difficulty: Hard

I would like to be able to compare code run in Python, C++, Julia, etc. in a single MyST Markdown document. For example, I would like to verify that the three sets of example code on the course homepage work by actually running them. This requires having the ability to run Python, Octave/Matlab, and C++ in the same document.

Under the hood, I believe that the documents are executed as Jupyter Notebooks, which seems to imply that we might need a way to support multiple kernels in a single notebook. I do not think this functionality exists in general, but see Script of Scripts (SoS).

Parts of this project:

  1. Figure out how this might be done technically.

  2. Converge on a good syntax for this: please engage various members of the community including CoCalc, SoS, and Executable Books to see if a unified syntax can be used. See executablebooks discussion #1137 for contacts to reach out to.

  3. Implement something that works for us.


Testing Notebook Cells#

Difficulty: Medium

I would like to be able to write doctests or similar in my notebooks/markdown files and have them execute.
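One possible starting point is the standard-library doctest module, which can parse interactive examples out of arbitrary text. The sketch below is a minimal illustration under stated assumptions: the helper names markdown_from_notebook and run_doctests are hypothetical, the notebook is assumed to be an already-parsed nbformat 4 JSON dictionary, and only markdown cells are scanned.

```python
import doctest


def markdown_from_notebook(nb):
    """Join the markdown cells of a notebook given as a parsed JSON dict."""
    parts = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "markdown":
            source = cell.get("source", "")
            # The source is stored either as one string or as a list of lines.
            parts.append("".join(source) if isinstance(source, list) else source)
    return "\n\n".join(parts)


def run_doctests(text, name="document"):
    """Run every >>> example found in text and return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(text, {}, name, name, 0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted


# Hypothetical notebook content used only to exercise the helpers.
notebook = {
    "cells": [
        {"cell_type": "markdown",
         "source": ["Addition works:\n", "\n", "    >>> 1 + 1\n", "    2\n"]},
        {"cell_type": "code", "source": "x = 1"},
        {"cell_type": "markdown",
         "source": "State carries over:\n\n    >>> x = 3\n    >>> x * 2\n    6\n"},
    ]
}
print(run_doctests(markdown_from_notebook(notebook)))
```

Note that doctest treats every non-blank line after an example as expected output, so an example placed directly before a closing code fence would need a blank line in between, or a custom parser.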
