Physics 581: Compute Anything –
Effective Scientific Computing#
Who: Graduate students or upper-level undergraduates from across the university with a strong preparation in mathematics (strong calculus and linear algebra with exposure to differential equations). Programming experience would be helpful, but we will provide support for those with weaker backgrounds.
When: MW(F) 2-4pm: Fridays are part of the iSciMath Coffee Hours and will be for working on problems, discussions, etc.
Imagine designing a rocket: You need to minimize weight, but maintain strength. Do you need to start a million-dollar program to model structures, or can you use computer modeling to help? What will get you to space fastest, and keep you in one piece? Compute Anything will lower the barrier for computing, allowing you to quickly explore problems as both a programmer and a scientist. We will show you that programming can be fun, dynamic, and interactive, accelerating your research and your career.
If you have any questions, please contact Michael McNeil Forbes m.forbes@wsu.edu.
FAQ#
Q: What language are we using?
A: Python with the NumPy, SciPy, and Matplotlib packages. However, if you have experience with another language, we encourage multilinguality.
Why Python?
Python has many advantages: It is well designed, in high demand at labs and in industry, and has excellent library support and a strong and supportive community. It can be slow, but there are many opportunities for optimization, and with these it is suitable for many demanding applications. MATLAB is a serious contender, but can be expensive and the language itself is not as well designed as Python. For high-performance code, Julia is a good option, as are Fortran and C/C++, but the latter are not as useful for quickly exploring.
Q: I’ve already taken a programming course. Why should I take this one?
A: This course will teach you much more than just programming. We will teach you programming best practices like version control and documentation for reproducibility, and testing to ensure correctness – techniques you can directly use to improve your research. By the end of the course, you will have an online programming portfolio, the skills to write a paper or thesis with publication-quality figures, and the knowledge to reliably and reproducibly explore data.
Q: What will class be like?
A: We will typically start with a brief lecture establishing the mathematical underpinning of a technique. Then we will quickly implement simple examples together to ensure that things work as expected and develop intuition. Once these basics are established we will explore simple problems, then try to break our code with edge-cases as we write tests. Later in the course, we will revisit and use our validated code to explore more complex problems, profiling and optimizing as needed for performance, or targeting our code for high-performance computing (HPC) clusters.
Q: Do I need to be an experienced programmer?
A: While programming experience will help, we will provide support for those with limited programming experience to help you get up to speed. If you are concerned, please reach out for some introductory materials to help you get started.
Q: I don’t need any more courses. Why should I take this class?
A: This class is aimed at improving your productivity. If your research is well under way, then you are encouraged to bring a relevant project to the class, and will receive credit for completing a research-relevant programming project demonstrating the principles of pragmatic programming: i.e., a version-controlled repository of well-documented and well-tested code aimed at helping your research, with a skeleton paper or thesis meeting all the technical and formatting requirements of a relevant journal or the graduate school.
For example, in addition to the technical Learning Outcomes at the heart of scientific computing, we will emphasize the following concepts:
Language: At a basic level, we must be able to tell computers what we want them to do, but this is only one function of programming languages: they also enable you to think differently. I thus strongly encourage you to become multilingual, especially learning languages from different programming paradigms.
Real Examples: Vectorization
Language fundamentally influences how we think[1], and programming languages help me think about scientific problems. For a concrete example, MATLAB and NumPy (in Python) induced me to think about algorithms in terms of vectorization, working with arrays as a whole rather than worrying about components and for loops. This is the programming analog of index-free notation in mathematics (another “language”) for describing geometry and tensors.
Consider the following mathematical expressions for matrix multiplication and their equivalent representation in Python (using NumPy):
\[\begin{gather*} C_{ml} = \sum_{n=0}^{N-1} A_{mn} B_{nl} \end{gather*}\]
# Python (NumPy)
import numpy as np

A = ...
B = ...
M, N = A.shape
_, L = B.shape
C = np.zeros((M, L))
for m in range(M):
    for l in range(L):
        for n in range(N):
            C[m, l] += A[m, n] * B[n, l]
% MATLAB
A = ...
B = ...
[M, N] = size(A);
[~, L] = size(B);
C = zeros(M, L);
for m = 1:M
    for l = 1:L
        for n = 1:N
            % MATLAB has no += operator, so accumulate explicitly.
            C(m, l) = C(m, l) + A(m, n) * B(n, l);
        end
    end
end
// C++
#include <iostream>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

Matrix A...;
Matrix B...;
size_t M = A.size();
size_t N = A[0].size();
size_t L = B[0].size();
Matrix C(M, std::vector<double>(L, 0));
for (size_t m = 0; m < M; ++m) {
    for (size_t l = 0; l < L; ++l) {
        for (size_t n = 0; n < N; ++n) {
            C[m][l] += A[m][n] * B[n][l];
        }
    }
}
\[\begin{gather*} C^{m}{}_{l} = A^{m}{}_{n}B^{n}{}_{l} \end{gather*}\]
Here we simplify the notation using the Einstein summation convention where repeated indices are implicitly summed over (called contraction). In more complicated geometries (e.g., curved space-time), the vertical placement of indices is important: contractions are only valid between one upper and one lower index.
# Python (NumPy)
import numpy as np

C = np.einsum('mn,nl->ml', A, B)
\[\begin{gather*} \mat{C} = \mat{A}\mat{B} \end{gather*}\]
# Python (NumPy)
C = A @ B

% MATLAB
C = A * B

// C++ using the Armadillo library:
// see https://en.wikipedia.org/wiki/Armadillo_(C++_library)
#include <armadillo>

arma::mat A = ...;
arma::mat B = ...;
arma::mat C = A*B;
Which do you like better? Index-free notations are clearly best for matrix multiplication, but with higher-dimensional tensors (more than 2 indices), the index notation and numpy.einsum() can be very helpful. Likewise, if you need to do tricks like skipping through non-contiguous indices, or working with heterogeneous structures like trees and graphs, then explicit loops might be the only feasible option (but see Awkward Array).
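As a quick check of the comparison above, the following minimal sketch confirms that the three Python forms agree on small random matrices, and shows a contraction over two indices where index notation is arguably clearer. The array shapes, the seed, and the names `C_loop`, `C_einsum`, `C_matmul`, `T`, `S`, and `D_*` are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
M, N, L = 3, 4, 5
A = rng.normal(size=(M, N))
B = rng.normal(size=(N, L))

# Explicit loops over every index.
C_loop = np.zeros((M, L))
for m in range(M):
    for l in range(L):
        for n in range(N):
            C_loop[m, l] += A[m, n] * B[n, l]

# Index notation (einsum) and index-free notation (@).
C_einsum = np.einsum("mn,nl->ml", A, B)
C_matmul = A @ B
print(np.allclose(C_loop, C_einsum), np.allclose(C_loop, C_matmul))

# A contraction over two indices at once: D_{ij} = sum_{kl} T_{ikl} S_{klj}.
T = rng.normal(size=(2, 3, 4))
S = rng.normal(size=(3, 4, 5))
D_einsum = np.einsum("ikl,klj->ij", T, S)
# The same contraction with tensordot: axes (1, 2) of T against (0, 1) of S.
D_tensordot = np.tensordot(T, S, axes=([1, 2], [0, 1]))
print(np.allclose(D_einsum, D_tensordot))
```

The `einsum` subscript string reads almost exactly like the index expression, which is why it tends to scale better than `@` once more than two indices are involved.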
Development Process and Optimization: Computers tend to be unforgiving of mistakes. The software engineering community has developed processes to reduce these mistakes, and the same ideas can be used to optimize your research. For example, agile development, documentation, testing, and debugging all have analogous roles in a vibrant research program. These ideas, espoused in The Pragmatic Programmer, will be emphasized in the course. Students will be taught to optimize both their code, and their development practices, to minimize the time to solution while ensuring correctness.
Correctness: Quickly getting results is good, but ensuring that those results are correct is even more important. We will focus on verifying results with comprehensive testing and continuous integration (CI), enabling you to confidently use library codes and tools like LLMs (ChatGPT, Copilot, etc.) to quickly get work done correctly.
Reproducible Research: Software development practices like version control with Git, Mercurial, or Jujutsu can be repurposed to keep track of your papers and your research work (and to back them up in the cloud). Coupled with good documentation, these enable reproducible science, both by others, and by your future self.
Thinking like a programmer can help you approach problems differently and make better use of your tools. Many repetitive or complex tasks can be solved with simple programming: e.g., using regular expressions in your editor to do a complicated search and replace, using Makefiles to programmatically download and install the software you need, or using LaTeX to write your thesis and automatically collect, organize, and format your references.
Programming Examples#
Here are a few examples of the types of problems we will solve in this course. Each of
these examples is code-complete, using only NumPy and SciPy to compute the
solution, and Matplotlib to plot (with a custom function
FPS() to help make movies). By the end of the course, you
should be able to write similar code within a matter of hours to study problems relevant
to your research.
Molecular Dynamics#
Complex material properties can be modeled with molecular dynamics simulations that simply apply Newton’s laws to a collection of particles. Here we show a simple example of a projectile impacting a slab. See Molecular Dynamics for more details.
This idea can be easily generalized for your research: e.g. orbiting planets, galaxy formation, traffic flow, protein folding, and biomechanics.
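To give a flavour of what "applying Newton's laws to a collection of particles" can look like, here is a minimal sketch. It is not the projectile-and-slab example from the course: the Lennard-Jones pair potential, the velocity Verlet integrator, the 3 × 3 initial lattice, and the function names `forces_and_potential` and `velocity_verlet` are all assumptions made for illustration.

```python
import numpy as np


def forces_and_potential(x, eps=1.0, sigma=1.0):
    """Return Lennard-Jones forces, shape (N, d), and the total potential energy."""
    dx = x[:, None, :] - x[None, :, :]  # separations x_i - x_j, shape (N, N, d)
    r2 = np.sum(dx**2, axis=-1)
    np.fill_diagonal(r2, np.inf)  # exclude self-interaction
    s6 = (sigma**2 / r2) ** 3
    U = 0.5 * np.sum(4 * eps * (s6**2 - s6))  # factor 0.5: each pair appears twice
    # F_i = sum_j 24 eps (2 s^12 - s^6) / r^2 * (x_i - x_j)
    F = np.sum((24 * eps * (2 * s6**2 - s6) / r2)[:, :, None] * dx, axis=1)
    return F, U


def velocity_verlet(x, v, dt, steps, mass=1.0):
    """Integrate Newton's laws; return final x, v and the total energy history."""
    F, U = forces_and_potential(x)
    energies = [0.5 * mass * np.sum(v**2) + U]
    for _ in range(steps):
        v = v + 0.5 * dt * F / mass  # half kick
        x = x + dt * v  # drift
        F, U = forces_and_potential(x)
        v = v + 0.5 * dt * F / mass  # half kick
        energies.append(0.5 * mass * np.sum(v**2) + U)
    return x, v, np.array(energies)


# A 3 x 3 cluster in 2D with spacing near the pair-potential minimum.
spacing = 2 ** (1 / 6)
grid = np.arange(3) * spacing
x0 = np.array([[xi, yi] for xi in grid for yi in grid])
rng = np.random.default_rng(seed=1)
v0 = 0.1 * rng.normal(size=x0.shape)
v0 -= v0.mean(axis=0)  # remove net momentum

x1, v1, energies = velocity_verlet(x0, v0, dt=0.001, steps=2000)
print("energy drift:", energies.max() - energies.min())
print("net momentum:", v1.sum(axis=0))
```

Monitoring conserved quantities such as total energy and momentum is a cheap correctness check for this kind of simulation, and is the sort of test we would write before trusting results from a larger run.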
Shallow-Water Equations#
Here is another complete example of one of the core applications that will be taught in the course: solving the Navier-Stokes equations in 1D for incompressible fluid flowing over a rough surface. In this example, we excite a fluid in a harmonic trap with a bump. As the fluid flows back and forth over the bump, viscosity dissipates energy and the sloshing slows down.
This idea can be easily generalized for your research: e.g. economic models, quantum mechanics, and biological response.
Details for those interested: The Shallow-Water Equations
The relevant equations follow from Newton’s law \(F=ma\), where the acceleration \(a = \ddot{X}\) of a particle at position \(x\) and time \(t\) is due to the external potential \(V(x)\), an internal energy \(gn(x,t)\) related to the hydrostatic pressure, and a viscous damping term with coefficient \(\nu\):
This corresponds to an Eulerian hydrodynamic description of a compressible fluid with particle-density \(n(x, t)\) and velocity \(u(x, t)\) through the following partial-differential equations (PDEs)
that can be solved by the method of lines as a system of ODEs integrating forward in time.
We will start with a simple implementation using finite-differences and Euler’s method to get an intuition for what is happening and to check our results:
but then refine our solution using pseudo-spectral methods for derivatives and adaptive integration to get high-precision solutions. This latter solution is coded in full below to give you a flavour of the type of code that will be developed in the course.
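The progression described above (finite differences with Euler's method first, then spectral derivatives with adaptive integration) can be sketched on a much simpler stand-in problem. The example below is an assumption made for illustration and is not the shallow-water code: it applies the method of lines to the diffusion equation \(u_t = \nu u_{xx}\) on a periodic domain, where the exact solution is known, and it uses a Fourier basis rather than the Chebyshev basis used in the course. The names `rhs_finite_difference` and `rhs_spectral` are hypothetical.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Stand-in problem (not the shallow-water equations):
# u_t = nu * u_xx on the periodic domain [0, 2*pi) with u(x, 0) = sin(x),
# whose exact solution is u(x, t) = exp(-nu * t) * sin(x).
nu, T, N = 0.1, 1.0, 64
x = np.linspace(0, 2 * np.pi, N, endpoint=False)
dx = x[1] - x[0]
u0 = np.sin(x)
u_exact = np.exp(-nu * T) * np.sin(x)


def rhs_finite_difference(t, u):
    """du/dt using a second-order central difference for u_xx (periodic)."""
    return nu * (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2


k = 2 * np.pi * np.fft.fftfreq(N, d=dx)  # angular wavenumbers


def rhs_spectral(t, u):
    """du/dt using a Fourier (pseudo-spectral) second derivative."""
    return nu * np.fft.ifft(-(k**2) * np.fft.fft(u)).real


# 1. Simple version: finite differences with fixed-step Euler integration.
dt = 0.01  # below the stability limit dx**2 / (2 * nu) for this grid
u = u0.copy()
for i in range(round(T / dt)):
    u = u + dt * rhs_finite_difference(i * dt, u)
err_euler = np.max(np.abs(u - u_exact))

# 2. Refined version: spectral derivative with an adaptive integrator.
sol = solve_ivp(rhs_spectral, (0, T), u0, method="RK45", rtol=1e-8, atol=1e-10)
err_spectral = np.max(np.abs(sol.y[:, -1] - u_exact))

print(f"Euler + finite differences: max error = {err_euler:.2e}")
print(f"RK45 + spectral derivative: max error = {err_spectral:.2e}")
```

Having the crude version around is useful even after the refined one works: the two should agree to within the accuracy of the cruder method, which gives an independent check of both.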
There is some physics needed to describe details about this problem, like how the density \(n(x, t)\) relates to the positions of the particles \(X(t)\) in the first description, and how the internal energy \(gn(x, t)\) is related to the hydrostatic pressure. For details, see my Viscosity Notes 3 and Viscosity Notes 4.
The code here demonstrates the use of a pseudo-spectral Chebyshev basis to solve the shallow-water equations for an oscillating fluid. This code is complete, using only NumPy and SciPy libraries, except for plotting, which uses Matplotlib and a custom function FPS to facilitate making movies.
Not included here, but part of the class, will be supporting code for testing, additional documentation, and better organization to facilitate reuse.
For another example, see Model Fitting.
Learning Outcomes#
In addition to the Specific Skills listed in the Syllabus, by the end of the course, students should be able to:
Produce publication quality figures and animations (e.g., 2D and 3D plots, animations, tables, etc.) following best practices of visualization of quantitative data (i.e., as espoused by E. Tufte.)
Produce a manuscript meeting all formatting requirements for a paper in their field of focus, or for a thesis. (E.g., using a suitable LaTeX template and conforming to the submission requirements for mathematics, figures, citations, typesetting, etc.)
Work with data in a variety of formats for analysis, sharing, and reproducible science.
Use a version control system to manage code or documents, and establish a portfolio of code on sites like GitLab or GitHub with testing via continuous integration (CI) and documentation.
Profile and optimize their code.
Use a code editor that supports regexp search-and-replace, linting, checking, etc. (E.g., Emacs, Vi, VS Code).
Numerically compute integrals and derivatives with quantified error estimates.
Use numerical linear algebra libraries to diagonalize and factorize matrices: QR, SVD, Cholesky.
Generate pseudo-random numbers with any desired distribution.
Use MCMC techniques to characterize and quantify Bayesian posteriors for fitting and model selection.
Use local optimization techniques (gradient descent, Nelder-Mead, BFGS, Broyden) with and without gradients.
Use techniques like Runge-Kutta to solve initial value problems.
Derive scaling results to validate these techniques.
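As an illustration of a few of these outcomes together, the sketch below computes an integral with an error estimate and then checks the expected error scaling of a central-difference derivative (second order) and a classical fourth-order Runge-Kutta step. The test functions, step sizes, and helper names (`central_difference`, `rk4_solve`, `fitted_order`) are assumptions chosen for illustration, not the course's reference code.

```python
import numpy as np
from scipy.integrate import quad

# Integral with an error estimate: the integral of sin(x) over [0, pi] is 2.
value, abs_err_estimate = quad(np.sin, 0, np.pi)
print(f"integral = {value:.12f} (estimated abs. error {abs_err_estimate:.1e})")


def central_difference(f, x, h):
    """Approximate f'(x); the truncation error should scale as h**2."""
    return (f(x + h) - f(x - h)) / (2 * h)


def rk4_solve(f, y0, T, n):
    """Integrate y' = f(t, y) from 0 to T with n classical RK4 steps."""
    h = T / n
    t, y = 0.0, y0
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y


def fitted_order(hs, errs):
    """Slope of log(error) against log(h): the observed convergence order."""
    return np.polyfit(np.log(hs), np.log(errs), 1)[0]


# Derivative of sin at x = 1; the exact answer is cos(1).
hs = 2.0 ** -np.arange(2, 8)
errs_fd = np.array(
    [abs(central_difference(np.sin, 1.0, h) - np.cos(1.0)) for h in hs]
)
order_fd = fitted_order(hs, errs_fd)

# y' = -y with y(0) = 1; the exact answer at T = 1 is exp(-1).
ns = np.array([10, 20, 40, 80])
errs_rk4 = np.array(
    [abs(rk4_solve(lambda t, y: -y, 1.0, 1.0, n) - np.exp(-1.0)) for n in ns]
)
order_rk4 = fitted_order(1.0 / ns, errs_rk4)

print(f"central difference order: {order_fd:.2f} (expected about 2)")
print(f"RK4 order: {order_rk4:.2f} (expected about 4)")
```

Fitting the observed order and comparing it with the derived scaling is a simple way to validate an implementation: an order that comes out lower than expected usually signals a bug.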
Notes#
This is a live working document hosted on Read The Docs that will be used to collect and display additional information about the course, including:
Resources, Readings, and References and various class notes. These should also be available through the navigation menu on the left (which might be hidden if your display is not sufficiently wide).
These documents are built using JupyterBook (see JupyterBook Demonstration) and include all of the source code needed to generate the figures, plots, etc. For example, to see how a figure was made, look in the preceding code cell. The complete source code for this documentation is available at wsu-courses/physics-581-physics-inspired-computation.
Funding Statement#
Some of the material presented here is based upon work supported by the National Science Foundation under Grant Number 2309322. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.