Physics 581: Physics Inspired Computational Techniques#
This is the main project for the WSU Physics course Physics 581: Physics Inspired Computational Techniques, first offered in Fall 2021.
Physics has a successful track record of providing effective solutions to complex problems outside its specific domain. This course will focus on using efficient numerical techniques inspired by physics to solve challenging problems in a wide variety of applications. Techniques will be chosen from physics applications, but also applied to problems outside of the physics domain including economics, biology, sociology, etc. Students will be introduced to powerful numerical toolkits based on the SciPy and NumFOCUS ecosystem. Using the CoCalc platform will enable rapid development and prototyping with an explicit path to stable, tested, and performant codes capable of supporting research or industry applications.
TL;DR#
To use this repository (see Getting Started for more details):
(Optional) Create accounts on CoCalc and GitLab, a project on CoCalc, and a repo on GitLab. Send your GitLab account name to your instructor.
Create SSH keys, and add them to your CoCalc account and to your GitLab account.
SSH into your CoCalc project for the remaining steps.
Clone this repo and initialize it:
git clone https://gitlab.com/wsu-courses/physics-581-physics-inspired-computation.git
cd physics-581-physics-inspired-computation
make init
This will create a Conda environment that you can activate with conda activate envs/default, and a Jupyter kernel called phys-581 that you can select from notebooks.
(Optional) Add an appropriate Git username etc. by defining LC_GIT_USERNAME etc. in your project, or send these using your ~/.ssh/config file. In your project you can add something like this in your project Settings > Custom environmental variables section:
{
  "LC_GIT_USERNAME": "Your Full Name",
  "LC_GIT_USEREMAIL": "your.name@example.com",
  "LC_HG_USERNAME": "Your Full Name <your.name@example.com>",
  "LC_EDITOR": "vi"
}
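Once these variables are defined, it can be useful to confirm that they are actually visible inside the project. The following is a minimal sketch rather than part of the course tooling: the helper name `missing_variables` is hypothetical, and it only assumes the four variable names listed above.

```python
import os

# Variable names taken from the settings example above.
EXPECTED = ("LC_GIT_USERNAME", "LC_GIT_USEREMAIL", "LC_HG_USERNAME", "LC_EDITOR")


def missing_variables(environ=None, names=EXPECTED):
    """Return the names that are unset or empty in the given environment."""
    if environ is None:
        environ = os.environ
    return [name for name in names if not environ.get(name)]


missing = missing_variables()
if missing:
    print("Not set in this session:", ", ".join(missing))
else:
    print("All expected LC_* variables are set.")
```

Running this in a terminal inside the project (after logging in again) should report which variables, if any, did not make it through.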
Overview#
One of the things I would like to do is to present each technique from at least two perspectives:
- From scratch
I feel it is extremely valuable to understand what is at the core of an algorithm, and to be able to quickly implement a simplistic “brute force” version. These have the following advantages:
Predictable convergence properties. While often not as accurate or fast as more specialized or adaptive routines, simple brute-force versions of an algorithm can often be understood completely in terms of convergence properties and/or stability with respect to round-off errors etc. I highly recommend always checking your results with a simple brute-force approach to make sure you are not making any mistakes.
Occasionally one will need to implement an algorithm from scratch on a specialized platform. For example, to efficiently port code to a GPU, one must try to remove conditionals (if statements). This makes high-precision adaptive routines very hard to program, and one can often get better performance from a more straightforward brute-force approach.
Related: sometimes high precision is not needed. In these cases, a simple low-accuracy but highly optimized brute-force approach might be fastest. (This is common in video graphics, for example, where results need only be accurate to the pixel level.)
Having a basic understanding of what is happening “under the hood” of library routines can help when those routines fail.
- As fast as possible
When you need accurate results for research, however, it is also very useful to be familiar with library techniques so you can quickly get high-precision results.
If the appropriate technique is implemented in a well-tested library, then one can often use it in a few lines of code and a few minutes of time, most of which is spent understanding the arguments. This allows you to get results quickly.
These techniques tend to be adaptive, spending more time where the problem is hard in order to achieve the desired tolerance. If you need performance, you can loosen the tolerance.
Unfortunately, adaptive routines can skip over the hard stuff, giving incorrect results that may seem reasonable at first. There is no substitute for understanding your problem.
It can be hard to understand exactly what black-box library routines are doing, and hence hard to understand their convergence properties. It is essential to check your results with different techniques, or, at a minimum, with different resolutions.
Definition 16 (IDIOT) Anyone who publishes a calculation without checking it against an identical computation with smaller \(N\) OR without evaluating the residual of the pseudospectral approximation via finite differences is an IDIOT. (J. P. Boyd: Chebyshev and Fourier Spectral Methods [Boyd, 1989])
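To make the two perspectives concrete, here is a small sketch in which an adaptive routine skips over a narrow feature while a brute-force rule, checked at two resolutions, does not. Everything in it is an illustrative assumption: the integrand `narrow_peak`, the textbook recursive adaptive Simpson rule (written out here so the behaviour is fully visible, not taken from any library), and the choices of `N`.

```python
import math


def narrow_peak(x, center=0.3, width=1e-3):
    """Gaussian bump whose integral over the real line is width*sqrt(pi)."""
    return math.exp(-(((x - center) / width) ** 2))


def midpoint(f, a, b, n):
    """Brute-force midpoint rule with n equal sub-intervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def simpson(f, a, b):
    """Simpson's rule on a single interval."""
    return (b - a) / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))


def adaptive_simpson(f, a, b, tol=1e-8):
    """Textbook recursive adaptive Simpson rule (illustrative only)."""
    m = (a + b) / 2
    whole = simpson(f, a, b)
    halves = simpson(f, a, m) + simpson(f, m, b)
    if abs(halves - whole) <= 15 * tol:
        return halves + (halves - whole) / 15
    return adaptive_simpson(f, a, m, tol / 2) + adaptive_simpson(f, m, b, tol / 2)


# The tails outside [0, 1] are negligible, so this is the reference value.
exact = 1e-3 * math.sqrt(math.pi)

# The adaptive rule only samples x = 0, 0.25, 0.5, 0.75, 1, sees zeros,
# and stops immediately without ever noticing the peak at x = 0.3.
adaptive = adaptive_simpson(narrow_peak, 0.0, 1.0)

# Brute force at two resolutions: the same computation with smaller N.
coarse = midpoint(narrow_peak, 0.0, 1.0, 1000)
fine = midpoint(narrow_peak, 0.0, 1.0, 2000)

print("exact   :", exact)
print("adaptive:", adaptive)
print("coarse  :", coarse)
print("fine    :", fine)
```

The adaptive estimate looks converged by its own error criterion, yet it is wrong, while the agreement between `coarse` and `fine` gives a direct measure of how far the brute-force answer can be trusted.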
Software Carpentry#
Another objective of the course is to provide students with good software carpentry skills. This includes using version control, documenting code, testing, code coverage, and continuous integration (CI). In particular, students are expected to fork this repository and maintain their own GitLab repository with fully tested code. Assignments will be distributed in the form of tests: students must provide functions that pass these tests.
Students will be expected to maintain their code under version control.
Code must be tested with unit tests providing at least 85% code coverage.
Code must meet certain quality metrics, including documenting behaviour, inputs/outputs, specifying interfaces etc.
As part of the course, I will provide a detailed explanation of how to use tools like pytest, Coverage.py, and Flake8, as well as newer tools like LGTM that provide security analyses of code for Python-based projects, to satisfy all of these objectives. With continuous integration techniques, these tests can be run whenever code is committed, helping maintain functioning, well-tested code.
This repository provides a skeleton satisfying these requirements, demonstrating how to write proper tests, use GitLab's continuous integration, and generate documentation for Python. (Students wishing to use other languages will need to learn how to use similar tools on their own.)
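As a rough illustration of what "assignments distributed as tests" might look like, here is a sketch of a documented function together with pytest-style tests. The function `trapezoid` and the test names are hypothetical and are not taken from the course repository. The tests use plain `assert` statements, which pytest collects from any function whose name starts with `test_`.

```python
def trapezoid(f, a, b, n):
    """Approximate the integral of f over [a, b] with the trapezoidal rule.

    Parameters
    ----------
    f : callable
        Function of one float returning a float.
    a, b : float
        Integration limits.
    n : int
        Number of equal sub-intervals; must be at least 1.

    Returns
    -------
    float
        The trapezoidal estimate of the integral.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * (f(a) + f(b)) + interior)


def test_exact_for_linear_functions():
    # The trapezoidal rule integrates straight lines exactly.
    assert abs(trapezoid(lambda x: 2 * x + 1, 0.0, 3.0, n=7) - 12.0) < 1e-12


def test_reversed_limits_change_sign():
    forward = trapezoid(lambda x: x * x, 0.0, 1.0, n=10)
    backward = trapezoid(lambda x: x * x, 1.0, 0.0, n=10)
    assert abs(forward + backward) < 1e-12


def test_rejects_invalid_n():
    # Exercising the error branch also counts towards code coverage.
    try:
        trapezoid(lambda x: x, 0.0, 1.0, n=0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for n < 1")


# pytest would discover these automatically; calling them directly keeps
# this sketch runnable as a plain script.
for test in (
    test_exact_for_linear_functions,
    test_reversed_limits_change_sign,
    test_rejects_invalid_n,
):
    test()
```

With pytest installed, the error-branch test would more idiomatically use `pytest.raises(ValueError)`.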
- Justification
Thinking about how to test code can significantly help in understanding the techniques and the problems. A significant portion of the course will address this issue. For example: How can one find non-trivial problems with analytic solutions for testing? What if such problems cannot be found? How should the algorithms converge? Is appropriate convergence being achieved? (Answering this latter question quantitatively can provide very useful test cases.)
These skills will definitely be of benefit to anyone looking later for a career in industry, but will also help in maintaining code in a research setting.
The repository of code developed for this course can serve as a future portfolio.
Working tests serve as a demonstration of how to use the code, thereby functioning in some sense as documentation examples that are checked.
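One way to turn "is appropriate convergence being achieved?" into a quantitative test is to measure the observed order of convergence on a problem with a known analytic solution. The sketch below does this for a central finite difference; the function names, the test problem (the derivative of sin at x = 1), and the tolerance are illustrative assumptions.

```python
import math


def central_difference(f, x, h):
    """Second-order central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def observed_order(f, df_exact, x, h):
    """Estimate the convergence order p from the errors at h and h/2.

    If the error behaves like C*h**p, then err(h)/err(h/2) = 2**p.
    """
    err_h = abs(central_difference(f, x, h) - df_exact(x))
    err_half = abs(central_difference(f, x, h / 2) - df_exact(x))
    return math.log2(err_h / err_half)


def test_central_difference_is_second_order():
    p = observed_order(math.sin, math.cos, x=1.0, h=1e-2)
    assert abs(p - 2.0) < 0.05


test_central_difference_is_second_order()
print("observed order:", observed_order(math.sin, math.cos, x=1.0, h=1e-2))
```

A test like this fails loudly if a sign error or off-by-one bug silently degrades the method to first order, which a single "looks about right" comparison could easily miss.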
GitLab Fork#
Create an account on GitLab.
Fork the Official Course Repository. (I suggest making this private since your grade is associated with the tests, but you are welcome to make it public whenever you are comfortable.)
Add your instructor @mforbes as a Developer for the project: Project Information > Members
Clone this to your CoCalc project and/or your computer. Do your work etc. and push your changes.
Trigger the CI pipeline if it was not triggered by your push.
CI/CD > Pipelines > Run pipeline
Add the badges (I don’t know how to automate this or store this in a file yet… could maybe use the Badges API):
Settings > General > Badges
The following table lists the required fields:
Name Link Badge image URL
Docs https://wsu-phys-581-fall-2021.readthedocs.io/en/latest/?badge=latest https://readthedocs.org/projects/wsu-phys-581-fall-2021/badge/?version=latest
Pipeline https://gitlab.com/%{project_path} https://gitlab.com/%{project_path}/badges/%{default_branch}/pipeline.svg
Tests https://gitlab.com/%{project_path} https://gitlab.com/%{project_path}/-/jobs/artifacts/%{default_branch}/raw/_artifacts/test-badge.svg?job=test
Coverage https://gitlab.com/%{project_path} https://gitlab.com/%{project_path}/-/jobs/artifacts/%{default_branch}/raw/_artifacts/coverage-badge.svg?job=test
Assignment-0 https://gitlab.com/%{project_path} https://gitlab.com/%{project_path}/-/jobs/artifacts/%{default_branch}/raw/_artifacts/test-0-badge.svg?job=test-0
Assignment-1 https://gitlab.com/%{project_path} https://gitlab.com/%{project_path}/-/jobs/artifacts/%{default_branch}/raw/_artifacts/test-1-badge.svg?job=test-1
etc.
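GitLab substitutes the `%{project_path}` and `%{default_branch}` placeholders itself, so the values above can be entered verbatim. To preview what a badge URL will resolve to for a particular fork, one could expand the placeholders by hand, as in the following sketch. The helper `expand_placeholders` and the example project path are hypothetical.

```python
def expand_placeholders(template, project_path, default_branch="main"):
    """Substitute the two GitLab badge placeholders used in the table above."""
    return template.replace("%{project_path}", project_path).replace(
        "%{default_branch}", default_branch
    )


coverage_badge = (
    "https://gitlab.com/%{project_path}/-/jobs/artifacts/"
    "%{default_branch}/raw/_artifacts/coverage-badge.svg?job=test"
)

# Hypothetical fork owned by "your-username".
url = expand_placeholders(
    coverage_badge,
    project_path="your-username/physics-581-physics-inspired-computation",
)
print(url)
```

Opening the printed URL in a browser is a quick way to check that the `test` job really produced the expected artifact.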
Optional: SSH Keys#
Typing your password every time you want to pull or push quickly gets tiring. A better option is to use SSH to authenticate, connect, and to forward your agent so you don’t need to re-authenticate. The basic ideas are explained in connecting to CoCalc with SSH.
Optional: GitHub Mirror#
You can create a mirror on GitHub of your GitLab project which is updated whenever you commit to your main branch. Maintaining a GitHub mirror like this allows you to use the GitHub CI tools, which differ somewhat from those on GitLab.
(Optional) Create an account on GitHub.
References#
Python Tutorial: This is the definitive tutorial for the Python language. If you have not read this and plan to use Python, then you should.
NumPy Tutorial: Growing repository of tutorials for using NumPy. Being able to “think” in terms of arrays (vectorization) can greatly simplify your understanding of algorithms, while simultaneously improving your code, both from a performance and a reliability standpoint. Not every problem benefits from this approach, but many of those in physics do. (We should try to contribute to these.)
Hypermodern Python: Deals with issues about packaging, testing, etc. I plan to follow this (with some modifications discussed in Hypothes.is annotations) to set up the coding framework.
Maintainer Notes#
Try to keep this upper-level project as clean as possible, matching the layout expected for the students. This will be turned into a skeleton at some point.
Tools#
Anaconda Project#
anaconda-project init
anaconda-project add-packages python=3.9 scipy matplotlib sphinx
anaconda-project add-packages conda-forge::uncertainties
anaconda-project add-packages conda-forge::sphinx-panels conda-forge::sphinx-book-theme conda-forge::myst-nb
anaconda-project add-packages --pip sphinxcontrib-zopeext sphinxcontrib-bibtex mmf-setup
To clean:
anaconda-project clean
Repository Setup#
Can use GitHub, GitLab, or Heptapod. With automatic pushing to GitHub, one can run the following CI services:
LGTM.com
One course repo. Students clone their own fork and pull changes. Assignments distributed as tests.
How to grade? Students can keep projects private (but probably will not have access to badges). Run tests on students' CoCalc servers or with CI?
Best Practices#
Use Jupytext and version control the associated Python files. Only commit the full notebooks (with output) when you want to archive documentation.
Maybe do this on an “output” branch or something so the main repo does not get cluttered?
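For reference, a notebook paired with Jupytext in the "percent" format is just a Python script in which `# %%` lines mark cell boundaries and `# %% [markdown]` marks text cells. The sketch below shows what such a file might look like; the cell contents are purely illustrative and are not taken from the course notebooks.

```python
# %% [markdown]
# # Harmonic oscillator energy check
# Markdown cells are stored as comments, so diffs stay readable.

# %%
import math

omega = 2.0  # angular frequency (illustrative value)
times = [0.1 * k for k in range(50)]

# %%
# For x(t) = cos(omega t) with unit mass, the energy
# E = v**2/2 + omega**2 * x**2/2 should be constant in time.
energies = [
    0.5 * (omega * math.sin(omega * t)) ** 2
    + 0.5 * (omega * math.cos(omega * t)) ** 2
    for t in times
]
spread = max(energies) - min(energies)
print(f"energy spread: {spread:.1e}")
```

Because this file contains no outputs, it diffs cleanly under version control. A pairing can typically be set up with something like `jupytext --set-formats ipynb,py:percent notebook.ipynb`, where `notebook.ipynb` is a placeholder name.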
Docs#
To build the documents interactively:
make doc-server
This will run sphinx-autobuild, which will launch a webserver on http://127.0.0.1:8000 and rebuild the docs whenever you save a change.
Here is the play-by-play for setting up the documentation.
cd Docs
sphinx-quickstart
wget https://brand.wsu.edu/wp-content/themes/brand/images/pages/logos/wsu-signature-vertical.svg -O _static/wsu-logo.svg
cp -r ../envs/default/lib/python3.9/site-packages/sphinx_book_theme/_templates/* _templates
I then edited the conf.py
hg add local.bib _static/ _templates/
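The actual edits to conf.py are not recorded here. As a guess at the kind of settings involved, based only on the packages installed above and the files added in these steps, a configuration might contain something like the following sketch. All values are assumptions, not the course's actual file.

```python
# Docs/conf.py (sketch; values are assumptions, not the course's actual file)
project = "Physics 581: Physics Inspired Computational Techniques"

extensions = [
    "myst_nb",  # MyST Markdown and notebook support (myst-nb)
    "sphinx_panels",  # panel and grid layouts (sphinx-panels)
    "sphinxcontrib.bibtex",  # citations from BibTeX files
    "sphinxcontrib.zopeext.autointerface",  # documents zope interfaces
]

# sphinxcontrib-bibtex needs to be told which BibTeX files to read.
bibtex_bibfiles = ["local.bib"]

# Theme, logo, and the directories populated by the commands above.
html_theme = "sphinx_book_theme"
html_logo = "_static/wsu-logo.svg"
html_static_path = ["_static"]
templates_path = ["_templates"]
```

Sphinx executes conf.py as ordinary Python, so settings like these can also be computed, for example from environment variables.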
CoCalc Setup#
Purchase a license with 2 projects to allow the WSU Courses CoCalc project (which hosts the course) and the Shared CoCalc Project to run. This approach requires the students to pay $14 for access for the term (4 months). They can optionally use any license they already have instead.
Optionally, one might opt to purchase a license for \(n+2\) projects where \(n\) is the number of students, if there is central funding available. See Course Upgrading Students for more details.
Next, create a course. I do this in my WSU Courses CoCalc project called 581-2021.
Open this course and create the Shared CoCalc Project. Activate the license for this project so that it can run. I then add the SSH key to my .ssh/config files so I can quickly log in.
Clone the repos into the shared project and initialize the project. Optional, but highly recommended: use my mmf-setup project to provide some useful features:
ssh smc581shared      # My alias in .ssh/config
python3 -m pip install mmf_setup
mmf_setup cocalc
This provides some instructions on how to use the CoCalc configuration. The most important is to forward your user agent and set your hg and git usernames:
~$ mmf_setup cocalc
...
If you use version control, then to get the most of the configuration,
please make sure that you set the following variables on your personal
computer, and forward them when you ssh to the project:

    # ~/.bashrc or similar
    LC_HG_USERNAME=Your Full Name <your.email.address+hg@gmail.com>
    LC_GIT_USEREMAIL=your.email.address+git@gmail.com
    LC_GIT_USERNAME=Your Full Name

To forward these, your SSH config file (~/.ssh/config) might look like:

    # ~/.ssh/config
    Host cc-project1
      User ff1cb986f...

    Host cc*
      HostName ssh.cocalc.com
      ForwardAgent yes
      SendEnv LC_HG_USERNAME
      SendEnv LC_GIT_USERNAME
      SendEnv LC_GIT_USEREMAIL
      SetEnv LC_EDITOR=vi
Log out and log back in so we have the forwarded credentials, and now clone the repos.
git clone git@gitlab.com:wsu-courses/physics-581-physics-inspired-computation.git
cd physics-581-physics-inspired-computation
make
The last step runs
git clone git@gitlab.com:wsu-courses/physics-581-physics-inspired-computation_resources.git _ext/Resources
which puts the resources folder in _ext/Resources.
Create an environment:
ssh smc581shared
cd physics-581-physics-inspired-computation
anaconda2020
anaconda-project prepare
conda activate envs/default
python -m ipykernel install --user --name "PHYS-581" --display-name "Python 3 (PHYS-581)"
This will create a Conda environment as specified in anaconda-project.yml in envs/default.
Funding#
Some of the material presented here is based upon work supported by the National Science Foundation under Grant Number 1707691. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.