Physics 581: Physics Inspired Computational Techniques#

Documentation Status pipeline status coverage report

This is the main project for the WSU Physics course Physics 518: Physics Inspired Computational Techniques first offered in Fall 2021.

Physics has a successful track record of providing effective solutions to complex problems outside its specific domain. This course will focus on using efficient numerical techniques inspired by physics to solve challenging problems in a wide variety of applications. Techniques will be chosen from physics applications, but also applied to problems outside of the physics domain including economics, biology, sociology, etc. Students will be introduced to powerful numerical toolkits based on the SciPy and NumFocus ecosystem. Using the CoCalc platform will enable rapid development and prototyping with an explicit path to stable, tested, and performant codes capable of supporting research, or industry applications.

TL;DR#

To use this repository: (See Getting Started for more details.)

  1. (Optional)

  2. Clone this repo and initialize it:

    git clone https://gitlab.com/wsu-courses/physics-581-physics-inspired-computation.git
    cd physics-581-physics-inspired-computation
    make init
    

    This will create a Conda environment you can activate conda activate envs/default, and a Jupyter kernel called phys-581 that you can select from notebooks.

  3. (Optional)

    • Add your GitLab repo as a remote, and push.

    • Create a GitHub mirror with automatic push from your GitLab repo so you can take advantage of their CI.

  4. (Optional)

    • Add an appropriate Git Username etc. by defining LC_GIT_USERNAME etc. in your project or send these in your ~/.ssh/config file. In your project you can add something like this in your project Settings > Custom environmental variables section:

      {
        "LC_GIT_USERNAME": "Your Full Name", 
        "LC_GIT_USEREMAIL": "your.name@example.com",
        "LC_HG_USERNAME": "Your Full Name <your.name@example.com>",
        "LC_EDITOR": "vi",
      }
      

Overview#

One of the things I would like to do is to present each technique from at least two perspectives:

From scratch

I feel it is extremely valuable to understand what is at the core of an algorithm, and to be able to quickly implement a simplistic “brute force” version. These have the following advantage:

  • Predictable convergence properties. While often not as accurate or fast as more specialized or adaptive routines, simple brute-force versions of an algorithm can often be understood completely in terms of convergence properties and/or stability with respect to round-off errors etc. I highly recommend always checking your results with a simple brute-force approach to make sure you are not making any mistakes:

  • Occasionally one will need to implement an algorithm from scratch on a specialize platform. For example, to efficiently port code to a GPUs, one must try to remove conditionals (if statements). This makes high-precision adaptive routines very hard to program, and one can often gets better performance from a more straight-forward brute-force approach.

  • Related: sometimes high precision is not needed. In these cases, a simple low-accuracy but highly optimized brute-force approach might be fastest. (This is common in video graphics for example where results need only be accurate to the pixel level.)

  • Having a basic understanding what is happening under “under the hood” of library routines can help when those routines fail.

As fast as possible

When you need accurate results for research, however, it is also very useful to be familiar with library techniques so you can quickly get high-precision results.

  • If the appropriate technique is implement in a well-tested library, then one can often use it in a few lines of code and implement it in a few minutes of time – most of which is spent understanding the arguments. This allows you to get results quickly.

  • These techniques tend to be adaptive, spending more time where the problem is hard to achieve desired tolerance objectives. If you need performance, you can reduce the tolerance.

  • Unfortunately, adaptive routines can skip over the hard stuff, giving incorrect results that may seem reasonable at first. There is no substitute for understanding your problem.

  • It can be hard to understand exactly what black-box library routines are doing, and hence hard to understand their convergence properties. It is essential to check your results with different techniques, or, at a minimum, with different resolutions.

    Definition 16 (IDIOT) Anyone who publishes a calculation without checking it against an identical computation with smaller \(N\) OR without evaluating the residual of the pseudospectral approximation via finite differences is an IDIOT. (J. P. Boyd: Chebyshev and Fourier Spectral Methods [Boyd, 1989])

Software Carpentry#

Another objective of the course is to provide students with good software carpentry skills. This includes using version control, documenting code, testing, code-coverage, and continuous integration (CI). In particular, students are expected to fork this repository and maintain their own GitLab repository with fully tested code. Assignments will be distributed in the form of tests which the students must provide functions which pass these tests.

  • Students will be expected to maintain their code under version control.

  • Code must be tested with unit tests providing at least 85% code coverage.

  • Code must meet certain quality metrics, including documenting behaviour, inputs/outputs, specifying interfaces etc.

As part of the course, I will provide a detailed explanation of how to use tools like pytest, Coverage.py, Flake8 and new tools like LGTM that provide security analyses of code for Python-based projects to satisfying all of these objectives. With continuous integration techniques, these tests can be run whenever code is committed, helping maintain functioning, well-tested code.

This repository provides a skeleton satisfying these requirements, and demonstrating how to write proper tests, use GitLabs continuous integration, and to generate documentation for Python. (Students wishing to use other languages will need to learn how to use similar tools on their own.)

Justification
  • Thinking about how to test code can significantly help in understanding the techniques and the problems. A significant portion of the course will address this issue. For example: How can one find non-trivial problems with analytic solutions for testing? What if such problems cannot be found? How should the algorithms converge? Is appropriate convergence being achieved? (Answering this later question quantitatively can provide for very useful test cases.)

  • These skills will definitely be of benefit to anyone looking later for a career in industry, but will also help in maintaining code in a research setting.

  • The repository of code developed for this course can serve as a future portfolio.

  • Working tests serve as a demonstration of how to use the code, thereby functioning in some sense as documentation examples that are checked.

GitLab Fork#

  1. Create an account on GitLab.

  2. Fork the Official Course Repository (I suggest making this private since your grade is associated with the tests, but you are welcome to make it public whenever you are comfortable.)

  3. Add your instructor @mforbes as a Developer for the project:

    • Project Information > Members

  4. Clone this to your CoCalc project and/or your computer. Do your work etc. and push your changes.

  5. Trigger the CI pipeline if it was not triggered by your push.

    • CI/CD > Pipelines > Run pipeline

  6. Add the badges (I don’t know how to automate this or store this in a file yet… could maybe use the Badges API):

    • Settings > General > Badges

    The following list the required fields:

    Name
    Link
    Badge image URL
    
    Docs
    https://wsu-phys-581-fall-2021.readthedocs.io/en/latest/?badge=latest
    https://readthedocs.org/projects/wsu-phys-581-fall-2021/badge/?version=latest
    
    Pipeline
    https://gitlab.com/%{project_path}
    https://gitlab.com/%{project_path}/badges/%{default_branch}/pipeline.svg
    
    Tests
    https://gitlab.com/%{project_path}
    https://gitlab.com/%{project_path}/-/jobs/artifacts/%{default_branch}/raw/_artifacts/test-badge.svg?job=test
    
    Coverage
    https://gitlab.com/%{project_path}
    https://gitlab.com/%{project_path}/-/jobs/artifacts/%{default_branch}/raw/_artifacts/coverage-badge.svg?job=test
    
    Assignment-0
    https://gitlab.com/%{project_path}
    https://gitlab.com/%{project_path}/-/jobs/artifacts/%{default_branch}/raw/_artifacts/test-0-badge.svg?job=test-0
    
    Assignment-1
    https://gitlab.com/%{project_path}
    https://gitlab.com/%{project_path}/-/jobs/artifacts/%{default_branch}/raw/_artifacts/test-1-badge.svg?job=test-1
    

    etc.

Optional: SSH Keys#

Typing your password every time you want to pull or push quickly gets tiring. A better option is to use [SSH][] to authenticate, connect, and to forward your agent so you don’t need to re-authenticate. The basic ideas are explained in connecting to CoCalc with SSH.

Optional: GitHub Mirror#

You can create a mirror on GitHub of your GitLab project which is updated whenever you commit to your main branch. Maintaining a GitHub mirror like this allows you to use the GitHub CI tools, which differ somewhat from those on GitLab.

  1. (Optional) Create an account on GitHub.

References#

  • Python Tutorial: This is the definative tutorial for the python language. If you have not read this and plan to use python, then you should.

  • NumPy Tutorial: Growing repository of tutorials for using NumPy. Being able to “think” in terms of arrays (vectorization) can greatly simplify your understanding of algorithms, while simultaneously improving your code, both from a performance and a reliability standpoint. Not every problem benefits from this approach, but many of those in physics do. (We should try to contribute to these.)

  • Hypermodern Python: Deals with issues about packaging, testing, etc. I plan to follow this (with some modifications discussed in Hypothes.is annotations to setup the coding framework.

Maintainer Notes#

Try to keep this upper-level project as clean as possible, matching the layout expected for the students. This will be turned into a skeleton at some point.

Tools#

Anaconda Project#

anaconda-project init
anaconda-project add-packages python=3.9 scipy matplotlib sphinx
anaconda-project add-packages conda-forge::uncertainties
anaconda-project add-packages conda-forge::sphinx-panels conda-forge::sphinx-book-theme conda-forge::myst-nb
anaconda-project add-packages --pip sphinxcontrib-zopeext sphinxcontrib-bibtex mmf-setup

To clean:

anaconda-project clean

Repository Setup#

Can use GitHub, GitLab, or Heptapod. With automatic pushing to GitHub, one and run the following CI’s:

  • LGTM.com

  • One course repo. Students clone their own fork and pull changes. Assignments distributed as tests.

  • How to grade? Student’s can keep projects private (but probably will not have access to badges.) Run tests on Student’s CoCalc servers or with CI?

Best Practices#

  • Use Jupytext and version control the associated python files. Only commit the full notebooks (with output) when you want to archive documentation.

    • Maybe do this on an “output” branch or something so the main repo does not get cluttered?

Docs#

To build the documents interactively:

make doc-server

This will run sphinx-autobuild which will launch a webserver on http://127.0.0.1:8000 and rebuild the docs whenever you save a change.

Here is the play-by-play for setting up the documentation.

cd Docs
sphinx-quickstart
wget https://brand.wsu.edu/wp-content/themes/brand/images/pages/logos/wsu-signature-vertical.svg -O _static/wsu-logo.svg 
cp -r ../envs/default/lib/python3.9/site-packages/sphinx_book_theme/_templates/* _templates

I then edited the conf.py

hg add local.bib _static/ _templates/

CoCalc Setup#

  • Purchase a license with 2 projects to allow the course and WSU Courses CoCalc project and Shared CoCalc Project to run. This approach requires the students to pay $14 for access four the term (4 months). They can optionally use any license they already have instead.

    Optionally, one might opt to purchase a license for \(n+2\) projects where \(n\) is the number of students, if there is central funding available. See Course Upgrading Students for more details.

  • Next, create a course. I do this in my WSU Courses CoCalc project called 581-2021.

    Open this course and create the Shared CoCalc Project. Activate the license for this project so that it can run. I then add the SSH key to may .ssh/config files so I can quickly login.

  • Clone the repos into the shared project and initialize the project. Optional, but highly recommend – use my mmf-setup project to provide some useful features

    ssh smc581shared       # My alias in .ssh/config
    python3 -m pip install mmf_setup
    mmf_setup cocalc
    

    This provides some instructions on how to use the CoCalc configuration. The most important is to forward your user agent and set your hg and git usernames:

    ~$ mmf_setup cocalc
    ...
    If you use version control, then to get the most of the configuration,
    please make sure that you set the following variables on your personal
    computer, and forward them when you ssh to the project:
    
        # ~/.bashrc or similar
        LC_HG_USERNAME=Your Full Name <your.email.address+hg@gmail.com>
        LC_GIT_USEREMAIL=your.email.address+git@gmail.com
        LC_GIT_USERNAME=Your Full Name
    
    To forward these, your SSH config file (~/.ssh/config) might look like:
    
        # ~/.ssh/config
        Host cc-project1
          User ff1cb986f...
      
        Host cc*
          HostName ssh.cocalc.com
          ForwardAgent yes
          SendEnv LC_HG_USERNAME
          SendEnv LC_GIT_USERNAME
          SendEnv LC_GIT_USEREMAIL
          SetEnv LC_EDITOR=vi
    

    Logout and log back in so we have the forwarded credentials, and now clone the repos.

    git clone git@gitlab.com:wsu-courses/physics-581-physics-inspired-computation.git
    cd physics-581-physics-inspired-computation
    make
    

    The last step runs git clone git@gitlab.com:wsu-courses/physics-581-physics-inspired-computation_resources.git _ext/Resources which puts the resources folder in _ext/Resources.

  • Create an environment:

    ssh smc581shared
    cd physics-581-physics-inspired-computation
    anaconda2020
    anaconda-project prepare
    conda activate envs/default
    python -m ipykernel install --user --name "PHYS-581" --display-name "Python 3 (PHYS-581)"
    

    This will create a Conda environment as specified in anaconda-project.yml in envs/default.

Funding#

https://www.nsf.gov/images/logos/NSF_4-Color_bitmap_Logo.png

Some of the material presented here is based upon work supported by the National Science Foundation under Grant Number 1707691. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.