CoCalc
Features
CoCalc provides an online platform for performing computations without the need to install any software on your computer. It allows you to immediately start exploring in a variety of languages, with several unique features. The following are particularly relevant to this course:
- Collaborative Editing
CoCalc is one of the only platforms I know of at this time that allows simultaneous editing of Jupyter notebooks. This means that the instructors can connect directly to students’ projects, see the exact same code, and debug it interactively. This is done with a custom implementation of the notebook server, with the downside that not all features of Jupyter notebooks are supported yet. If you need to, you can launch a Plain Jupyter Server or a JupyterLab Server to regain all functionality (but you will lose the collaborative editing ability while you do so).
- Extensive Software Preinstalled
CoCalc comes with a large amount of useful software, including a rather complete Anaconda environment. This allows you to immediately start working. Simply create a new Jupyter notebook, choose the Anaconda2020 kernel, and start coding with the full SciPy software stack at your disposal. (Once you get things working, I strongly advocate migrating your code to a well-tested repository like the one described here, but don’t let this stop you from exploring.)
- VS Code Editor
CoCalc now supports editing files in your browser with VS Code. While I personally use Emacs, VS Code seems to be a very good tool for beginners. I strongly recommend that you learn a good editor with powerful search and replace features, syntax highlighting and language support: it will ultimately save you lots of time. (Note: this is a fairly new feature, however, and I have not explored it much, but it looks good.)
I find a couple of other features important:
- Open Source
CoCalc itself is open source and can be installed from a Docker image. This means that you can run CoCalc on your own hardware, with complete control of your data, even if the company goes out of business. They make their profits by selling their service. I completely support this type of business model, which puts you in control of your data.
- Time Travel
CoCalc implements an amazing backup system they call Time Travel that allows you to roll back almost any file by minutes, hours, days, weeks, or more. I am blown away by how well this feature is implemented: it has saved me several times and alone is worth the license costs.
- Responsive Support
The CoCalc company is small enough that they can still be responsive to feature and support requests. When I have issues, they often make changes within an hour, and virtually never take more than a day. This is in stark contrast to large companies where you submit a request to their community forums only to have it ignored for years. Of course, the team being small means that they do not have the resources to implement everything, but they can be motivated by money if you really need something done. Nevertheless, they have always promptly taken care of any core issues I have found, and are really nice people too!
- Remote File Systems
You can mount remote file systems with sshfs. This allows you to use CoCalc as a tool to analyze off-site data (although performance will be slow because the data needs to be transferred over the network).
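As a rough sketch of what such a mount might look like, the following defines two small helper functions. The server name, remote path, and mount point are placeholders, and the function names mount_remote and unmount_remote are made up for this illustration. It assumes that sshfs and fusermount are available in your project and that you can already reach the remote host with ssh.

```shell
# Placeholder values (assumptions): replace with your own server and paths.
REMOTE="${REMOTE:-user@example.com:/path/to/data}"
MOUNTPOINT="${MOUNTPOINT:-$HOME/remote_data}"

# Hypothetical helper: create the mount point and mount the remote directory.
mount_remote() {
    mkdir -p "$MOUNTPOINT" &&
    # -o reconnect asks sshfs to re-establish the connection if it drops.
    sshfs "$REMOTE" "$MOUNTPOINT" -o reconnect
}

# Hypothetical helper: unmount the FUSE file system again (Linux).
unmount_remote() {
    fusermount -u "$MOUNTPOINT"
}

# Typical use (not run here):
#   mount_remote
#   ls "$MOUNTPOINT"
#   unmount_remote
```

Since every read goes over the network, it is probably worth copying data you use repeatedly into the project rather than reading it through the mount each time.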
For a more complete exploration, look at the list of features.
Setup
While one of the main benefits of CoCalc is that you can just fire it up and get to
work, for the purposes of this course, establishing a reproducible computing environment
is important. After exploring several tools, I have landed on anaconda-project,
which allows you to manage a Conda environment in a somewhat reasonable way.
Using this effectively on CoCalc is a bit challenging out of the box because the
default anaconda2020 environment they have set up has a /ext/anaconda2020.02/.condarc
file with a whole slew of channels: so many, in fact, that even a simple
conda search uncertainties almost runs out of memory. One option is to use mamba,
which can be done by setting CONDA_EXE=mamba.
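A minimal sketch of the mamba option, assuming mamba is installed and on your PATH (for example from conda-forge). The helper name with_mamba is hypothetical; it sets CONDA_EXE only for the single command it runs, so the variable does not leak into the rest of your shell session.

```shell
# Hypothetical helper: run one command with CONDA_EXE=mamba in its environment.
with_mamba() {
    CONDA_EXE=mamba "$@"
}

# Example (not run here; requires mamba and anaconda-project, and should be
# run from a directory containing an anaconda-project.yml file):
#   with_mamba anaconda-project prepare
```

Setting the variable per command, rather than exporting it, leaves the rest of the session untouched.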
Another potential option is to use a custom miniconda environment, but even this takes
too much memory. Until anaconda-project has a way of ignoring the channels, it seems
like the best option is to simply install our own version of Miniconda and use this as a
base:
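# Download the installer. The pinned checksum must match the installer actually
# downloaded; "latest" changes over time, so update the hash if verification fails.
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -qO miniconda.sh
echo "1ea2f885b4dbc3098662845560bc64271eb17085387a70c2ba3f29fff6f8d52f  miniconda.sh" > miniconda.shasum
# Install only if the checksum verifies (-b: batch mode, -p: install prefix).
shasum -a 256 -c miniconda.shasum && bash miniconda.sh -b -p ~/.miniconda
rm miniconda.sh*   # removes both the installer and the checksum file
. ~/.miniconda/bin/activate
conda install anaconda-project
conda clean --all -y
du -sh ~/.miniconda # 136M /home/user/.miniconda
# Record the install location for later shells.
echo "export COCALC_MINICONDA=~/.miniconda" >> ~/.bashrc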