We maintain several versions of Python on the Shared Computing Cluster (SCC) and try to keep both the Python versions and associated packages up to date. Each Python installation has many packages pre-configured and readily available. Researchers have a choice of Python version on SCC, but most will prefer the general purpose installations. Each installation can be loaded using the module system.

The categories of our Python installations are as follows:

General Purpose Python Modules
Intel Distribution for Python
Anaconda Distribution (conda)

The most recent listing of Python installations on the SCC can be queried using the module system.

scc1$ module avail python

General Purpose Python Modules

For most computational use cases, the most recent version of these general purpose installations is the best choice. We offer both python2 and python3 modules and maintain older versions for reproducibility. These modules are configured for general use and already contain many commonly used Python packages.

General Purpose Modules:

Module Name       Description
python3/3.8.3     Python 3.8.3 (May 2020)
python3/3.7.7     Python 3.7.7 (March 2020), default python3
python3/3.7.5     Python 3.7.5 (October 2019)
python3/3.7.3     Python 3.7.3 (March 2019)
python3/3.6.10    Python 3.6.10 (December 2019)
python3/3.6.9     Python 3.6.9 (July 2019)
python3/3.6.5     Python 3.6.5 (March 2018)
python2/2.7.16    Python 2.7.16 (March 2019), default python2

To load the default version of python3, issue the following command:

scc1$ module load python3

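The default version of python3 is currently the python3/3.7.7 module, so the command above loads Python 3.7.7 rather than Python 3.8.3. Python 3.7 currently has broader support and compatibility with numerous applications and libraries, Python or otherwise, than Python 3.8. To use a different version, give the full module name, for example python3/3.8.3.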
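Because the default module and the newest module can differ, a script may want to confirm which interpreter it is actually running under. The following is a minimal sketch of such a check, not an SCC-provided tool; the names MINIMUM_VERSION and interpreter_is_at_least are hypothetical and chosen only for this illustration.

```python
import sys

# Hypothetical minimum version for this illustration; adjust for your project.
MINIMUM_VERSION = (3, 7)


def interpreter_is_at_least(minimum, current=None):
    """Return True if `current` (default: the running interpreter) is >= `minimum`.

    Only the major and minor version numbers are compared.
    """
    if current is None:
        current = sys.version_info
    return tuple(current[:2]) >= tuple(minimum[:2])


if __name__ == "__main__":
    # Report the interpreter version and location provided by the loaded module.
    print("Running Python", sys.version.split()[0], "from", sys.executable)
    if not interpreter_is_at_least(MINIMUM_VERSION):
        print("Warning: this script expects Python %d.%d or newer" % MINIMUM_VERSION)
```

Running this after loading a module shows the version and path of the interpreter that module provides, which helps catch a job that loaded a different module than intended.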

 

Update Frequency:

We install new Python modules approximately every six months. For now, the default python3 version will remain in the 3.7.x series, although this may change in the future. The Python libraries included in each module are updated to their latest versions at the time the module is installed.

Intel Distribution for Python

The Intel Distribution for Python behaves like regular Python but uses Intel technologies to speed up many core Python libraries, including NumPy, SciPy, pandas, scikit-learn, Jupyter, matplotlib, and mpi4py. This distribution also integrates the Intel Math Kernel Library (Intel MKL), the Intel Data Analytics Acceleration Library (DAAL) and pyDAAL, the Intel MPI Library, and Intel Threading Building Blocks (TBB). The following modules offer significant speedups for some computational workloads, at the cost of potential incompatibility with other Python packages.

Intel Python Modules:

Module Name                 Description
python3-intel/2020.4.912    Intel Python 3.7 2020 Update 4 (October 2020), most recent Intel Python3
python3-intel/2020.2.902    Intel Python 3.7 2020 Update 2 (July 2020)
python3-intel/2020.1.893    Intel Python 3.7 2020 Update 1 (March 2020)
python3-intel/2020.0.014    Intel Python 3.7 2020 Initial Release (January 2020)
python3-intel/2019.5.098    Intel Python 3.6 2019 Update 5 (September 2019)
python3-intel/2019.4.088    Intel Python 3.6 2019 Update 4 (May 2019)
python2-intel/2019.5.098    Intel Python 2.7 2019 Update 5 (September 2019), most recent Intel Python2
python2-intel/2019.4.088    Intel Python 2.7 2019 Update 4 (May 2019)

To load the most recent version of Intel Python 3.7, issue the following command:

scc1$ module load python3-intel/2020.4.912

The optimizations available in the Intel Distribution for Python depend on both OpenMP and Intel multi-processing libraries. Two environment variables are used to coordinate the automatic parallel processing: OMP_NUM_THREADS and MKL_NUM_THREADS. It is important to understand the impact of these variables on your code and to define appropriate values when running it. Most commonly, you should set each variable to the number of slots ($NSLOTS) your job requests, although other values are sometimes appropriate. Please read Intel’s guidance on threaded applications.

 

Variable Name           Description                                                Default Value    Recommended
OMP_NUM_THREADS         Sets the number of threads for OpenMP                      1                $NSLOTS
MKL_NUM_THREADS         Sets the number of threads for Intel Math Kernel Library   1                $NSLOTS
OPENBLAS_NUM_THREADS    Sets the number of threads for OpenBLAS                    1                $NSLOTS
NUMBA_NUM_THREADS       Sets the number of threads for Numba                       1                $NSLOTS
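As one way to apply the recommended values, the sketch below sets the thread variables from inside a Python script instead of the job script. It assumes the scheduler exports NSLOTS into the job environment, as the $NSLOTS references above suggest. The function name set_thread_counts is hypothetical, and the fallback to a single thread is a choice made for this example, not documented SCC behavior.

```python
import os

# The thread-count variables listed in the table above.
THREAD_VARIABLES = (
    "OMP_NUM_THREADS",
    "MKL_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "NUMBA_NUM_THREADS",
)


def set_thread_counts(environ=os.environ, default_slots=1):
    """Set every thread variable to the value of NSLOTS and return that value.

    Assumption for this sketch: if NSLOTS is missing, not an integer, or
    less than 1, fall back to `default_slots` (a single thread).
    """
    try:
        slots = int(environ.get("NSLOTS", default_slots))
    except ValueError:
        slots = default_slots
    if slots < 1:
        slots = default_slots
    for name in THREAD_VARIABLES:
        # Overwrite any existing value, including a default of 1.
        environ[name] = str(slots)
    return slots


# Call this before importing NumPy, SciPy, or Numba: threading libraries
# generally read these variables when they are first loaded.
set_thread_counts()
```

Setting the variables at the top of the script keeps the thread count tied to the slots the job requested. If your workload needs a different value, as the guidance above allows, assign it explicitly instead.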

 

Anaconda Distribution (conda)

Anaconda is an open-source Python distribution built around conda, a package and environment manager. It has gained traction because it makes it easy to package and replicate modules or entire Python environments on different systems. The distribution includes a set of core Python packages, and additional packages can be installed from remote “channels.” Anaconda has also been known to cause confusion and package dependency issues in complex environments, so we recommend virtualenv for environment configuration on the cluster unless you are working with existing software that depends on a conda environment. If you do need Anaconda, we recommend the miniconda/4.7.5 module. The anaconda2 and anaconda3 modules are no longer updated.
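To illustrate the virtual environment alternative mentioned above, here is a minimal sketch that creates an isolated environment from Python. It uses the standard library venv module, a close relative of the virtualenv tool and not the same program, so treat it as an approximation of the recommended workflow. The function name create_environment and the example path are hypothetical.

```python
import os
import venv


def create_environment(path, system_site_packages=True, with_pip=True):
    """Create a virtual environment at `path` and return its interpreter path.

    With system_site_packages=True the environment can still see the packages
    that ship with the Python installation used to create it.
    """
    builder = venv.EnvBuilder(
        system_site_packages=system_site_packages,
        with_pip=with_pip,
    )
    builder.create(path)
    # The interpreter lives in bin/ on Linux (Scripts/ on Windows).
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    return os.path.join(path, bin_dir, "python")


# Example call (hypothetical path, not run automatically):
# create_environment(os.path.expanduser("~/venvs/myproject"))
```

The environment is created by whichever Python is running the script, so load the desired Python module first. Packages installed into the environment afterwards stay separate from the shared installation.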

Anaconda Distribution Modules for Python

Module Name        Description
miniconda/4.7.5    Miniconda provides the conda tool, which lets the user set up any Anaconda distribution of Python.
anaconda3/5.2.0    Anaconda distribution of Python 3.6 (May 2018), not recommended
anaconda2/5.2.0    Anaconda distribution of Python 2.7 (May 2018), not recommended

Instructions for using Anaconda are found on the Anaconda Python page.