We maintain several versions of Python on the Shared Computing Cluster (SCC) and try to keep both the Python versions and associated packages up to date. Each Python installation has many packages pre-configured and readily available. Researchers have a choice of Python version on SCC, but most will prefer the general purpose installations. Each installation can be loaded using the module system.
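Our Python installations fall into three categories: general purpose Python modules, the Intel Distribution for Python, and the Anaconda Distribution (conda).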
The most recent listing of python installations on SCC can be queried using the module system.
scc1$ module avail python
General Purpose Python Modules
For most computational use cases, the most recent version of these general purpose installations is the best choice. We offer both python2 and python3, and we maintain older versions for reproducibility. These modules are configured for general use and already contain many commonly used Python packages.
General Purpose Modules:
Module Name | Description |
---|---|
python3/3.8.3 | Python 3.8.3 (May 2020) |
python3/3.7.7 | Python 3.7.7 (March 2020), default python3 |
python3/3.7.5 | Python 3.7.5 (October 2019) |
python3/3.7.3 | Python 3.7.3 (March 2019) |
python3/3.6.10 | Python 3.6.10 (December 2019) |
python3/3.6.9 | Python 3.6.9 (July 2019) |
python3/3.6.5 | Python 3.6.5 (March 2018) |
python2/2.7.16 | Python 2.7.16 (March 2019), default python2 |
To load the default version of python3, issue the following command:
scc1$ module load python3
The default version of Python3 is currently the python3/3.7.7 module. Python 3.7 currently has broader support and compatibility with numerous applications and libraries, Python or otherwise, compared with Python 3.8.
Update Frequency:
We install new Python modules approximately every 6 months. Currently the default Python3 version will remain in the 3.7.x series but this may change at some future point. The Python libraries installed within the Python modules are updated to their latest versions when the module is installed.
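Because the default may change, a script that needs reproducible results can name the full module version instead of relying on the default. The following is a minimal sketch, not an official template: the variable name PY_MODULE is hypothetical, and the guards are an assumption added so that the same lines also run on a machine without the module system or without python3.

```shell
# Module to pin; python3/3.7.7 is the current default python3 module.
# PY_MODULE is a hypothetical variable name used only in this sketch.
PY_MODULE="python3/3.7.7"

# Load the pinned module when the module command is available.
if command -v module >/dev/null 2>&1; then
    module load "$PY_MODULE"
fi

# Report which interpreter is now first on PATH, if there is one.
if command -v python3 >/dev/null 2>&1; then
    python3 --version
fi
```

Pinning the version this way keeps a job on the same interpreter and package set even after a newer default is installed.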
Intel Distribution for Python
The Intel Distribution for Python behaves like regular Python, but leverages Intel technologies to speed up many of the core python libraries, including NumPy, SciPy, pandas, scikit-learn, Jupyter, matplotlib, and mpi4py. This distribution also integrates Intel Math Kernel Library (Intel MKL), Intel Data Analytics Acceleration Library (DAAL) and pyDAAL, Intel MPI Library, and Intel Threading Building Blocks (TBB). The following modules offer significant speedups for some computational workloads at the cost of potential incompatibility with other python packages.
Intel Python Modules:
Module Name | Description |
---|---|
python3-intel/2020.4.912 | Intel Python 3.7 2020 Update 4 (October 2020), most recent Intel Python3 |
python3-intel/2020.2.902 | Intel Python 3.7 2020 Update 2 (July 2020) |
python3-intel/2020.1.893 | Intel Python 3.7 2020 Update 1 (March 2020) |
python3-intel/2020.0.014 | Intel Python 3.7 2020 Initial Release (January 2020) |
python3-intel/2019.5.098 | Intel Python 3.6 2019 Update 5 (September 2019) |
python3-intel/2019.4.088 | Intel Python 3.6 2019 Update 4 (May 2019) |
python2-intel/2019.5.098 | Intel Python 2.7 2019 Update 5 (September 2019), most recent Intel Python2 |
python2-intel/2019.4.088 | Intel Python 2.7 2019 Update 4 (May 2019) |
To load the most recent version of Intel Python 3.7, issue the following command:
scc1$ module load python3-intel/2020.4.912
The optimizations available in the Intel Distribution for Python depend on both OpenMP and Intel multi-processing libraries. Two environment variables are used to coordinate the automatic parallel processing: OMP_NUM_THREADS and MKL_NUM_THREADS. It is important to understand the impact of these variables on your code and to define appropriate values when running it. In most cases you should set each value equal to the number of slots ($NSLOTS) your job requests, but this is not always the best choice. Please read Intel’s guidance on threaded applications.
Environment Variable | Description | Default Value | Recommended |
---|---|---|---|
OMP_NUM_THREADS | Sets the number of threads for OpenMP | 1 | $NSLOTS |
MKL_NUM_THREADS | Sets the number of threads for Intel Math Kernel Library | 1 | $NSLOTS |
OPENBLAS_NUM_THREADS | Sets the number of threads for OpenBLAS | 1 | $NSLOTS |
NUMBA_NUM_THREADS | Sets the number of threads for Numba | 1 | $NSLOTS |
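The recommended values in the table above could be applied in a job script along the following lines. This is a minimal sketch under stated assumptions: NTHREADS is a hypothetical helper variable, and the fallback to 1 when $NSLOTS is unset is an assumption added so the lines also work in an interactive shell.

```shell
# NSLOTS is provided by the scheduler inside a batch job.
# Fall back to 1 (the default value in the table) when it is not set.
# NTHREADS is a hypothetical helper variable used only in this sketch.
NTHREADS="${NSLOTS:-1}"

# Give every threading library the same limit as the requested slots.
export OMP_NUM_THREADS="$NTHREADS"
export MKL_NUM_THREADS="$NTHREADS"
export OPENBLAS_NUM_THREADS="$NTHREADS"
export NUMBA_NUM_THREADS="$NTHREADS"

echo "Thread limit for this job: $NTHREADS"
```

Place these exports before the command that starts Python, since the libraries generally read the variables when they are first loaded. As noted above, matching $NSLOTS is the common case rather than a universal rule.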
Anaconda Distribution (conda)
Anaconda is an open-source Python distribution built around conda, a package and environment manager. It has gained traction because it makes it easy to package and replicate modules or entire Python environments on different systems. The distribution includes a set of core Python packages, and additional user packages can be installed from remote “channels.” Anaconda has also been known to cause confusion and package dependency issues in complex environments. Unless you are working with existing software that depends on a conda environment, we recommend virtualenv for environment configuration within the cluster. If you do need Anaconda, we recommend the miniconda/4.7.5 module. The anaconda2 and anaconda3 modules are no longer updated.
Anaconda Distribution Modules for Python
Module Name | Description |
---|---|
miniconda/4.7.5 | Miniconda provides the conda tool, which allows the user to set up any Anaconda distribution of Python. |
anaconda3/5.2.0 | Anaconda distribution of Python 3.6 (May 2018), not recommended |
anaconda2/5.2.0 | Anaconda distribution of Python 2.7 (May 2018), not recommended |
Instructions for using Anaconda are found on the Anaconda Python page.