...

We provide Python modules containing only a basic Python installation for the currently supported Python versions, 3.7 through 3.10. Any further packages you need on top of this can be installed via standard methods, e.g. pip.

Additionally, we provide a toolbox of common data analysis tools in both Python and R. This module is available under analysis_toolbox and is updated every 3 months. This module is actually a conda environment which simply sets the correct shell variables (e.g. $PATH, $PYTHONPATH, and similar) for you. A list of currently installed Python and R tools in this module may be found under /albedo/soft/conda-workspace/env-ymls/analysis-toolbox-03.2023.yml.
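To check whether a particular tool is already part of the toolbox, you can search the environment file mentioned above. The following is a minimal sketch: the helper name toolbox_has is hypothetical (it is not something provided on Albedo), and the file name will change with each toolbox release.

```shell
# Environment file path as given in the text above; it changes with each release.
TOOLBOX_YML="/albedo/soft/conda-workspace/env-ymls/analysis-toolbox-03.2023.yml"

# toolbox_has NAME [YML]
# Hypothetical helper: print every line of the environment file that mentions
# NAME (case-insensitive). A different file can be passed as second argument.
toolbox_has() {
    grep -i -- "$1" "${2:-$TOOLBOX_YML}"
}
```

After defining the function in your shell, call it with a package name of your choice, for example toolbox_has ggplot2. No output (and a non-zero exit status) means the name does not appear in the file.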

If you would like additional libraries in this globally available analysis toolbox, just ask! They will be added with the next release (January, March, June, and September).

R

Similar to Python, some basic R modules are included in the analysis_toolbox. Currently, there is no R-Studio available, but we are planning to install it in the September 2023 update of the analysis_toolbox. Be aware that we still recommend against using graphical interfaces, since that is not what an HPC system is designed for. Our recommended workflow for using R is either an interactive session with the R IDLE, or Rscript combined with an sbatch script:

Interactive session with R IDLE

From Albedo's login node:

$ salloc --account=<your_account> --time=<HH:MM:SS> --qos=<QOS> --nodes=<#Nodes> <other_options...>

To understand which options you need to specify, please refer to the Jobs section on Albedo-Slurm and the SLURM user guide.

$ module load analysis_toolbox
$ R

The R IDLE will open and you can start using R from there.

Rscript and sbatch script

From Albedo's login node:

$ module load analysis_toolbox

The command Rscript allows you to run an R script from the shell (outside of the R IDLE or R-Studio). This means that you can also write an sbatch script (see Albedo-Slurm > Jobs) that runs your R scripts via Rscript and submit it to the Slurm queue using the sbatch command. You can even use the Slurm array feature to launch the same R script multiple times in parallel, provided the work your script does can be broken into independent chunks. Depending on your R script, this may require more or fewer changes, but it is probably worth spending the time on them to benefit from this first-order parallelization. For a simple example of a Slurm array combined with Rscript, the following tutorial covers most of it: https://rcpedia.stanford.edu/topicGuides/jobArrayRExample.html
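The array workflow described above could look roughly like the sketch below, which writes a job script to a file and leaves the submission step commented out. The file names r_array.sbatch and process_chunk.R, the array range 1-10, and the time limit are all illustrative assumptions; on Albedo you will likely also need further options (e.g. a QOS), so check the Albedo-Slurm > Jobs section before using it.

```shell
# Write a Slurm array job script. File names, array size, and time limit are
# placeholders for illustration; adapt them to your own project.
cat > r_array.sbatch <<'EOF'
#!/bin/bash
#SBATCH --account=<your_account>
#SBATCH --time=00:10:00
#SBATCH --array=1-10
#SBATCH --output=r_array_%A_%a.out

module load analysis_toolbox

# Slurm sets SLURM_ARRAY_TASK_ID to a different index (here 1..10) for each
# array task; it is passed to the R script as a command-line argument.
Rscript process_chunk.R "${SLURM_ARRAY_TASK_ID}"
EOF

# Submit from Albedo's login node once the placeholders are filled in:
# sbatch r_array.sbatch
```

Inside the (hypothetical) process_chunk.R, the index can be read with commandArgs(trailingOnly = TRUE) and used to select which chunk of the work that task should handle.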

Conda

Conda is a package manager for Python, R, and Julia software. You can use it on our HPC system by:

...