...

  • sinfo shows the existing queues
  • scontrol show job <JobID> shows information about a specific job
  • sstat <JobID> shows the resources used by a specific job
  • squeue shows information about queues and used nodes
  • smap shows a curses-based graphical overview of queues and nodes
  • sbatch <script> submits a batch job (see the sketch after this list)
  • salloc <resources> requests access to compute nodes for interactive use
  • scancel <JobID> cancels a batch job
  • srun <resources> <executable> starts a (parallel) code
  • sshare and sprio give information on the fair-share value and job priority
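
As a minimal sketch of how these commands fit together (the script name my_job.sh and the job ID 123456 are placeholders):

Code Block
languagebash
# Submit a batch script; Slurm prints the assigned job ID
sbatch my_job.sh

# List your jobs in the queue and inspect one of them in detail
squeue -u $USER
scontrol show job 123456

# Cancel the job if it is no longer needed
scancel 123456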

Usage of local storage on compute nodes

All compute nodes have local storage mounted at /tmp. Additionally, the GPU nodes have storage mounted at /scratch. See the System overview for the exact sizes.

Access to this local storage is much faster than to $WORK, $HOME or $SCRATCH, but each node only "sees" its own storage (and not the storage of the job's other nodes).

If your job does a lot of reading and writing, using the local disk might be beneficial. To do so, you have to copy your data to and from the compute nodes. Note that the /tmp folders are deleted once a day.

Code Block
languagebash
# Create a folder on the local disk and copy data to the node
# where your main MPI (rank 0) task runs
mkdir -p /tmp/myfolder
cp $MYDATA /tmp/myfolder

# If you need this data on every node, you have to add `srun` in front of the commands
srun mkdir -p /tmp/myfolder
srun cp $MYDATA /tmp/myfolder

The same applies in the opposite direction (copying results back to the global filesystem).
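
A rough sketch, assuming the results were written to /tmp/myfolder as above (file names are placeholders; $SLURMD_NODENAME is used to keep per-node copies from overwriting each other):

Code Block
languagebash
# Copy results from the local disk of the node running the batch script back to $WORK
cp /tmp/myfolder/results.out $WORK/

# Collect per-node results; $SLURMD_NODENAME makes the target name unique per node
srun --ntasks-per-node=1 bash -c 'cp /tmp/myfolder/results.out $WORK/results.out.$SLURMD_NODENAME'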

If your job produces many small files, please consider packing them into an archive (e.g. tar -czvf file.tar.gz <input data>) before moving them to $WORK or $SCRATCH.
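
For example, at the end of the job script (a rough sketch, assuming the results were written to /tmp/myfolder as above; the archive name is just an example):

Code Block
languagebash
# Pack all results from the local disk into a single archive directly on $WORK
tar -czvf $WORK/results_${SLURM_JOB_ID}.tar.gz -C /tmp/myfolder .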


Example Scripts

Job arrays

...