...
- sinfo shows existing queues
- scontrol show job <JobID> shows information about a specific job
- sstat <JobID> shows resources used by a specific job
- squeue shows the jobs in the queues and the nodes they use
- sbatch <script> submits a batch job
- salloc <resources> requests access to compute nodes for interactive use
- scancel <JobID> cancels a batch job
- srun <resources> <executable> starts a (parallel) code
- sshare and sprio give information on fair share value and job priority
- sreport -t Percent cluster UserUtilizationByAccount Start=$(date +%FT%T -d "1 month ago") Format=used,login,account | head -20 shows the top users by usage during the last month
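A typical sequence built from the commands above might look like the following sketch. The file name job.sh, the job name demo, the resource values and the program ./my_program are hypothetical placeholders; the check for sbatch is only there so the sketch does nothing harmful on a machine without Slurm.

```shell
#!/bin/bash
# Sketch: write a small batch script, submit it, inspect it, cancel it.
# File name, job name, resources and ./my_program are illustrative assumptions.

cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=demo
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
srun ./my_program
EOF

if command -v sbatch >/dev/null 2>&1; then
    # --parsable prints only the job ID (plus ";cluster" on multi-cluster setups)
    jobid=$(sbatch --parsable job.sh | cut -d';' -f1)
    squeue -j "$jobid"            # state of the job in the queue
    scontrol show job "$jobid"    # full job record
    scancel "$jobid"              # remove the job again
else
    echo "sbatch not found; wrote job.sh only"
fi
```

Capturing the job ID once with --parsable avoids parsing the "Submitted batch job ..." message by hand.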
Do's & Don'ts
- Do not use srun for simple non-parallel jobs like cp, ln, rm, cat, g[un]zip
- Do not write loops in your Slurm script to start several instances of similar jobs → See Job arrays below
- Use the parallel srun [un]pigz instead of g[un]zip if you have already allocated more than one CPU
- Do not allocate costly resources (like fat/gpu nodes) if you do not need them. Check the CPU/Memory-Efficiency of your jobs with info.sh -S
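The pigz advice above could be applied roughly as in this sketch. It assumes the job was submitted with --cpus-per-task, so that Slurm sets SLURM_CPUS_PER_TASK; the file sample.txt and the helper compress are hypothetical, and the fallback to gzip is only there so the snippet also runs where pigz is not installed.

```shell
#!/bin/bash
# Sketch: compress with as many threads as the job has CPUs.
# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is requested;
# outside a job we fall back to a single thread.
threads="${SLURM_CPUS_PER_TASK:-1}"

compress() {
    # Hypothetical helper: use pigz when available, else single-threaded gzip.
    if command -v pigz >/dev/null 2>&1; then
        pigz -p "$threads" "$1"
    else
        gzip "$1"
    fi
}

printf 'example data\n' > sample.txt   # hypothetical input file
compress sample.txt                     # produces sample.txt.gz
```

Inside a batch script the pigz call would be prefixed with srun, as the list recommends; it is left out here so the sketch can be tried on a login node.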
...