...

Similarly, the GPU nodes can be accessed via the gpu partition. Note that the type and number of GPUs need to be specified (see the example script after the table below). More information about the hardware specification of each node can be found in the System Overview (TODO: Link).


Partition | Nodes          | Description
----------+----------------+---------------------------------------------------------------------
smp       | prod-[001-240] | default partition; MaxNodes=1 → MaxCores=128; jobs can share a node
mpp       | prod-[001-240] | exclusive access to nodes; MaxNodes=240
fat       | fat-00[1-4]    | MaxNodes=1; jobs can share a node
gpu       | gpu-00[1-2]    | MaxNodes=1; jobs can share a node; the type and number of GPUs must be specified via --gpus=<GpuType>:<GpuNumber> (gpu-001: 2x a40, gpu-002: 4x a100)
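
For illustration, a minimal batch script for the gpu partition could look like the following sketch (the CPU count and the executable name are placeholder assumptions; adjust them to your job):

#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gpus=a100:2          # <GpuType>:<GpuNumber>, e.g. 2 of the A100s on gpu-002
#SBATCH --nodes=1              # MaxNodes=1 on this partition
#SBATCH --cpus-per-task=16     # placeholder, choose according to your job
#SBATCH --time=00:30:00        # requested walltime; see the QOS section below

srun ./my_gpu_program          # placeholder executable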

...

By default, the short QOS is used. It has a max. walltime of 30 minutes, and jobs with this QOS get a higher priority and have access to a special SLURM reservation during working hours (TODO: add details when set up) to facilitate development and testing. For longer runs, another QOS (and walltime) has to be specified; see the table and example below. Note: long-running jobs (longer than 12 hours, up to 48 hours) “cost” more in terms of fairshare.

QOS   | max. walltime | UsageFactor | Priority QOS_factor
------+---------------+-------------+--------------------
short | 0:30:00       | 1           | 50
12h   | 12:00:00      | 1           | 0
48h   | 48:00:00      | 2           | 0
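
For instance, a job that needs around ten hours could request the 12h QOS as sketched below (the time value is just an example):

#SBATCH --qos=12h
#SBATCH --time=10:00:00        # must stay within the 12-hour limit of the 12h QOS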

A short note on the definitions:

...

and RawUsage = cpu-seconds (#CPUs * seconds). → Jobs using the 48h QOS are twice as expensive when calculating job priorities (see Scheduling (TODO: Link)).
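
As a rough worked example (the job size is chosen for illustration only): a job on 128 CPUs running for 10 hours accrues 128 * 36000 s = 4,608,000 cpu-seconds of RawUsage under the short or 12h QOS; with the 48h QOS (UsageFactor 2) the same job is charged 2 * 4,608,000 = 9,216,000 cpu-seconds towards fairshare.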


Example Scripts

Job arrays

...