...
Model | User | Pro SMT | Contra SMT
---|---|---|---
idle | admin | - | (∑ Esocket[0-7] according to lm_sensors: …)
stress-ng stream | admin | -- | ~13% slower
FESOM | NEC | Using 128 threads per node: 3% faster (probably because the (buggy) GXFS daemon can use a virtual core) | Using 256 threads per node: 10% slower
Python AI | vhelm | no impact/difference | no impact/difference
matlab `#SBATCH --cpus-per-task=16` | vhelm | | Runtime: 1440 s instead of 1366 s → ~5% slower
unzip 262 files of ~50 MB each in parallel:<br>`S=$(date +%s); parallel -P$P gunzip -c > /dev/null ::: /tmp/input/* ; echo "$(( $(date +%s) - $S )) sec"`<br>`salloc -psmp --qos=12h --time=12:00:00 --ntasks-per-node=128`<br>`salloc -psmpht --qos=12h --time=12:00:00 --ntasks-per-node=256 --mem=249G` | lkalesch, mthoma | If a user only requests `--ntasks-per-node=1` (the default) and uses `P=0` (use all available cores), then you get …<br>But please note: this is an improper use of Slurm/HPC, so this "advantage" does not justify SMT | no advantage (if Slurm is used properly)
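The unzip row above can be reproduced with a small self-contained timing harness. This is a minimal sketch, not the exact test: it generates a few 1 MB gzip files instead of the original 262 files of ~50 MB, and it uses `xargs -P` in place of GNU `parallel` (which may not be installed on every node); `P` plays the same role as the worker count in the table's one-liner.

```shell
#!/bin/sh
# Sketch of the parallel-gunzip benchmark (assumption: xargs -P stands in for GNU parallel).
set -eu

IN=$(mktemp -d)                        # scratch directory with test input
for i in $(seq 1 8); do
  head -c 1048576 /dev/urandom | gzip > "$IN/file$i.gz"
done

P=4                                    # number of parallel workers (P=0 meant "all cores" for GNU parallel)
S=$(date +%s)
ls "$IN"/* | xargs -P"$P" -n1 gunzip -c > /dev/null
echo "$(( $(date +%s) - $S )) sec"

rm -rf "$IN"
```

Varying `P` between the physical core count (128) and the SMT thread count (256) inside the two `salloc` allocations from the table is what distinguishes the pro and contra measurements.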
...