Slurm: After Allocating All Gpus No More CPU Job Can Be Submitted

Make sure that SelectType in your configuration is select/cons_tres (or select/cons_res) with SelectTypeParameters set to CR_CPU or CR_Core, and that the partition's OverSubscribe (formerly Shared) option is not set to EXCLUSIVE. Otherwise Slurm allocates whole nodes to jobs, so a job that takes all the GPUs also holds all of the node's CPUs and blocks subsequent CPU-only jobs.
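As a reference point, a minimal slurm.conf sketch of that setup (node and partition names here are hypothetical; adapt them to your site):

```
# Illustrative slurm.conf fragment -- not a complete configuration.
# select/cons_tres schedules individual CPUs/GPUs instead of whole nodes.
SelectType=select/cons_tres
SelectTypeParameters=CR_Core          # or CR_CPU / CR_Core_Memory
# OverSubscribe must not be EXCLUSIVE, or every job gets a full node.
PartitionName=gpu Nodes=gpunode[01-02] OverSubscribe=NO State=UP
```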

Correct usage of gpus-per-task for allocation of distinct GPUs via SLURM

This does what I want (here launched with two tasks, e.g. from a batch script with --ntasks=2), giving each task a distinct GPU:

srun --gres=gpu:1 bash -c 'CUDA_VISIBLE_DEVICES=$SLURM_PROCID env' | grep CUDA_VISIBLE

CUDA_VISIBLE_DEVICES=1
CUDA_VISIBLE_DEVICES=0

but it doesn't make use of --gpus-per-task.
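For comparison, a minimal sketch of the --gpus-per-task form, assuming Slurm 19.05 or later with SelectType=select/cons_tres (which --gpus-per-task requires). With GPU binding, each task is handed its own GPU directly, so manually setting CUDA_VISIBLE_DEVICES from $SLURM_PROCID is no longer needed:

```shell
# Hypothetical invocation: two tasks, one distinct GPU bound to each task.
# Requires select/cons_tres; plain select/cons_res does not support --gpus-per-task.
srun --ntasks=2 --gpus-per-task=1 \
     bash -c 'echo "task $SLURM_PROCID: CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"'
```

Note that with per-task binding each task typically sees its GPU as device 0 (a different physical GPU per task), which is exactly why a script that indexes GPUs by $SLURM_PROCID stops being necessary.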


