SLURM


#SBATCH Usage

If you are writing a jobscript for a SLURM batch system, the magic cookie is "#SBATCH". To use it, start a new line in your script with "#SBATCH". Following that, you can put one of the parameters shown below, where the placeholder written in <...> should be replaced with an appropriate value.

Basic settings:

Parameter Function
--job-name=<name> job name
--output=<path> path to the file where the job (error) output is written

Requesting resources:

Parameter Function
--time=<runlimit> runtime limit in the format hours:minutes:seconds; once the specified time is up, the job will be killed by the scheduler
--mem=<memlimit> job memory request, usually an integer followed by a unit prefix (e.g. --mem=1G for 1 GB)
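
Put together, the header of a jobscript using these settings might look like the following sketch (the job name, output file, and limits are just example values):

#!/bin/bash
#SBATCH --job-name=MYJOB
#SBATCH --output=MYJOB_OUTPUT
#SBATCH --time=01:00:00
#SBATCH --mem=2G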

Parallel programming:

If you'd like to run a parallel job on a cluster that is managed by SLURM, you have to make this explicit. To do so, launch your program with the command "srun <my_executable>" in your jobscript.
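
For instance, a minimal sketch (assuming the program is called myapp.exe and four tasks are requested as an arbitrary example value) could contain:

### Request four parallel tasks; srun will start one copy of the program per task
#SBATCH --ntasks=4

srun myapp.exe

The OpenMP and MPI settings below show more fine-grained ways to lay out tasks and threads across nodes and cores.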

Settings for OpenMP:

Parameter Function
--nodes=1 start a parallel job for a shared-memory system on only one node
--cpus-per-task=<num_threads> number of threads to execute the OpenMP application with
--ntasks-per-core=<num_hyperthreads> number of hyperthreads per core; i.e. any value greater than 1 will turn on hyperthreading (the possible maximum depends on your CPU)
--ntasks-per-node=1 for OpenMP, use one task per node only

Settings for MPI:

Parameter Function
--nodes=<num_nodes> start a parallel job for a distributed-memory system on several nodes
--cpus-per-task=1 for MPI, use one task per CPU
--ntasks-per-core=1 disable hyperthreading
--ntasks-per-node=<num_procs> number of processes per node (the possible maximum depends on your nodes)

Jobscript Examples

This serial job will run a given executable, in this case "myapp.exe".

#!/bin/bash

### Job name
#SBATCH --job-name=MYJOB

### File for the output
#SBATCH --output=MYJOB_OUTPUT

### Time your job needs to execute, e.g. 15 min 30 sec
#SBATCH --time=00:15:30

### Memory your job needs, e.g. 1 GB
#SBATCH --mem=1G

### The last part consists of regular shell commands:
### Change to working directory
cd /home/usr/workingdirectory

### Execute your application
myapp.exe
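
Once this jobscript is saved to a file (assumed here to be called myjob.sh), it can be handed to the batch system with sbatch, and its state can be checked with squeue:

sbatch myjob.sh
squeue -u $USER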

This OpenMP job will start the parallel program "myapp.exe" with 24 threads.

#!/bin/bash

### Job name
#SBATCH --job-name=OMPJOB

### File for the output
#SBATCH --output=OMPJOB_OUTPUT

### Time your job needs to execute, e.g. 30 min
#SBATCH --time=00:30:00

### Memory your job needs, e.g. 500 MB
#SBATCH --mem=500M

### Use one node for parallel jobs on shared-memory systems
#SBATCH --nodes=1

### Number of threads to use, e.g. 24
#SBATCH --cpus-per-task=24

### Number of hyperthreads per core (1 disables hyperthreading)
#SBATCH --ntasks-per-core=1

### Tasks per node (for shared-memory parallelisation, use 1)
#SBATCH --ntasks-per-node=1

### The last part consists of regular shell commands:
### Set the number of threads in your cluster environment to the value specified above
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

### Change to working directory
cd /home/usr/workingdirectory

### Run your parallel application
srun myapp.exe
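
For a distributed-memory (MPI) job, a jobscript following the MPI settings above might look like this sketch; the node and process counts are arbitrary example values, and myapp.exe is assumed to be an MPI program.

#!/bin/bash

### Job name
#SBATCH --job-name=MPIJOB

### File for the output
#SBATCH --output=MPIJOB_OUTPUT

### Time your job needs to execute, e.g. 50 min
#SBATCH --time=00:50:00

### Memory your job needs per node, e.g. 2 GB
#SBATCH --mem=2G

### Use several nodes for parallel jobs on distributed-memory systems, e.g. 2
#SBATCH --nodes=2

### For MPI, use one task per CPU
#SBATCH --cpus-per-task=1

### Disable hyperthreading
#SBATCH --ntasks-per-core=1

### Number of processes per node, e.g. 24
#SBATCH --ntasks-per-node=24

### The last part consists of regular shell commands:
### Change to working directory
cd /home/usr/workingdirectory

### Run your parallel application (srun starts one MPI process per task)
srun myapp.exe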
