Hybrid Slurm Job


Slurm is a popular workload manager / job scheduler. This page shows an example job script for launching a program that is parallelized with MPI and OpenMP at the same time. You may find the toy program below useful to get started.

Slurm Job Script

This hybrid MPI+OpenMP job starts the parallel program "hello.exe" with 4 MPI processes, each running 3 OpenMP threads, distributed over 2 compute nodes.

#!/bin/zsh

### Job name
#SBATCH --job-name=HelloHybrid

### 2 compute nodes
#SBATCH --nodes=2

### 4 MPI ranks
#SBATCH --ntasks=4

### 2 MPI ranks per node
#SBATCH --ntasks-per-node=2

### 3 CPUs (one per OpenMP thread) per MPI rank
#SBATCH --cpus-per-task=3

### set the number of OpenMP threads to the number of CPUs requested per task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

### Change to working directory
cd /home/usr/workingdirectory

### Run your parallel application
srun hello.exe
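
Assuming the script above is saved as hybrid_job.sh (the file name is arbitrary), it can be submitted to Slurm with sbatch and monitored with squeue:

sbatch hybrid_job.sh      # submit the job script
squeue -u $USER           # check the state of your jobs in the queue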

Hybrid Fortran Toy Program

You can use the following hybrid Fortran 90 toy program to test the above job script.

program hello
   use mpi
   use omp_lib

   implicit none

   integer :: rank, nranks, ierror, resultlen, threadid
   character(len=MPI_MAX_PROCESSOR_NAME) :: name

   ! initialize MPI and query the rank, the number of ranks and the host name
   call MPI_INIT(ierror)
   call MPI_COMM_SIZE(MPI_COMM_WORLD, nranks, ierror)
   call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierror)
   call MPI_GET_PROCESSOR_NAME(name, resultlen, ierror)

!$omp parallel private(threadid)
   ! every OpenMP thread of every MPI rank prints one line
   threadid = omp_get_thread_num()
   print*, 'node: ', trim(name), '  rank:', rank, ', thread_id:', threadid
!$omp end parallel
   
   call MPI_FINALIZE(ierror)
   
end program
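
The program has to be compiled with an MPI compiler wrapper and with OpenMP enabled before the job script can run it. The exact wrapper and flag depend on the compiler and MPI modules loaded on your system; with GCC and a typical MPI installation the build might look like this (the source file name hello.f90 is an assumption):

# load your site's compiler and MPI modules first, e.g. via "module load ..."
mpif90 -fopenmp hello.f90 -o hello.exe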

Job Output Example

When the program output is sorted, it may look like this:

 node: ncm1018.hpc.itc.rwth-aachen.de  rank:           0 , thread_id:           0
 node: ncm1018.hpc.itc.rwth-aachen.de  rank:           0 , thread_id:           1
 node: ncm1018.hpc.itc.rwth-aachen.de  rank:           0 , thread_id:           2
 node: ncm1018.hpc.itc.rwth-aachen.de  rank:           1 , thread_id:           0
 node: ncm1018.hpc.itc.rwth-aachen.de  rank:           1 , thread_id:           1
 node: ncm1018.hpc.itc.rwth-aachen.de  rank:           1 , thread_id:           2
 node: ncm1019.hpc.itc.rwth-aachen.de  rank:           2 , thread_id:           0
 node: ncm1019.hpc.itc.rwth-aachen.de  rank:           2 , thread_id:           1
 node: ncm1019.hpc.itc.rwth-aachen.de  rank:           2 , thread_id:           2
 node: ncm1019.hpc.itc.rwth-aachen.de  rank:           3 , thread_id:           0
 node: ncm1019.hpc.itc.rwth-aachen.de  rank:           3 , thread_id:           1
 node: ncm1019.hpc.itc.rwth-aachen.de  rank:           3 , thread_id:           2
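
Because all threads print concurrently, the order of the lines in the raw output file varies from run to run. Assuming the default output file name slurm-<jobid>.out, the lines can be grouped by node, rank and thread as above with the sort utility:

sort slurm-<jobid>.out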

Taking NUMA into Account

t.b.a.