Hybrid Slurm Job

From HPC Wiki

Latest revision as of 08:18, 4 September 2019

Slurm is a popular workload manager and job scheduler. This page gives an example job script for launching a program that is parallelized with both MPI and OpenMP. The toy program further down may be useful for getting started.

Slurm Job Script

This hybrid MPI+OpenMP job starts the parallel program "hello.exe" with 4 MPI processes, each running 3 OpenMP threads, distributed over 2 compute nodes (2 processes per node).

#!/bin/zsh

### Job name
#SBATCH --job-name=HelloHybrid

### 2 compute nodes
#SBATCH --nodes=2

### 4 MPI ranks
#SBATCH --ntasks=4

### 2 MPI ranks per node
#SBATCH --ntasks-per-node=2

### 3 CPU cores per MPI rank (one per OpenMP thread)
#SBATCH --cpus-per-task=3

### Set the number of OpenMP threads to the number of cores per MPI rank
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

### Change to working directory
cd /home/usr/workingdirectory

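### Run your parallel application
### (newer Slurm releases do not pass --cpus-per-task from sbatch on to srun, so it is repeated here)
srun --cpus-per-task=$SLURM_CPUS_PER_TASK hello.exe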
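
The resource requests in the job script have to be consistent with each other: the number of ranks must divide evenly over the nodes, and each rank needs one core per OpenMP thread. The following is a minimal sketch of that arithmetic. The function name hybrid_layout is hypothetical, and the fallback values 2, 4 and 3 are simply the numbers requested above, used when the script runs outside a Slurm job.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): print the layout implied by
# the number of nodes, the number of MPI ranks and the cores per rank.
hybrid_layout() {
    nodes=$1
    ntasks=$2
    cpus_per_task=$3
    ranks_per_node=$((ntasks / nodes))
    threads_total=$((ntasks * cpus_per_task))
    cores_per_node=$((ranks_per_node * cpus_per_task))
    echo "ranks_per_node=$ranks_per_node threads_total=$threads_total cores_per_node=$cores_per_node"
}

# Inside a job, Slurm sets these variables; the defaults after ":-" are
# assumptions that mirror the #SBATCH lines of the job script above.
hybrid_layout "${SLURM_JOB_NUM_NODES:-2}" "${SLURM_NTASKS:-4}" "${SLURM_CPUS_PER_TASK:-3}"
```

For the job above this gives 2 ranks per node, 12 threads in total and 6 cores in use on each node, which matches the 12 lines the toy program prints.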

Hybrid Fortran Toy Program

You can use this hybrid Fortran 90 toy program to test the job script above:

program hello
   use mpi
   use omp_lib

   implicit none

   ! nprocs: total number of MPI ranks; namelen: length of the node name
   integer :: rank, nprocs, namelen, ierror, threadid
   character(len=MPI_MAX_PROCESSOR_NAME) :: name

   call MPI_INIT(ierror)
   call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierror)
   call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierror)
   call MPI_GET_PROCESSOR_NAME(name, namelen, ierror)

   ! Each OpenMP thread of each MPI rank prints one line
!$omp parallel private(threadid)
   threadid = omp_get_thread_num()
   print *, 'node: ', trim(name), '  rank:', rank, ', thread_id:', threadid
!$omp end parallel
!$omp end parallel
   
   call MPI_FINALIZE(ierror)
   
end program
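
To produce "hello.exe", the program has to be built with an MPI compiler wrapper and with OpenMP enabled. The sketch below rests on several assumptions that depend on the cluster: the source file is called hello.f90, the wrapper is called mpifort (other installations use names such as mpif90 or mpiifort), and the underlying compiler is GNU. The helper openmp_flag is hypothetical and only maps a compiler family to its OpenMP flag.

```shell
#!/bin/sh
# Hypothetical helper: return the OpenMP flag for a compiler family.
openmp_flag() {
    case "$1" in
        gnu)   echo "-fopenmp" ;;   # gfortran
        intel) echo "-qopenmp" ;;   # Intel Fortran
        *)     echo "unknown compiler family: $1" >&2; return 1 ;;
    esac
}

# Assumed names: wrapper "mpifort" and source file "hello.f90".
FC_WRAPPER="${FC_WRAPPER:-mpifort}"

if command -v "$FC_WRAPPER" >/dev/null 2>&1 && [ -f hello.f90 ]; then
    # Build the hybrid executable used by the job script.
    "$FC_WRAPPER" "$(openmp_flag gnu)" hello.f90 -o hello.exe
else
    echo "skipping build: $FC_WRAPPER or hello.f90 not found"
fi
```

Check the local cluster documentation for the wrapper and modules that are actually provided before relying on these names.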

Job Output Example

The ranks and threads print their lines in arbitrary order. After sorting, the program output may look like this:

 node: ncm1018.hpc.itc.rwth-aachen.de  rank:           0 , thread_id:           0
 node: ncm1018.hpc.itc.rwth-aachen.de  rank:           0 , thread_id:           1
 node: ncm1018.hpc.itc.rwth-aachen.de  rank:           0 , thread_id:           2
 node: ncm1018.hpc.itc.rwth-aachen.de  rank:           1 , thread_id:           0
 node: ncm1018.hpc.itc.rwth-aachen.de  rank:           1 , thread_id:           1
 node: ncm1018.hpc.itc.rwth-aachen.de  rank:           1 , thread_id:           2
 node: ncm1019.hpc.itc.rwth-aachen.de  rank:           2 , thread_id:           0
 node: ncm1019.hpc.itc.rwth-aachen.de  rank:           2 , thread_id:           1
 node: ncm1019.hpc.itc.rwth-aachen.de  rank:           2 , thread_id:           2
 node: ncm1019.hpc.itc.rwth-aachen.de  rank:           3 , thread_id:           0
 node: ncm1019.hpc.itc.rwth-aachen.de  rank:           3 , thread_id:           1
 node: ncm1019.hpc.itc.rwth-aachen.de  rank:           3 , thread_id:           2
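
One possible way to obtain this ordering is a numeric sort on the rank column first and the thread column second. In the line format printed by the toy program these are the 4th and 7th whitespace-separated fields. The function name sort_hello_output is hypothetical, and the four sample lines are taken from the output above in shuffled order.

```shell
#!/bin/sh
# Hypothetical helper: sort toy-program output read from standard input,
# by rank (field 4) and then by thread id (field 7), both numerically.
sort_hello_output() {
    sort -k4,4n -k7,7n
}

# Shuffled sample lines standing in for real job output.
sorted=$(sort_hello_output <<'EOF'
 node: ncm1019.hpc.itc.rwth-aachen.de  rank:           3 , thread_id:           2
 node: ncm1018.hpc.itc.rwth-aachen.de  rank:           0 , thread_id:           1
 node: ncm1019.hpc.itc.rwth-aachen.de  rank:           2 , thread_id:           0
 node: ncm1018.hpc.itc.rwth-aachen.de  rank:           0 , thread_id:           0
EOF
)
printf '%s\n' "$sorted"
```

For a real job, the same function can read the job's output file instead of the sample lines.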

Taking NUMA into Account

t.b.a.