Difference between revisions of "Batch-Scheduler"

From HPC Wiki

Revision as of 14:48, 14 November 2018

This page gives an overview of what a Batch-Scheduler can do and what pitfalls may exist. A more general description of why Batch-Schedulers are needed can be found here. There are different Schedulers around, e.g. SLURM, LSF and Torque. Click here to figure out which one you need.

Usage

To run a program in the batch system, you submit a Jobscript written for the Scheduler that the batch system uses. The Jobscript has to tell the Scheduler:

  • which resources your program needs, and how much of each (e.g. time and memory)
  • how you want to parallelize your program
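
As a rough illustration of these two points, a Jobscript might look like the sketch below. It assumes SLURM as the Scheduler (LSF and Torque use different directives), and every value as well as the final payload line is a placeholder rather than a recommendation.

```shell
#!/bin/bash
# Sketch of a SLURM Jobscript; all values are placeholders (assumption: SLURM).

# Resources: wall-clock time limit and memory per core
#SBATCH --time=00:10:00
#SBATCH --mem-per-cpu=1G

# Parallelization: one task that may use four cores
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4

# Use as many OpenMP threads as cores were granted.
# Falls back to 1 when the script runs outside of SLURM.
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"

# Stand-in payload: replace with your own program call,
# e.g. ./my_program (hypothetical name).
echo "Running on $(uname -n) with ${OMP_NUM_THREADS} thread(s)"
```

With SLURM, such a file would typically be submitted with sbatch, which reads the #SBATCH lines as if they were command-line options.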

There are in general four types of parallelization:

  • serial (no parallelization)
  • shared memory only (e.g. OpenMP)
  • distributed memory only (e.g. MPI)
  • hybrid parallelization (shared and distributed memory combined, e.g. MPI together with OpenMP)
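
One way to picture the four types is through the number of tasks (processes) and the number of cores per task that a job requests. The sketch below assumes SLURM, where these are set with --ntasks and --cpus-per-task; the helper function classify_job is hypothetical and only serves to illustrate the mapping.

```shell
#!/bin/bash
# Assumption: SLURM. The values below request a hybrid layout.
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=8

# classify_job TASKS THREADS (hypothetical helper)
# Prints which of the four types a given layout corresponds to.
classify_job() {
    local tasks="$1" threads="$2"
    if [ "$tasks" -gt 1 ] && [ "$threads" -gt 1 ]; then
        echo "hybrid"
    elif [ "$tasks" -gt 1 ]; then
        echo "distributed memory"
    elif [ "$threads" -gt 1 ]; then
        echo "shared memory"
    else
        echo "serial"
    fi
}

# Inside a SLURM job these variables describe the granted layout;
# outside of SLURM they are unset, so both default to 1.
classify_job "${SLURM_NTASKS:-1}" "${SLURM_CPUS_PER_TASK:-1}"
```

For example, classify_job 1 8 prints "shared memory" and classify_job 4 1 prints "distributed memory".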

Serial Jobs

Shared Memory Parallelization

Distributed Memory Parallelization

Hybrid Parallelization

Advanced Usage

This section is meant to cover:

  • a brief mention of multi-node jobs that do not use MPI
  • why long-running jobs should be split into several shorter ones (Chain Jobs)
  • a brief mention of array jobs
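
To give an idea of what an array job can look like, here is a minimal sketch. It assumes SLURM, where --array starts several copies of the same Jobscript and each copy receives its own index in SLURM_ARRAY_TASK_ID. The input file naming scheme is made up for illustration.

```shell
#!/bin/bash
# Assumption: SLURM. Starts four array tasks with the indices 0 to 3.
#SBATCH --time=00:10:00
#SBATCH --array=0-3

# Each array task sees its own index. Outside of SLURM the variable
# is unset, so default to index 0.
TASK_ID="${SLURM_ARRAY_TASK_ID:-0}"

# Hypothetical naming scheme: one input file per array task.
INPUT_FILE="input_${TASK_ID}.dat"

# Stand-in payload: replace with your own program call.
echo "Array task ${TASK_ID} would process ${INPUT_FILE}"
```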