Batch-Scheduler
Revision as of 15:57, 14 November 2018
This page gives an overview of how to use a Batch-Scheduler and what pitfalls may exist. A more general description of why batch schedulers are needed can be found here. There are different schedulers available, e.g. SLURM, LSF and Torque. Click here to figure out which one you need.
Usage
If you want to execute a program in the batch system, you need to submit a jobscript tailored to the scheduler in use. From this script, the scheduler needs to learn:
- How many resources your program needs (e.g. time and memory)
- Which parallelization you are using for your program
While the specifics of how to provide this information depend on the scheduler in use, some general rules apply to most of them. How to apply these rules is shown in the example scripts (or referenced there). Not all pitfalls apply to all batch systems.
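As a minimal sketch of such a jobscript, assuming SLURM (LSF and Torque use #BSUB and #PBS directives in the same spirit): the resource requests sit in comment lines at the top, and the job name, limits and program call below are placeholders, not prescribed values.

```shell
#!/bin/bash
#SBATCH --job-name=example      # placeholder job name shown in the queue
#SBATCH --time=00:10:00         # requested wall-clock time
#SBATCH --mem-per-cpu=1G        # requested memory per core
#SBATCH --ntasks=1              # number of tasks (here: a single one)

# Replace this placeholder with the call to your own program.
echo "job running on $(hostname)"
```

With SLURM the script is submitted via sbatch jobscript.sh; LSF reads it from standard input (bsub < jobscript.sh) and Torque uses qsub jobscript.sh.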
Pitfalls
There are some general problems one needs to keep in mind:
- If you request more resources than the hardware can offer, the scheduler might not reject the job; instead, the job will be stuck in the queue forever.
- Be careful about whether the memory limit is per process or in total.
- The scheduler might not support pinning, so you might want to do this manually.
- There might be per-user quotas for the usage of the cluster.
Serial Jobs
Serial jobs execute programs which do not use any kind of parallelism. Thus, you typically only need to specify the time and memory resources your job needs. However, some batch systems allow both exclusive and non-exclusive usage of nodes. Pay attention that you do not block a whole node for a program which needs just one core!
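A serial jobscript can be as small as the following sketch, again assuming SLURM; the time and memory values and the program call are placeholders.

```shell
#!/bin/bash
#SBATCH --ntasks=1              # a single core is enough for a serial program
#SBATCH --time=01:00:00         # requested run time
#SBATCH --mem-per-cpu=2G        # memory for the one process
# Do NOT request exclusive node access (e.g. SLURM's --exclusive flag)
# for a single-core program -- that would block the whole node.

echo "serial job on one core"   # placeholder for your serial program
```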
Shared Memory Jobs
In hardware terms, shared memory parallelization means that you use multiple cores on the same node (which therefore share its memory). This means that you need to tell the scheduler that the requested cores should actually be on the same node. Furthermore, you should synchronize the number of threads spawned with the number of cores you requested (e.g. by explicitly setting the OpenMP environment variable OMP_NUM_THREADS).
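A sketch of this synchronization, assuming SLURM and an OpenMP program: the core count of 8 is an arbitrary example, and SLURM exports the granted value in SLURM_CPUS_PER_TASK, so the thread count can be derived from the request instead of being hard-coded twice.

```shell
#!/bin/bash
#SBATCH --nodes=1               # all cores must be on one node
#SBATCH --ntasks=1              # one process ...
#SBATCH --cpus-per-task=8       # ... with 8 cores for its threads

# Keep the thread count in sync with the request; SLURM provides the
# granted value as SLURM_CPUS_PER_TASK (8 is the fallback off-cluster).
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-8}

echo "running with $OMP_NUM_THREADS threads"   # placeholder for your OpenMP program
```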
Distributed Memory Jobs
Distributed memory parallelization is usually done via MPI, since it handles the correct start-up of the program across nodes. Again, pay attention that the MPI library and the resource requests match.
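Sketched for SLURM, where srun launches one instance of the program per requested task and the MPI library picks the rank count up from the scheduler (the rank count and program name are placeholders):

```shell
#!/bin/bash
#SBATCH --ntasks=16             # number of MPI ranks
#SBATCH --mem-per-cpu=2G        # note: here the memory request is per rank

# srun starts 16 instances of the program; do not hard-code the rank
# count a second time (e.g. via mpirun -np), or the two can diverge.
srun ./my_mpi_program           # placeholder executable
```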
Hybrid Jobs
Hybrid parallelization means that you run a job on different nodes (e.g. using MPI) while using shared memory parallelization (e.g. OpenMP) on each of them. This means that you need to specify at least the number of nodes and that you want to use more than one core per node. Distributing the job across different nodes is usually handled by the scheduler. However, not all schedulers fully support the parallelization on each node. In this case, this has to be done manually.
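A hybrid request combines the two previous sketches; assuming SLURM, with node, rank and thread counts as arbitrary example values and a placeholder program name:

```shell
#!/bin/bash
#SBATCH --nodes=2               # two nodes ...
#SBATCH --ntasks-per-node=1     # ... one MPI rank per node ...
#SBATCH --cpus-per-task=8       # ... and 8 cores per rank for threads

# One thread per granted core on each node (8 is the fallback off-cluster).
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-8}

srun ./my_hybrid_program        # placeholder executable
```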
Advanced Usage
Apart from the aforementioned types of jobs, the scheduler might offer even more types:
- Jobs across multiple nodes (distributed jobs or hybrid jobs) can also be parallelized without MPI. This goes beyond the scope of this page.
- Jobs running for several days should be split into smaller packages. Among the advantages are reduced queuing times and higher stability (e.g. against node failure). The splitting can either be done by submitting the parts manually or by using chain jobs.
- Sometimes it may be necessary to run the same program with different arguments (e.g. determining hyperparameters). In this case an array job may be used.
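An array job can be sketched as follows, assuming SLURM: one submission creates ten independent tasks, and each task reads its own index from SLURM_ARRAY_TASK_ID to pick its argument (the index range and the echoed program call are placeholders).

```shell
#!/bin/bash
#SBATCH --array=0-9             # ten independent tasks, indices 0..9

# Each array task sees its own index and can derive its argument
# from it (index 0 is the fallback outside the cluster).
idx=${SLURM_ARRAY_TASK_ID:-0}
echo "array task $idx uses parameter set $idx"   # placeholder for your program call
```

Chain jobs can be built in a similar spirit, e.g. with SLURM's sbatch --dependency=afterok:<jobid>, which holds the next part back until the previous one has finished successfully.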