Jobscript


General

A jobscript is used to submit a job to a batch system. It is an ordinary shell script (sh-file) and uses the same format, extended by lines that start with the so-called magic cookie #BSUB. These lines let you specify job parameters, e.g. the time and memory your application requires or, if your code runs in parallel, the number of compute slots to use.


Structure

Like a regular sh-file, your jobscript should start with a shebang (#!), e.g. if you are using the Z shell (zsh):

#!/bin/env zsh

Usually, this first line is followed by several "#BSUB" directives, which are explained in more depth in the next section. The third part of a jobscript consists of shell commands, for example to change to your working directory and to execute your application.
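The three parts described above can be sketched as follows. This is a minimal illustration, not a recommended template: the job name, file names and commands are placeholders, and the options used are the ones listed in the next section. The snippet writes the jobscript to a file called jobscript.sh and then counts the directives in it.

```shell
# Write a minimal three-part jobscript to jobscript.sh (all values are placeholders).
cat > jobscript.sh <<'EOF'
#!/bin/env zsh

# --- part 2: directives for the batch system ---
#BSUB -J example_job
#BSUB -o example_job.out
#BSUB -W 00:15

# --- part 3: ordinary shell commands ---
cd "$HOME"                          # placeholder for your working directory
echo "job started on $(hostname)"   # placeholder for your application
EOF

# Count the lines that start with the magic cookie.
grep -c '^#BSUB' jobscript.sh
```

Only lines that begin with "#BSUB" are counted as directives; to the shell itself they are plain comments.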


#BSUB Usage

To use the magic cookie, start a new line in your script with "#BSUB", followed by one of the parameters shown below. Replace each word written in <...> with a value.

Basic settings:

Parameter    Function
-J <name>    job name
-o <path>    path to the file where the job output is written
-e <path>    path to the file for the job error output

Requesting resources:

Parameter         Function                                    Default
-W <runlimit>     runtime limit in the format [hour:]minute   00:15
-M <memlimit>     memory limit per process in MB              512
-S <stacklimit>   limit of stack size per process in MB       10

Once the runtime limit is reached, the job is killed by the scheduler.
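As a small aid for the [hour:]minute format, the following sketch converts a plain number of minutes into a value for -W. The function name to_runlimit is made up for this illustration and is not part of the batch system.

```shell
# Convert a number of minutes into the hour:minute form used by -W.
# to_runlimit is a hypothetical helper, not a batch system command.
to_runlimit() {
    minutes=$1
    printf '%02d:%02d\n' $((minutes / 60)) $((minutes % 60))
}

to_runlimit 15    # prints 00:15, the default shown in the table
to_runlimit 90    # prints 01:30
```

The second result would be written in a jobscript as "#BSUB -W 01:30".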

Parallel programming:

Parameter          Function
-a openmp          start a parallel job for a shared-memory system
-n <num_threads>   number of threads to execute the OpenMP application with
-a openmpi         start a parallel job for a distributed-memory system
-n <num_procs>     number of processes to execute the MPI application with
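A fragment of a jobscript for an OpenMP program might look like the sketch below. It assumes that the thread count is passed to the program through the standard OpenMP variable OMP_NUM_THREADS; whether the batch system already sets this variable depends on the local configuration, so the fragment only supplies a fallback. The value 4 and the executable name are placeholders.

```shell
# Fragment of a jobscript for an OpenMP program (values are placeholders).
#BSUB -a openmp
#BSUB -n 4

# Assumption: the program reads its thread count from OMP_NUM_THREADS.
# Keep a value that is already set, otherwise fall back to the 4 requested above.
export OMP_NUM_THREADS="${OMP_NUM_THREADS:-4}"
echo "running with $OMP_NUM_THREADS threads"

# ./my_openmp_app    # hypothetical executable, commented out in this sketch
```

For an MPI program the two directives would read -a openmpi and -n <num_procs> instead; how the processes are then launched depends on the local MPI installation.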

The big advantage of jobscripts is that the parameters prefixed with "#BSUB" are treated just like command line arguments to bsub. Setting them inside your jobscript makes them easier to adjust and to look up later.
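To illustrate the "look them up later" part, the following sketch lists the options stored in a jobscript with sed. The jobscript content is a made-up example that is written to a temporary file first.

```shell
# Create a small jobscript to inspect (values are illustrative).
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/env zsh
#BSUB -J lookup_demo
#BSUB -W 00:30
#BSUB -M 1024
echo "hello"
EOF

# Print the stored options, one per line, without the "#BSUB" prefix.
options=$(sed -n 's/^#BSUB[[:space:]]*//p' "$script")
echo "$options"

# The same request written as command line arguments (not run here):
#   bsub -J lookup_demo -W 00:30 -M 1024 < jobscript.sh

rm -f "$script"
```

The output is the three option lines -J lookup_demo, -W 00:30 and -M 1024, which are exactly the arguments shown in the commented bsub call.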


Job Submission

The following command submits your job to the batch system, which controls the resources for computation.

$ bsub < jobscript.sh

Note that all incoming jobs (defined in a jobscript) are added to a queue. The scheduler decides when a job runs. The waiting time depends on various factors, e.g. the time and memory you asked for in your jobscript. As a rule of thumb, the more resources your job needs, the longer it will be queued.

You can always check the current status (PEND or RUN) of your submitted jobs and their IDs with the following shell command. Once all your jobs have finished, the command prints "No unfinished job found".

$ bjobs

To remove a job that you submitted, use this command:

$ bkill <job_id>
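The job ID needed for bkill can also be captured at submission time. The sketch below assumes that bsub confirms a submission with a line of the form "Job <12345> is submitted to queue <normal>."; the helper name extract_job_id and the sample line are illustrative, and the real calls are shown only as comments because they need a batch system.

```shell
# Extract the numeric job ID from the confirmation line printed by bsub.
# Assumed format: "Job <12345> is submitted to queue <normal>."
extract_job_id() {
    sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p'
}

# Typical use on a cluster (not run here):
#   job_id=$(bsub < jobscript.sh | extract_job_id)
#   bjobs "$job_id"    # show the status of this job only
#   bkill "$job_id"    # remove the job again

# Demonstration with a sample confirmation line:
sample='Job <12345> is submitted to queue <normal>.'
job_id=$(printf '%s\n' "$sample" | extract_job_id)
echo "$job_id"
```

If the confirmation line on your system looks different, the sed pattern has to be adapted accordingly.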


References

Jobscript examples