Difference between revisions of "How to Use MPI"

From HPC Wiki
Jump to navigation Jump to search
 
(44 intermediate revisions by 7 users not shown)
Line 1: Line 1:
 +
[[Category:HPC-User]]
 
== Basics ==
 
== Basics ==
  
This will give you a general overview of how to compile and execute a program that has been [[Parallel_Programming|parallelized]] with [[MPI]]. Many of the options listed below are the same for both Open MPI and Intel MPI, however, be care if they do differentiate.
+
This page will give you a general overview of how to compile and execute a program that has been [[Parallel_Programming|parallelized]] with [[MPI]]. Many of the options listed below are the same for both Open MPI and Intel MPI, however, be careful and look up if they indeed behave the same way.
  
  
== Loading the Correct Modules ==
+
__TOC__
  
=== Open MPI ===
 
To ensure that the Open MPI module is loaded, check the output of the following command:
 
$ module list
 
  
In case it has not been loaded yet, type
+
== How to Compile MPI Code ==
$ module load openmpi
 
 
 
If you are currently using Intel MPI, type
 
$ module switch intelmpi openmpi
 
 
 
=== Intel MPI ===
 
Look for "intelmpi" in the output of this command:
 
$ module list
 
 
 
In case it did not appear in the output, you have to load the module by typing
 
$ module load intelmpi
 
  
If Open MPI has already been loaded, switch to Intel MPI like this:
+
Before continuing, please make sure that the openmpi or intelmpi module is loaded (go [[Modules|here]] to see how to load/switch modules).
$ module switch openmpi intelmpi
 
  
 
+
There are several so called MPI "compiler wrappers", e.g. <code>mpicc</code>. These take care of including the correct MPI libraries for the programming language you are using. But they share most command line options. Depending on whether your code is written in C, C++ or Fortran, follow the instructions in one of the tables below. Make sure to replace the arguments inside <code><…></code> with specific values.
== How to Compile MPI Code ==
 
 
 
There are several so called MPI "compiler wrappers", e.g. "mpicc". These take care of including the correct MPI libraries for each programming language etc. for you. But they share most command line options. Depending on whether your code is written in C, C++ or Fortran, follow the instructions in one of the tables below. Make sure to replace the arguments inside <…> with specific values.
 
  
 
=== Open MPI ===
 
=== Open MPI ===
  
Use the following command to specify the program you would like to compile (replace <src_file> with a path to your code, e. g. ./myprog.c).  
+
Use the following command to specify the program you would like to compile (replace <code><src_file></code> with a path to your code, e. g. <code>./myprog.c</code>).  
 
{| class="wikitable" style="width: 40%;"
 
{| class="wikitable" style="width: 40%;"
 
| Language || Command
 
| Language || Command
Line 39: Line 22:
 
| C || <code>$ mpicc <src_file> -o <name_of_executable></code>
 
| C || <code>$ mpicc <src_file> -o <name_of_executable></code>
 
|-
 
|-
| C++ || <code>$ mpicxx <src_file> -o <name_of_executable</code>
+
| C++ || <code>$ mpicxx <src_file> -o <name_of_executable></code>
 
|-
 
|-
 
| Fortran || <code>$ mpifort <src_file> -o <name_of_executable></code>
 
| Fortran || <code>$ mpifort <src_file> -o <name_of_executable></code>
 
|}
 
|}
  
You can also type the command <code>$ mpicc [options]</code>, <code>$ mpicxx [options]</code> or <code>$ mpifort [options]</code>. There are a few options that come with Open MPI, however, options are more important for running your program. The compiler options might be useful to fetch more information about the Open MPI module you're using.
+
You can also type the command <code>$ mpicc [options]</code>, <code>$ mpicxx [options]</code> or <code>$ mpifort [options]</code>. There are a few options that come with Open MPI, however, options are more important for running your program. The compiler options might be useful to fetch more information about the Open MPI module you are using. Compile options unknown to the MPI compiler wrapper are simply forwarded to the underlying [[Compiler|compiler]] e.g. <code>icc</code>.
 
{| class="wikitable" style="width: 40%;"
 
{| class="wikitable" style="width: 40%;"
 
|Options || Function
 
|Options || Function
Line 53: Line 36:
 
|}
 
|}
  
For RWTH cluster users:
+
Instead of typing the compiler wrapper <code>mpicc</code>, <code>mpicxx</code> or <code>mpifort</code> explicitly, on most systems (e.g. the RWTH Compute Cluster) there are environment variables defined, which you can use to call the MPI compiler in a more general manner. Simple use <code>$MPICC</code>, <code>$MPICXX</code> or <code>$MPIFC</code> for the compiler you want to use and let the module system handle the dirty details of using the appropriate command.
Instead of typing the compiler wrapper <code>mpicc</code> etc., you can simply put one of the environment variables <code>$MPICC</code>, <code>$MPICXX</code> or <code>$MPIFC</code> for Fortran codes. They are already set by the module system so that you do not have to worry about which compiler module to use.
 
  
 
=== Intel MPI ===
 
=== Intel MPI ===
 +
The Intel MPI is shipped with ''two'' sets of compiler wrappers (for GCC and for Intel compilers); it is rather important to use the right ones. Especially when using any software building tolls like [[Cmake|CMake]], double-check carefully about the right compiler wrappers being actually used, otherwise you will call the wrong compilers internally.
  
Use the following command to specify the program you would like to compile (replace <src_file> with a path to your code, e. g. ./myprog.c).  
+
Use the following command to specify the program you would like to compile (replace <code><src_file></code> with a path to your code, e. g. <code>./myprog.c</code>).  
{| class="wikitable" style="width: 40%;"
+
{| class="wikitable" style="width: 70%;"
 
| Compiler Driver || C || C++ || Fortran
 
| Compiler Driver || C || C++ || Fortran
 
|-
 
|-
Line 67: Line 50:
 
|}
 
|}
  
You can also type the command <code>$ mpicc [options] <src_file> -o <name></code> etc., where [options] can be replaced with one or more of the ones listed below. Intel MPI comes with rather advanced compiler options, that are mainly aimed at optimization and analyzing your code with the help of Intel tools.
+
You can also type the command <code>$ mpicc [options] <src_file> -o <name></code> etc., where <code>[options]</code> can be replaced with one or more of the ones listed below. Intel MPI comes with rather advanced compiler options, that are mainly aimed at optimization and analyzing your code with the help of Intel tools.
{| class="wikitable" style="width: 40%;"
+
{| class="wikitable" style="width: 70%;"
 
|Options || Function
 
|Options || Function
 
|-
 
|-
| -g || enable debug mode
+
| -g || enable debugging information
 
|-  
 
|-  
| -O || enable compiler optimization
+
| -OX || enable compiler optimization, where <code>X</code> represents the optimization level and is one of 0, 1, 2, 3
 
|-
 
|-
| -v || print compiler version
+
| -v || print the compiler version
 
|}
 
|}
  
For RWTH cluster users:
+
Instead of typing the compiler wrapper <code>mpicc</code>, <code>mpicxx</code> or <code>mpifort</code> explicitly, on most systems (e.g. the RWTH Compute Cluster) there are environment variables defined, which you can use to call the MPI compiler in a more general manner. Simple use <code>$MPICC</code>, <code>$MPICXX</code> or <code>$MPIFC</code> for the compiler you want to use and let the module system handle the dirty details of using the appropriate command.
Instead of typing the compiler wrapper <code>mpicc</code> etc., you can simply put one of the environment variables <code>$MPICC</code>, <code>$MPICXX</code> or <code>$MPIFC</code> for Fortran codes. They are already set by the module system so that you do not have to worry about which compiler module to use.
 
 
 
  
 
== How to Run an MPI Executable ==
 
== How to Run an MPI Executable ==
  
Ensure that your the correct MPI module is loaded (see [[#Loading_the_Correct_Modules|here]]). Once again, the command line options slightly differ between Intel MPI and Open MPI.
+
Ensure that the correct MPI module is loaded (go [[Modules|here]] to see how to load/switch modules). Once again, the command line options slightly differ between Intel MPI and Open MPI.
In order to start any MPI program, type the following command where <executable> specifies the path to your application:
+
In order to start any MPI program, type the following command where <code><executable></code> specifies the path to your application:
 
  $ mpirun -n <num_procs> [options] <executable>
 
  $ mpirun -n <num_procs> [options] <executable>
 
Note that <code>mpiexec</code> and <code>mpirun</code> are synonymous in Open MPI, in Intel MPI it's <code>mpiexec.hydra</code> and <code>mpirun</code>.
 
Note that <code>mpiexec</code> and <code>mpirun</code> are synonymous in Open MPI, in Intel MPI it's <code>mpiexec.hydra</code> and <code>mpirun</code>.
  
Don’t forget to put the -np” or -n” option as explained below. All the other options listed below are not mandatory.
+
Don’t forget to put the <code>-np</code> or <code>-n</code> option as explained below. All the other options listed below are not mandatory.
  
 
=== Open MPI ===
 
=== Open MPI ===
Line 103: Line 84:
 
|-
 
|-
 
| -wdir <directory> || change to directory specified before executing the program
 
| -wdir <directory> || change to directory specified before executing the program
|-
 
| -nw || complete command when all MPI processes have been launched successfully
 
 
|-
 
|-
 
| -path <path> || look for executables in the directory specified
 
| -path <path> || look for executables in the directory specified
Line 124: Line 103:
 
| -n <num_procs> || number of processes to run
 
| -n <num_procs> || number of processes to run
 
|-
 
|-
| -ppn <num_procs> || number of processes per node
+
| -ppn <num_procs> || number of processes per node; for that to work, it may be necessary to set the environment variable <code>I_MPI_JOB_RESPECT_PROCESS_PLACEMENT=off</code>
 
|-
 
|-
 
| -wdir <directory> || change to directory specified before executing the program
 
| -wdir <directory> || change to directory specified before executing the program
Line 135: Line 114:
 
|}
 
|}
  
 +
=== Process Binding ===
  
== Options for Binding in Open MPI ==
+
Binding processes means telling your system how to place the processes onto the architecture. This can be done by adding command-line options when calling <code>mpiexec</code> and/or setting some environment variables (both verndor-specific), and may enhance (or kill) the performance of your application. In order to learn more about that, go here for [[Binding/Pinning#Options_for_Binding_in_Open_MPI|Open MPI]] and there for [[Binding/Pinning#Options_for_Binding_in_Intel_MPI|Intel MPI]].
  
Binding processes to certain hardware units can be done by specifying the options below when executing a program. This is a more advanced way of running an application and also requires knowledge about your system's architecture, e. g. how many cores there are. If none of these options are given, default values are set.
+
In most MPI vendors the Process Binding is '''on''' by default nowadays. All MPIs assume there is only ''one'' MPI job running on hardware at any time. If you start more than one MPI job (e.g. in context of non-exclusive batch job in a batch sheduler w/o binding management, or just interactive tests), all '''N''' jobs would belive they are only children, ''and would bind themselves to the very same cores'', resulting in ''1/'''N''''' performance while remainig cores idling.
  
{| class="wikitable" style="width: 80%;"
+
== References ==
| Option || Function || Explanation
 
|-
 
| --bind-to <arg> || bind to the hardware component; arg can be one of: none, hwthread, core, l1cache, l2cache, l3cache, socket, numa, board; default values: "none" for np <= 2, "socket" otherwise ||
 
|-
 
| -bind-to-core || bind processes to cores ||
 
|-
 
| -bind-to-socket || bind processes to sockets ||
 
|}
 
  
 +
[https://software.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-linux/top/command-reference/compilation-commands.html Intel MPI compiler options]
  
== References ==
+
Manual page for [https://www.open-mpi.org/doc/v4.0/man1/mpiexec.1.php Open MPI's] and [https://software.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-linux/top/command-reference/mpirun.html Intel MPI's] mpiexec/mpirun command.
 
 
[https://software.intel.com/en-us/mpi-developer-reference-linux-compiler-command-options Intel MPI compiler options]
 

Latest revision as of 09:13, 12 June 2020

Basics

This page will give you a general overview of how to compile and execute a program that has been parallelized with MPI. Many of the options listed below are the same for both Open MPI and Intel MPI, however, be careful and look up if they indeed behave the same way.



How to Compile MPI Code

Before continuing, please make sure that the openmpi or intelmpi module is loaded (go here to see how to load/switch modules).

There are several so called MPI "compiler wrappers", e.g. mpicc. These take care of including the correct MPI libraries for the programming language you are using. But they share most command line options. Depending on whether your code is written in C, C++ or Fortran, follow the instructions in one of the tables below. Make sure to replace the arguments inside <…> with specific values.

Open MPI

Use the following command to specify the program you would like to compile (replace <src_file> with a path to your code, e. g. ./myprog.c).

Language Command
C $ mpicc <src_file> -o <name_of_executable>
C++ $ mpicxx <src_file> -o <name_of_executable>
Fortran $ mpifort <src_file> -o <name_of_executable>

You can also type the command $ mpicc [options], $ mpicxx [options] or $ mpifort [options]. There are a few options that come with Open MPI, however, options are more important for running your program. The compiler options might be useful to fetch more information about the Open MPI module you are using. Compile options unknown to the MPI compiler wrapper are simply forwarded to the underlying compiler e.g. icc.

Options Function
-showme:help print a short help message about the usage and lists all compiler options
-showme:version show Open MPI version

Instead of typing the compiler wrapper mpicc, mpicxx or mpifort explicitly, on most systems (e.g. the RWTH Compute Cluster) there are environment variables defined, which you can use to call the MPI compiler in a more general manner. Simple use $MPICC, $MPICXX or $MPIFC for the compiler you want to use and let the module system handle the dirty details of using the appropriate command.

Intel MPI

The Intel MPI is shipped with two sets of compiler wrappers (for GCC and for Intel compilers); it is rather important to use the right ones. Especially when using any software building tolls like CMake, double-check carefully about the right compiler wrappers being actually used, otherwise you will call the wrong compilers internally.

Use the following command to specify the program you would like to compile (replace <src_file> with a path to your code, e. g. ./myprog.c).

Compiler Driver C C++ Fortran
GCC $ mpicc <src_file> -o <name> $ mpicpc <src_file> -o <name> $ mpifort <src_file> -o <name>
Intel $ mpiicc <src_file> -o <name> $ mpiicpc <src_file> -o <name> $ mpiifort <src_file> -o <name>

You can also type the command $ mpicc [options] <src_file> -o <name> etc., where [options] can be replaced with one or more of the ones listed below. Intel MPI comes with rather advanced compiler options, that are mainly aimed at optimization and analyzing your code with the help of Intel tools.

Options Function
-g enable debugging information
-OX enable compiler optimization, where X represents the optimization level and is one of 0, 1, 2, 3
-v print the compiler version

Instead of typing the compiler wrapper mpicc, mpicxx or mpifort explicitly, on most systems (e.g. the RWTH Compute Cluster) there are environment variables defined, which you can use to call the MPI compiler in a more general manner. Simple use $MPICC, $MPICXX or $MPIFC for the compiler you want to use and let the module system handle the dirty details of using the appropriate command.

How to Run an MPI Executable

Ensure that the correct MPI module is loaded (go here to see how to load/switch modules). Once again, the command line options slightly differ between Intel MPI and Open MPI. In order to start any MPI program, type the following command where <executable> specifies the path to your application:

$ mpirun -n <num_procs> [options] <executable>

Note that mpiexec and mpirun are synonymous in Open MPI, in Intel MPI it's mpiexec.hydra and mpirun.

Don’t forget to put the -np or -n option as explained below. All the other options listed below are not mandatory.

Open MPI

Option Function
-np <num_procs> or -n <num_procs> number of processes to run
-npersocket <num_procs> number of processes per socket
-npernode <num_procs> number of processes per node
-wdir <directory> change to directory specified before executing the program
-path <path> look for executables in the directory specified
-q or -quiet suppress helpful messages
-output-filename <name> redirect output into the file <name>.<rank>
-x <env_variable> export the specified environment variable to the remote nodes where the program will be executed
--help list all options available with an explanation

Intel MPI

Option Function
-n <num_procs> number of processes to run
-ppn <num_procs> number of processes per node; for that to work, it may be necessary to set the environment variable I_MPI_JOB_RESPECT_PROCESS_PLACEMENT=off
-wdir <directory> change to directory specified before executing the program
-path <path> look for executables in the directory specified
-outfile-pattern <name> redirect stdout to file
--help list all options available with an explanation

Process Binding

Binding processes means telling your system how to place the processes onto the architecture. This can be done by adding command-line options when calling mpiexec and/or setting some environment variables (both verndor-specific), and may enhance (or kill) the performance of your application. In order to learn more about that, go here for Open MPI and there for Intel MPI.

In most MPI vendors the Process Binding is on by default nowadays. All MPIs assume there is only one MPI job running on hardware at any time. If you start more than one MPI job (e.g. in context of non-exclusive batch job in a batch sheduler w/o binding management, or just interactive tests), all N jobs would belive they are only children, and would bind themselves to the very same cores, resulting in 1/N performance while remainig cores idling.

References

Intel MPI compiler options

Manual page for Open MPI's and Intel MPI's mpiexec/mpirun command.