Difference between revisions of "How to Use MPI"

From HPC Wiki

Latest revision as of 08:13, 12 June 2020

Basics

This page gives a general overview of how to compile and execute a program that has been parallelized with MPI. Many of the options listed below are the same for Open MPI and Intel MPI. Still, check the documentation to confirm that a given option behaves the same way in both.



How to Compile MPI Code

Before continuing, please make sure that the openmpi or intelmpi module is loaded (see the Modules page for how to load or switch modules).

MPI provides several so-called "compiler wrappers", e.g. mpicc. Each wrapper takes care of including the correct MPI libraries for the programming language you are using, and the wrappers share most command-line options. Depending on whether your code is written in C, C++ or Fortran, follow the instructions in one of the tables below. Make sure to replace the arguments inside <…> with specific values.

Open MPI

Use the command for your language to compile your program (replace <src_file> with the path to your source file, e.g. ./myprog.c).

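{| class="wikitable" style="width: 40%;"
| Language || Command
|-
| C || <code>$ mpicc <src_file> -o <name_of_executable></code>
|-
| C++ || <code>$ mpicxx <src_file> -o <name_of_executable></code>
|-
| Fortran || <code>$ mpifort <src_file> -o <name_of_executable></code>
|}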
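As a minimal sketch of the compile step for C, the following script writes a small test program and compiles it with mpicc. The file name myprog.c, the executable name myprog and the program itself are illustrative assumptions, not part of the original page. The script only compiles if mpicc is found on the PATH.

```shell
# Write a minimal MPI test program (hypothetical file name: myprog.c).
cat > myprog.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* id of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

# Compile with the C wrapper only if it is available in this shell.
if command -v mpicc >/dev/null 2>&1; then
    mpicc myprog.c -o myprog
else
    echo "mpicc not found; load an MPI module first" >&2
fi
```

The resulting executable can then be started with mpirun as described in the section on running MPI executables.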

The wrappers also accept options: $ mpicc [options], $ mpicxx [options] or $ mpifort [options]. Open MPI itself adds only a few wrapper options, since options matter more when running your program. The wrapper options listed below are mainly useful for getting information about the Open MPI module you are using. Compile options unknown to the MPI compiler wrapper are simply forwarded to the underlying compiler, e.g. icc.

{| class="wikitable" style="width: 40%;"
| Options || Function
|-
| -showme:help || print a short help message about the usage and list all compiler options
|-
| -showme:version || show the Open MPI version
|}

Instead of typing the compiler wrapper mpicc, mpicxx or mpifort explicitly, on most systems (e.g. the RWTH Compute Cluster) you can use predefined environment variables to call the MPI compiler in a more general manner. Simply use $MPICC, $MPICXX or $MPIFC for the compiler you want, and let the module system handle the details of selecting the appropriate command.
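The environment-variable approach can be sketched as a small helper that prefers $MPICC when the module system has set it and otherwise falls back to plain mpicc. The function name mpi_c_wrapper is a hypothetical helper introduced for illustration. Whether $MPICC is defined depends on your site.

```shell
# Print the C wrapper to use: $MPICC if set and non-empty, otherwise mpicc.
# mpi_c_wrapper is a hypothetical helper name, not provided by any MPI module.
mpi_c_wrapper() {
    printf '%s\n' "${MPICC:-mpicc}"
}

# Show the compile command that would be used in the current environment.
echo "$(mpi_c_wrapper) ./myprog.c -o myprog"
```

In a build script this keeps the compile line identical across Open MPI and Intel MPI modules on systems that define the variable.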

Intel MPI

Intel MPI is shipped with two sets of compiler wrappers (one for GCC and one for the Intel compilers), and it is important to use the right set. Especially when using software build tools like CMake, double-check which compiler wrappers are actually being used. Otherwise the wrong compilers will be called internally.

Use the command for your compiler and language to compile your program (replace <src_file> with the path to your source file, e.g. ./myprog.c).

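{| class="wikitable" style="width: 70%;"
| Compiler Driver || C || C++ || Fortran
|-
| GCC || <code>$ mpicc <src_file> -o <name></code> || <code>$ mpicxx <src_file> -o <name></code> || <code>$ mpifort <src_file> -o <name></code>
|-
| Intel || <code>$ mpiicc <src_file> -o <name></code> || <code>$ mpiicpc <src_file> -o <name></code> || <code>$ mpiifort <src_file> -o <name></code>
|}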
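The wrapper choice in the table can be sketched as a lookup by compiler family and language. The function name intel_mpi_wrapper and the keys gcc, intel, c, cxx and fortran are hypothetical names chosen for this example. The GCC C++ entry assumes the wrapper is called mpicxx. The other entries are taken from the table.

```shell
# Map (compiler family, language) to an Intel MPI compiler wrapper name.
# intel_mpi_wrapper is a hypothetical helper; the keys are illustrative.
intel_mpi_wrapper() {
    case "$1:$2" in
        gcc:c)         echo mpicc ;;
        gcc:cxx)       echo mpicxx ;;    # assumption: GCC C++ wrapper is mpicxx
        gcc:fortran)   echo mpifort ;;
        intel:c)       echo mpiicc ;;
        intel:cxx)     echo mpiicpc ;;
        intel:fortran) echo mpiifort ;;
        *) echo "unknown combination: $1 $2" >&2; return 1 ;;
    esac
}

# Example: print the compile command for C code with the Intel compilers.
echo "$(intel_mpi_wrapper intel c) ./myprog.c -o myprog"
```

One possible use is to set the CC, CXX and FC environment variables from this lookup before running a build tool, so that the intended wrappers are picked up explicitly.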

You can also type the command $ mpicc [options] <src_file> -o <name> etc., where [options] can be replaced with one or more of the options listed below. Intel MPI also comes with more advanced compiler options, which are mainly aimed at optimizing and analyzing your code with the help of Intel tools.

{| class="wikitable" style="width: 70%;"
| Options || Function
|-
| -g || enable debugging information
|-
| -OX || enable compiler optimization, where <code>X</code> represents the optimization level and is one of 0, 1, 2, 3
|-
| -v || print the compiler version
|}

Instead of typing the compiler wrapper mpicc, mpicxx or mpifort explicitly, on most systems (e.g. the RWTH Compute Cluster) you can use predefined environment variables to call the MPI compiler in a more general manner. Simply use $MPICC, $MPICXX or $MPIFC for the compiler you want, and let the module system handle the details of selecting the appropriate command.

How to Run an MPI Executable

Ensure that the correct MPI module is loaded (see the Modules page for how to load or switch modules). Once again, the command-line options differ slightly between Intel MPI and Open MPI. To start any MPI program, type the following command, where <executable> is the path to your application:

$ mpirun -n <num_procs> [options] <executable>

Note that mpiexec and mpirun are synonymous in Open MPI. In Intel MPI, the corresponding commands are mpiexec.hydra and mpirun.

Don’t forget to pass the -np or -n option as explained below. All other options listed below are optional.
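A minimal sketch of assembling the launch command, assuming a hypothetical helper named mpirun_cmd that only prints the command line instead of running it. It rejects a missing or non-numeric process count, since the process count is the one option that should always be given.

```shell
# Print an mpirun command line for a given process count and executable.
# mpirun_cmd is a hypothetical helper; it does not start any processes.
mpirun_cmd() {
    nprocs=$1
    shift
    case "$nprocs" in
        ''|*[!0-9]*|0)
            echo "process count must be a positive integer" >&2
            return 1 ;;
    esac
    echo "mpirun -n $nprocs $*"
}

# Example: launch line for 4 processes of ./myprog.
mpirun_cmd 4 ./myprog
```

Printing the command first makes it easy to check the arguments before submitting a job.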

Open MPI

{| class="wikitable" style="width: 60%;"
| Option || Function
|-
| -np <num_procs> or -n <num_procs> || number of processes to run
|-
| -npersocket <num_procs> || number of processes per socket
|-
| -npernode <num_procs> || number of processes per node
|-
| -wdir <directory> || change to directory specified before executing the program
|-
| -path <path> || look for executables in the directory specified
|-
| -q or -quiet || suppress helpful messages
|-
| -output-filename <name> || redirect output into the file <name>.<rank>
|-
| -x <env_variable> || export the specified environment variable to the remote nodes where the program will be executed
|-
| --help || list all options available with an explanation
|}
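As an illustration of combining several of these options, the sketch below builds an Open MPI launch line for a fixed number of processes per node. The node count, the per-node count, the executable ./myprog and the exported variable OMP_NUM_THREADS are assumptions chosen for the example. The command is only printed, not executed.

```shell
# Hypothetical job layout: 2 nodes with 4 processes each.
nodes=2
per_node=4
total=$((nodes * per_node))

# -np sets the total process count, -npernode the count per node,
# and -x exports an environment variable to the remote nodes.
openmpi_cmd="mpirun -np $total -npernode $per_node -x OMP_NUM_THREADS ./myprog"
echo "$openmpi_cmd"
```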

Intel MPI

{| class="wikitable" style="width: 60%;"
| Option || Function
|-
| -n <num_procs> || number of processes to run
|-
| -ppn <num_procs> || number of processes per node; for that to work, it may be necessary to set the environment variable <code>I_MPI_JOB_RESPECT_PROCESS_PLACEMENT=off</code>
|-
| -wdir <directory> || change to directory specified before executing the program
|-
| -path <path> || look for executables in the directory specified
|-
| -outfile-pattern <name> || redirect stdout to file
|-
| --help || list all options available with an explanation
|}
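The equivalent Intel MPI launch line can be sketched the same way. The node count, the per-node count and the executable ./myprog are again assumptions for the example. The environment variable is set on the command line because, as noted in the table, -ppn may otherwise be ignored. The command is only printed, not executed.

```shell
# Hypothetical job layout: 2 nodes with 4 processes each.
nodes=2
per_node=4
total=$((nodes * per_node))

# -n sets the total process count and -ppn the count per node.
intelmpi_cmd="I_MPI_JOB_RESPECT_PROCESS_PLACEMENT=off mpirun -n $total -ppn $per_node ./myprog"
echo "$intelmpi_cmd"
```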

Process Binding

Binding processes means telling your system how to place the processes onto the hardware architecture. This is done by adding command-line options when calling mpiexec and/or by setting environment variables (both vendor-specific), and it may improve (or ruin) the performance of your application. To learn more, see the Binding/Pinning page, which has separate sections on the options for binding in Open MPI and in Intel MPI.

In most MPI implementations, process binding is on by default nowadays. Every MPI implementation assumes that only one MPI job is running on the hardware at any time. If you start more than one MPI job on the same node (e.g. in a non-exclusive batch job under a batch scheduler without binding management, or in interactive tests), each of the N jobs believes it is alone and binds itself to the very same cores. The result is roughly 1/N of the performance while the remaining cores stay idle.
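The effect can be illustrated with a small calculation. The numbers are hypothetical, and the sketch assumes that every job binds one process per core starting from the same first core.

```shell
# Hypothetical node: 48 cores, 3 concurrent MPI jobs with 16 processes each.
cores=48
jobs=3
procs_per_job=16

# All jobs bind to the same first 16 cores, so only those cores are busy.
busy=$procs_per_job
idle=$((cores - busy))

# Each busy core is shared by one process from every job.
echo "processes per busy core: $jobs, idle cores: $idle"
```

With these numbers, each process gets about one third of a core although 32 cores are unused.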

References

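Intel MPI compiler options: https://software.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-linux/top/command-reference/compilation-commands.html

Manual pages for the mpiexec/mpirun command: Open MPI (https://www.open-mpi.org/doc/v4.0/man1/mpiexec.1.php) and Intel MPI (https://software.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-linux/top/command-reference/mpirun.html).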