How to Use MPI

From HPC Wiki

Latest revision as of 09:13, 12 June 2020

Basics

This page gives a general overview of how to compile and execute a program that has been parallelized with MPI. Many of the options listed below are the same for both Open MPI and Intel MPI; however, check the respective documentation to confirm that they really behave the same way.



How to Compile MPI Code

Before continuing, please make sure that the openmpi or intelmpi module is loaded (see the documentation of the module system for how to load or switch modules).

There are several so-called MPI "compiler wrappers", e.g. mpicc. These take care of including the correct MPI libraries for the programming language you are using, and they share most command-line options. Depending on whether your code is written in C, C++ or Fortran, follow the instructions in one of the tables below. Make sure to replace the arguments inside <…> with specific values.

Open MPI

Use the following command to compile your program (replace <src_file> with a path to your code, e.g. ./myprog.c).

Language   Command
C          $ mpicc <src_file> -o <name_of_executable>
C++        $ mpicxx <src_file> -o <name_of_executable>
Fortran    $ mpifort <src_file> -o <name_of_executable>

You can also pass options to the wrappers: $ mpicc [options], $ mpicxx [options] or $ mpifort [options]. Open MPI's wrappers come with only a few options of their own; options matter more when running your program. The wrapper options are mainly useful for getting more information about the Open MPI module you are using. Compile options unknown to the MPI compiler wrapper are simply forwarded to the underlying compiler, e.g. icc.

Options            Function
-showme:help       print a short help message about the usage and list all compiler options
-showme:version    show the Open MPI version

Instead of typing the compiler wrapper mpicc, mpicxx or mpifort explicitly, on most systems (e.g. the RWTH Compute Cluster) you can use predefined environment variables to call the MPI compiler in a more general manner. Simply use $MPICC, $MPICXX or $MPIFC for the compiler you want and let the module system select the appropriate command.
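As a minimal sketch of the paragraph above, the following script uses $MPICC when the module system has set it and falls back to plain mpicc otherwise. The file name ./myprog.c and the executable name myprog are hypothetical placeholders, and the script only prints the command unless both the wrapper and the source file are actually present.

```shell
#!/bin/sh
# Use $MPICC if the module system defined it, otherwise fall back to mpicc.
mpi_cc="${MPICC:-mpicc}"
src_file="./myprog.c"   # hypothetical source file
exe_name="myprog"       # hypothetical executable name

compile_cmd="$mpi_cc $src_file -o $exe_name"
echo "$compile_cmd"

# Compile only if the wrapper and the source file exist on this system.
if command -v "$mpi_cc" >/dev/null 2>&1 && [ -f "$src_file" ]; then
    $compile_cmd
fi
```

The same pattern should work with $MPICXX and $MPIFC for C++ and Fortran sources.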

Intel MPI

Intel MPI is shipped with two sets of compiler wrappers (for GCC and for the Intel compilers), and it is important to use the right ones. Especially when using build tools such as CMake, double-check that the right compiler wrappers are actually used; otherwise the wrong compilers will be called internally.
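One way to double-check a wrapper is to ask it to print the command line it would run. The sketch below assumes that the Intel MPI wrappers accept the -show option for this purpose (Open MPI wrappers use -showme instead); please verify this against the documentation of your installed version. Nothing is compiled.

```shell
#!/bin/sh
# Print the underlying compiler command of each wrapper without compiling.
# Assumption: Intel MPI wrappers understand -show (Open MPI uses -showme).
show_flag="-show"
checked=""

for wrapper in mpicc mpiicc; do
    checked="$checked $wrapper"
    if command -v "$wrapper" >/dev/null 2>&1; then
        echo "$wrapper calls: $("$wrapper" "$show_flag")"
    else
        echo "$wrapper: not found (is the intelmpi module loaded?)"
    fi
done
```

With CMake, a common approach is to set the CC, CXX and FC environment variables to the intended wrapper names before the first configure run, and then to confirm the chosen compilers in the configure output.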

Use the following command to compile your program (replace <src_file> with a path to your code, e.g. ./myprog.c).

Compiler Driver   C                                C++                               Fortran
GCC               $ mpicc <src_file> -o <name>     $ mpicxx <src_file> -o <name>     $ mpifort <src_file> -o <name>
Intel             $ mpiicc <src_file> -o <name>    $ mpiicpc <src_file> -o <name>    $ mpiifort <src_file> -o <name>

You can also type the command $ mpicc [options] <src_file> -o <name> etc., where [options] can be replaced with one or more of the options listed below. Intel MPI comes with rather advanced compiler options that are mainly aimed at optimizing and analyzing your code with the help of Intel tools.

Options   Function
-g        enable debugging information
-OX       enable compiler optimization, where X represents the optimization level and is one of 0, 1, 2, 3
-v        print the compiler version
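The options above can be combined as in the following sketch, which prepares a debug build and an optimized build of the same source file. The file and executable names are hypothetical, and the choice of -O0 for debugging and -O2 for the optimized build is just one reasonable assumption. The commands are only executed if the wrapper and the source file exist.

```shell
#!/bin/sh
# Sketch: build a debug and an optimized variant of the same source file.
wrapper="mpiicc"        # Intel compiler driver; use mpicc for the GCC driver
src_file="./myprog.c"   # hypothetical source file

debug_cmd="$wrapper -g -O0 $src_file -o myprog_debug"
opt_cmd="$wrapper -O2 $src_file -o myprog_opt"

for cmd in "$debug_cmd" "$opt_cmd"; do
    echo "$cmd"
    # Compile only if the wrapper and the source file exist on this system.
    if command -v "$wrapper" >/dev/null 2>&1 && [ -f "$src_file" ]; then
        $cmd
    fi
done
```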

Instead of typing the compiler wrapper mpicc, mpicxx or mpifort explicitly, on most systems (e.g. the RWTH Compute Cluster) you can use predefined environment variables to call the MPI compiler in a more general manner. Simply use $MPICC, $MPICXX or $MPIFC for the compiler you want and let the module system select the appropriate command.

How to Run an MPI Executable

Ensure that the correct MPI module is loaded (see the documentation of the module system for how to load or switch modules). Once again, the command-line options differ slightly between Intel MPI and Open MPI. To start any MPI program, type the following command, where <executable> specifies the path to your application:

$ mpirun -n <num_procs> [options] <executable>

Note that mpiexec and mpirun are synonymous in Open MPI; in Intel MPI the corresponding commands are mpiexec.hydra and mpirun.

Don't forget to pass the -np or -n option as explained below. All other options listed below are optional.

Open MPI

Option                               Function
-np <num_procs> or -n <num_procs>    number of processes to run
-npersocket <num_procs>              number of processes per socket
-npernode <num_procs>                number of processes per node
-wdir <directory>                    change to directory specified before executing the program
-path <path>                         look for executables in the directory specified
-q or -quiet                         suppress helpful messages
-output-filename <name>              redirect output into the file <name>.<rank>
-x <env_variable>                    export the specified environment variable to the remote nodes where the program will be executed
--help                               list all options available with an explanation
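As an illustration of how these options fit together, the sketch below assembles an Open MPI launch line for 8 processes with 4 processes per node and forwards one environment variable to the remote nodes. The executable ./myprog, the process counts and the choice of OMP_NUM_THREADS as the forwarded variable are hypothetical. The command is only executed if mpirun and the executable are available.

```shell
#!/bin/sh
# Sketch: launch 8 processes, 4 per node, and forward one environment variable.
num_procs=8
procs_per_node=4
exe="./myprog"             # hypothetical executable
export OMP_NUM_THREADS=1   # example variable forwarded with -x

run_cmd="mpirun -np $num_procs -npernode $procs_per_node -x OMP_NUM_THREADS $exe"
echo "$run_cmd"

# Launch only if mpirun and the executable exist on this system.
if command -v mpirun >/dev/null 2>&1 && [ -x "$exe" ]; then
    $run_cmd
fi
```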

Intel MPI

Option                     Function
-n <num_procs>             number of processes to run
-ppn <num_procs>           number of processes per node; for that to work, it may be necessary to set the environment variable I_MPI_JOB_RESPECT_PROCESS_PLACEMENT=off
-wdir <directory>          change to directory specified before executing the program
-path <path>               look for executables in the directory specified
-outfile-pattern <name>    redirect stdout to file
--help                     list all options available with an explanation
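The following sketch shows one possible Intel MPI launch line that uses -ppn together with the environment variable mentioned in the table. The executable ./myprog and the process counts are hypothetical, and whether the variable is needed at all depends on your batch system. The variable is set through env so that it only applies to this one command.

```shell
#!/bin/sh
# Sketch: launch 8 processes with 4 processes per node under Intel MPI.
num_procs=8
procs_per_node=4
exe="./myprog"   # hypothetical executable

# env sets the variable for this single command only.
run_cmd="env I_MPI_JOB_RESPECT_PROCESS_PLACEMENT=off mpirun -n $num_procs -ppn $procs_per_node $exe"
echo "$run_cmd"

# Launch only if mpirun and the executable exist on this system.
if command -v mpirun >/dev/null 2>&1 && [ -x "$exe" ]; then
    $run_cmd
fi
```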

Process Binding

Binding processes means telling your system how to place the processes onto the architecture. This can be done by adding command-line options when calling mpiexec and/or by setting some environment variables (both vendor-specific), and it may enhance (or kill) the performance of your application. To learn more, see the Binding/Pinning page, which describes the options for binding in Open MPI and in Intel MPI.

Most MPI implementations enable process binding by default nowadays. All of them assume that only one MPI job is running on the hardware at any time. If you start more than one MPI job on the same node (e.g. in a non-exclusive batch job under a batch scheduler without binding management, or in interactive tests), each of the N jobs believes it is alone and binds its processes to the very same cores. Each job then gets only about 1/N of the performance, while the remaining cores stay idle.
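To see where the processes actually end up, it can help to let the launcher report the bindings. The helper below is a hypothetical sketch that only prints a suitable launch line per vendor. It assumes that Open MPI's mpirun accepts --report-bindings and that Intel MPI prints pinning information when I_MPI_DEBUG is set to 4 or higher; both should be checked against the documentation of your version.

```shell
#!/bin/sh
# Print a launch command that also reports where each rank was bound.
# Assumptions: Open MPI's mpirun accepts --report-bindings, and Intel MPI
# prints pinning information when I_MPI_DEBUG is 4 or higher.
binding_report_cmd() {
    vendor="$1"; num_procs="$2"; exe="$3"
    case "$vendor" in
        openmpi)  echo "mpirun -n $num_procs --report-bindings $exe" ;;
        intelmpi) echo "env I_MPI_DEBUG=4 mpirun -n $num_procs $exe" ;;
        *)        echo "unknown MPI vendor: $vendor" >&2; return 1 ;;
    esac
}

binding_report_cmd openmpi 4 ./myprog
binding_report_cmd intelmpi 4 ./myprog
```

Running the printed command for two jobs on the same node and comparing the reported cores is one way to spot the overlap described above.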

References

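Intel MPI compiler options: https://software.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-linux/top/command-reference/compilation-commands.html

Manual pages for the mpiexec/mpirun command: Open MPI (https://www.open-mpi.org/doc/v4.0/man1/mpiexec.1.php) and Intel MPI (https://software.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-linux/top/command-reference/mpirun.html).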