How to Use OpenMP
Latest revision as of 08:17, 4 May 2020
Basics
This page gives a general overview of how to compile and execute a program that has been parallelized with OpenMP. Unlike MPI, OpenMP does not require you to load any modules, but your compiler must support OpenMP (most compilers do).
How to Compile OpenMP Code
Additional vendor-specific (and sometimes version-specific) compiler flags tell the compiler to enable OpenMP. Otherwise, the OpenMP pragmas in the code will be ignored by the compiler.
Depending on which compiler you have loaded, use one of the flags below to compile your code.
Compiler | Flag
GNU | -fopenmp
Intel | -qopenmp
Clang | -fopenmp
Oracle | -xopenmp
NAG Fortran | -openmp
For example, if you plan to use an Intel compiler for your OpenMP code written in C, type the following to create an application called omp_code.exe:
$ icc -qopenmp omp_code.c -o omp_code.exe
How to Run an OpenMP Application
Setting OMP_NUM_THREADS
If you do not set OMP_NUM_THREADS, the default value of your cluster environment is used. In many cases, the default is 1, so your program is executed serially. If the environment variable is not set at all, the OpenMP runtime may also decide to use all cores of your computer, which is not always the desired outcome, so it is a good idea to always set a meaningful value.
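Before starting a run, it can help to check what your current shell would pass on to the program. The lines below are a minimal sketch of such a check; the placeholder text "<not set>" is made up for illustration, and nproc is assumed to be available (it is part of GNU coreutils).

```shell
# Show the value the OpenMP runtime would inherit from the environment.
# ${VAR:-text} expands to "text" when VAR is unset or empty.
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS:-<not set>}"

# Number of processing units available to this process (assumes GNU coreutils).
# If the variable is not set, the runtime may start up to this many threads.
nproc
```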
One way to specify the number of threads is to set the variable on the command line, directly in front of the executable. To start the parallel regions of the example program above with 12 threads, type:
$ OMP_NUM_THREADS=12 ./omp_code.exe
This sets the environment variable OMP_NUM_THREADS to 12 for the execution of omp_code.exe only. After omp_code.exe has finished, the variable has its previous (default) value again.
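The limited scope of such a per-command assignment can be checked with a small sketch like the one below. Here sh -c merely stands in for ./omp_code.exe so that the lines run without the compiled program; the variable names before, during, and after are illustrative.

```shell
# Remember whatever the shell had before (may be unset).
before=${OMP_NUM_THREADS:-unset}

# The assignment prefix applies to this one command only.
# `sh -c` stands in for ./omp_code.exe so the sketch runs anywhere.
during=$(OMP_NUM_THREADS=12 sh -c 'echo "$OMP_NUM_THREADS"')

# Back in the calling shell, the value is unchanged.
after=${OMP_NUM_THREADS:-unset}

echo "before=$before during=$during after=$after"
```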
Another way to set the number of threads is to change the environment variable in your shell. This example sets it to 24 threads and overrides the default value:
$ export OMP_NUM_THREADS=24
If you then simply run your application with ./omp_code.exe, this value will be used automatically.
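In contrast to the per-command form, an exported value stays in effect for every program started from the same shell until you change or remove it. A minimal sketch, again using sh -c as a stand-in for the real executable (the names first and second are illustrative):

```shell
export OMP_NUM_THREADS=24

# Every later command started from this shell inherits the value.
first=$(sh -c 'echo "$OMP_NUM_THREADS"')
second=$(sh -c 'echo "$OMP_NUM_THREADS"')
echo "first=$first second=$second"

# Remove the setting again so later runs no longer inherit it.
unset OMP_NUM_THREADS
```

Keep in mind that an export typed at the prompt only affects the current shell session; a new login or batch job starts from the environment's default again.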
Thread Pinning
The performance of your application may depend on how its threads are distributed across the cores. See the page on thread pinning to learn how to minimize the execution time.