Difference between revisions of "Building LLVM/Clang with OpenMP Offloading to NVIDIA GPUs"

From HPC Wiki

Revision as of 23:28, 27 March 2020

This guide describes how to build the Clang compiler with OpenMP support for offloading computational tasks to NVIDIA GPUs. A working Linux environment with GCC (8.3.0) and CMake (3.15.6) is assumed for the build process. LLVM/Clang 10.0.0 or later is recommended, because our tests (https://github.com/pc2/OMP-Offloading) found bugs affecting OpenMP GPU offloading in earlier versions of LLVM/Clang.

Determine GPU(s) on Compute Node

First, we need to check that the GPU(s) on the compute node are correctly identified by running the command nvidia-smi. As an example, the output below shows two NVIDIA RTX 2080 Ti GPUs on one compute node of the OCuLUS system at the Paderborn Center for Parallel Computing, Paderborn University, Germany.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:03:00.0 Off |                  N/A |
| 31%   35C    P0    64W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:84:00.0 Off |                  N/A |
| 35%   34C    P0    35W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

As can be seen, the NVIDIA driver version is 440.33.01 and the CUDA version is 10.2. We are now ready to build LLVM/Clang with OpenMP support for GPU offloading.
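The build options further down refer to GPU architectures by names such as sm_61 or sm_75, which are derived from the compute capability of the device (7.5 for the RTX 2080 Ti, according to NVIDIA's published tables). The following is a minimal sketch, not part of the original instructions: the helper name cc_to_sm is made up for illustration, and the nvidia-smi query is skipped when the tool is not installed.

```shell
#!/bin/sh
# Convert a compute capability such as "7.5" into the "sm_75" form
# used by Clang's NVPTX options. (cc_to_sm is a hypothetical helper.)
cc_to_sm() {
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# List each GPU's name and driver version in compact CSV form,
# if nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
else
    echo "nvidia-smi not found; is the NVIDIA driver installed?" >&2
fi

cc_to_sm 7.5   # prints sm_75
```

The resulting string is the kind of value expected by options such as CLANG_OPENMP_NVPTX_DEFAULT_ARCH, while LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES takes the bare numbers (for example 75).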

Download LLVM/Clang (10.0.0 or later)

LLVM/Clang (10.0.0) can be obtained by running:

curl -Ls https://github.com/llvm/llvm-project/archive/llvmorg-10.0.0.tar.gz | tar zxf -

Alternatively, the latest development version can be cloned from GitHub:

git clone https://github.com/llvm/llvm-project.git
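To make the two download variants easier to adapt to other releases, they can be parameterised by the version number. This is only a sketch: the helper names are hypothetical, and the directory name assumes GitHub's usual convention of unpacking an archive into repository-name plus tag. The network commands are left as comments so that the script itself has no side effects.

```shell
#!/bin/sh
# Build the release tarball URL for a given LLVM version.
llvm_tarball_url() {
    printf 'https://github.com/llvm/llvm-project/archive/llvmorg-%s.tar.gz\n' "$1"
}

# Directory the tarball is assumed to unpack into (GitHub naming convention).
llvm_source_dir() {
    printf 'llvm-project-llvmorg-%s\n' "$1"
}

version=10.0.0
echo "URL:        $(llvm_tarball_url "$version")"
echo "Source dir: $(llvm_source_dir "$version")"

# To actually download and unpack (requires network access):
#   curl -Ls "$(llvm_tarball_url "$version")" | tar zxf -
# Or fetch only the release tag with git instead of the full history:
#   git clone --depth 1 --branch "llvmorg-$version" https://github.com/llvm/llvm-project.git
```

A shallow clone of a single tag is considerably smaller than the full repository history, which may matter on clusters with tight home-directory quotas.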

Build the Compiler

Supporting OpenMP GPU offloading requires two build steps: first compile LLVM/Clang with GCC, then bootstrap LLVM/Clang, that is, rebuild it with the Clang produced in the first step.

Build LLVM/Clang with GCC

The following commands compile and install Clang together with the necessary libraries. Here the-llvm-project-directory stands for the path to the downloaded sources. See https://llvm.org/docs/ for an explanation of the CMake options.

cmake                                                                          \
  -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;libcxx;libcxxabi;lld;openmp" \
  -DCMAKE_BUILD_TYPE=Release                                                   \
  -DLLVM_TARGETS_TO_BUILD="X86;NVPTX"                                          \
  -DLLVM_ENABLE_ASSERTIONS=ON                                                  \
  -DLLVM_ENABLE_BACKTRACES=ON                                                  \
  -DLLVM_ENABLE_WERROR=OFF                                                     \
  -DBUILD_SHARED_LIBS=OFF                                                      \
  -DLLVM_ENABLE_RTTI=ON                                                        \
  -DCLANG_OPENMP_NVPTX_DEFAULT_ARCH=sm_61                                      \
  -DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=35,37,50,52,60,61,70,75            \
  -DCMAKE_C_COMPILER=gcc                                                       \
  -DCMAKE_CXX_COMPILER=g++                                                     \
  -G "Unix Makefiles" the-llvm-project-directory/llvm
make -j 64
make install
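The commands above do not set an install location, so make install uses CMake's default prefix (/usr/local on Linux), which usually requires root permissions. On a shared cluster it may be more convenient to build in a separate directory and install into a user-writable prefix. The sketch below only prints the commands (a dry run); all three paths are hypothetical placeholders, and the job count is taken from the number of available cores instead of the fixed 64.

```shell
#!/bin/sh
# Hypothetical locations; adjust to your system.
SRC_DIR="${SRC_DIR:-$HOME/llvm-project-llvmorg-10.0.0}"
PREFIX="${PREFIX:-$HOME/opt/llvm-10.0.0}"
BUILD_DIR="${BUILD_DIR:-$HOME/llvm-build-stage1}"

# Number of parallel make jobs: one per available core, falling back to 1.
build_jobs() {
    n=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    echo "$n"
}

# Print the commands instead of running them (dry run).
cat <<EOF
mkdir -p "$BUILD_DIR" && cd "$BUILD_DIR"
cmake -DCMAKE_INSTALL_PREFIX="$PREFIX" <options from above> -G "Unix Makefiles" "$SRC_DIR/llvm"
make -j $(build_jobs)
make install
EOF
```

CMAKE_INSTALL_PREFIX is a standard CMake variable; everything else in the printed cmake line is meant to be filled in with the options listed in the command above.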

Bootstrap LLVM/Clang

The following commands bootstrap Clang, that is, rebuild it with the Clang compiler installed in the previous step. Note that GNU's libstdc++ (instead of LLVM's libc++) is used during linking.

cmake                                                                          \
  -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;libcxx;libcxxabi;lld;openmp" \
  -DCMAKE_BUILD_TYPE=Release                                                   \
  -DLLVM_TARGETS_TO_BUILD="X86;NVPTX"                                          \
  -DLLVM_ENABLE_ASSERTIONS=ON                                                  \
  -DLLVM_ENABLE_BACKTRACES=ON                                                  \
  -DLLVM_ENABLE_WERROR=OFF                                                     \
  -DBUILD_SHARED_LIBS=OFF                                                      \
  -DLLVM_ENABLE_RTTI=ON                                                        \
  -DCLANG_OPENMP_NVPTX_DEFAULT_ARCH=sm_61                                      \
  -DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=35,37,50,52,60,61,70,75            \
  -DCMAKE_C_COMPILER=clang                                                     \
  -DCMAKE_CXX_COMPILER=clang++                                                 \
  -G "Unix Makefiles" the-llvm-project-directory/llvm
make -j 64
make install
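For the second cmake run to pick up the freshly installed compiler, the names clang and clang++ must resolve to the GCC-built Clang. If it was installed into a non-default prefix, the environment probably has to be extended first. A minimal sketch, assuming a hypothetical prefix and a made-up helper prepend_path:

```shell
#!/bin/sh
# Hypothetical install prefix of the GCC-built (stage 1) Clang.
PREFIX="${PREFIX:-$HOME/opt/llvm-10.0.0}"

# Put a directory at the front of a colon-separated list,
# without leaving a trailing colon when the list is empty.
prepend_path() {
    printf '%s\n' "$1${2:+:$2}"
}

PATH=$(prepend_path "$PREFIX/bin" "$PATH")
LD_LIBRARY_PATH=$(prepend_path "$PREFIX/lib" "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH

# Confirm which clang the second cmake run will pick up.
command -v clang || echo "clang not found in PATH" >&2
```

Extending LD_LIBRARY_PATH is an assumption about a typical setup: it lets programs built with the new compiler find the OpenMP runtime libraries installed under the same prefix.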

Done

The Clang compiler is now installed with OpenMP GPU-offloading support. Code samples for OpenMP GPU offloading and more information can be found at https://github.com/pc2/OMP-Offloading.
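A quick way to check the installation is to compile a tiny program with a target region and see where it runs. The script below is a sketch rather than part of the original guide: the file name is arbitrary, sm_75 is chosen to match the RTX 2080 Ti shown earlier, and the compile step is skipped when clang is not in the PATH.

```shell
#!/bin/sh
# Write a tiny OpenMP offloading test program to a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/offload_test.c" <<'EOF'
#include <stdio.h>
#include <omp.h>

int main(void) {
    int on_host = 1;
    /* Run one statement on the default device and copy the result back. */
    #pragma omp target map(from: on_host)
    { on_host = omp_is_initial_device(); }
    printf("target region ran on the %s\n", on_host ? "host" : "GPU");
    return 0;
}
EOF

# Compile and run only if clang is available.
if command -v clang >/dev/null 2>&1; then
    { clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \
            -Xopenmp-target -march=sm_75 \
            "$workdir/offload_test.c" -o "$workdir/offload_test" &&
      "$workdir/offload_test"; } || echo "compile or run failed" >&2
else
    echo "clang not found; wrote $workdir/offload_test.c only" >&2
fi
```

If offloading does not work, the OpenMP runtime normally falls back to executing the target region on the host, in which case the program reports "host".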