Building LLVM/Clang with OpenMP Offloading to NVIDIA GPUs

This guide describes how to build the Clang compiler with OpenMP support for offloading computational tasks to NVIDIA GPUs. The build assumes a working Linux environment with GCC (8.3.0) and CMake (3.15.6). LLVM/Clang 10.0.0 or later is recommended, because our tests found bugs relevant to OpenMP GPU-offloading in earlier versions.

Determine GPU(s) on Compute Node

First, check that the command nvidia-smi correctly identifies the GPU(s) on the compute node. As an example, the output below shows two NVIDIA RTX 2080 Ti GPUs on one compute node of the OCuLUS system at the Paderborn Center for Parallel Computing, Paderborn University, Germany.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:03:00.0 Off |                  N/A |
| 31%   35C    P0    64W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:84:00.0 Off |                  N/A |
| 35%   34C    P0    35W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

As can be seen, the NVIDIA driver version is 440.33.01 and the CUDA version is 10.2. With the GPUs correctly identified, we are ready to build LLVM/Clang with OpenMP support for GPU-offloading.
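For scripted checks, nvidia-smi also offers a query mode that prints selected fields as CSV. The following is a minimal sketch, not part of the original setup: the function name list_gpus is our own, and it assumes the installed driver supports the index, name, and driver_version query fields.

```shell
#!/bin/sh
# list_gpus (hypothetical helper): print one CSV line per GPU
# with its index, name, and driver version.
# Returns 1 if nvidia-smi is not on PATH, which usually means
# the NVIDIA driver is not installed on this node.
list_gpus() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; is the NVIDIA driver installed?" >&2
        return 1
    fi
    nvidia-smi --query-gpu=index,name,driver_version --format=csv,noheader
}

list_gpus || echo "No GPU information available on this node."
```

On a node like the one shown above, this would be expected to print one line per GPU; on a node without the driver it only reports that no GPU information is available.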

Download LLVM/Clang (10.0.0 or later)

LLVM/Clang (10.0.0) can be obtained by running:

curl -Ls https://github.com/llvm/llvm-project/archive/llvmorg-10.0.0.tar.gz | tar zxf -

Alternatively, the latest source code can be cloned from GitHub by running:

git clone https://github.com/llvm/llvm-project.git
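If the full history is not needed, a shallow clone of a single release tag is a possible middle ground between the tarball and the full clone. This is a sketch under assumptions: the helper name fetch_llvm and the default target directory are hypothetical, and the tag name is taken from the archive URL above.

```shell
#!/bin/sh
# Release tag to fetch; the name matches the tarball URL used above.
LLVM_TAG="llvmorg-10.0.0"
LLVM_REPO="https://github.com/llvm/llvm-project.git"

# fetch_llvm (hypothetical helper): shallow-clone only the chosen tag
# into the directory given as $1 (default: llvm-project).
fetch_llvm() {
    git clone --depth 1 --branch "$LLVM_TAG" "$LLVM_REPO" "${1:-llvm-project}"
}
```

Calling fetch_llvm with no argument would place the sources in ./llvm-project.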

Build the Compiler

Supporting OpenMP GPU-offloading requires two build steps: first compile LLVM/Clang with GCC, then bootstrap LLVM/Clang by rebuilding it with the newly built Clang.

Build LLVM/Clang with GCC

The following commands compile and install the Clang compiler, as well as some other libraries. Here, the-llvm-project-directory stands for the path to the downloaded source tree. See https://llvm.org/docs/ for an explanation of the CMake options.

cmake                                                                          \
  -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;libcxx;libcxxabi;lld;openmp" \
  -DCMAKE_BUILD_TYPE=Release                                                   \
  -DLLVM_TARGETS_TO_BUILD="X86;NVPTX"                                          \
  -DLLVM_ENABLE_ASSERTIONS=ON                                                  \
  -DLLVM_ENABLE_BACKTRACES=ON                                                  \
  -DLLVM_ENABLE_WERROR=OFF                                                     \
  -DBUILD_SHARED_LIBS=OFF                                                      \
  -DLLVM_ENABLE_RTTI=ON                                                        \
  -DCLANG_OPENMP_NVPTX_DEFAULT_ARCH=sm_61                                      \
  -DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=35,37,50,52,60,61,70,75            \
  -DCMAKE_C_COMPILER=gcc                                                       \
  -DCMAKE_CXX_COMPILER=g++                                                     \
  -G "Unix Makefiles" the-llvm-project-directory/llvm
make -j 16
make install
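The bootstrap step passes clang and clang++ to CMake by name, so the compiler installed in the first step has to come first on PATH. The following is a minimal sketch, assuming the first step was installed under the hypothetical prefix $HOME/opt/llvm (for example via -DCMAKE_INSTALL_PREFIX; the commands above do not set it, so CMake's default prefix, /usr/local on Unix-like systems, applies there).

```shell
#!/bin/sh
# LLVM_PREFIX is a hypothetical install location; adjust it to where
# `make install` actually placed the first-stage build.
LLVM_PREFIX="${LLVM_PREFIX:-$HOME/opt/llvm}"

# Put the first-stage compiler and its runtime libraries first
# in the executable and shared-library search paths.
export PATH="$LLVM_PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$LLVM_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Report which clang CMake would pick up by name.
if command -v clang >/dev/null 2>&1; then
    echo "Using clang at: $(command -v clang)"
    clang --version | head -n 1
else
    echo "clang not found on PATH; check LLVM_PREFIX ($LLVM_PREFIX)" >&2
fi
```

If the reported path is not the freshly installed compiler, the second build would silently use a different Clang.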

Bootstrap LLVM/Clang

The following commands bootstrap Clang, that is, rebuild it with the Clang installed in the previous step. Note that GNU's libstdc++ (instead of LLVM's libc++) is used during linking.

cmake                                                                          \
  -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;libcxx;libcxxabi;lld;openmp" \
  -DCMAKE_BUILD_TYPE=Release                                                   \
  -DLLVM_TARGETS_TO_BUILD="X86;NVPTX"                                          \
  -DLLVM_ENABLE_ASSERTIONS=ON                                                  \
  -DLLVM_ENABLE_BACKTRACES=ON                                                  \
  -DLLVM_ENABLE_WERROR=OFF                                                     \
  -DBUILD_SHARED_LIBS=OFF                                                      \
  -DLLVM_ENABLE_RTTI=ON                                                        \
  -DCLANG_OPENMP_NVPTX_DEFAULT_ARCH=sm_61                                      \
  -DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=35,37,50,52,60,61,70,75            \
  -DCMAKE_C_COMPILER=clang                                                     \
  -DCMAKE_CXX_COMPILER=clang++                                                 \
  -G "Unix Makefiles" the-llvm-project-directory/llvm
make -j 16
make install

Done

The Clang compiler is now installed with OpenMP GPU-offloading support. Code samples for OpenMP GPU-offloading and more information can be found at https://github.com/pc2/OMP-Offloading.
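A quick smoke test can confirm that offloading works. The sketch below is our own illustration rather than one of the samples from that repository: it writes a small C program (file name offload_check.c is hypothetical), compiles it with Clang's OpenMP offloading flags, and reports whether the target region ran on the host or on a device. The architecture sm_75 is an assumption that matches the RTX 2080 Ti shown earlier (compute capability 7.5); use the value for your own GPU.

```shell
#!/bin/sh
# Write a tiny OpenMP program with a single target region.
cat > offload_check.c <<'EOF'
#include <stdio.h>
#include <omp.h>

int main(void) {
    int on_host = 1;
    /* Run a trivial target region and copy the flag back to the host. */
    #pragma omp target map(from: on_host)
    {
        on_host = omp_is_initial_device();
    }
    printf("devices: %d, target region ran on %s\n",
           omp_get_num_devices(), on_host ? "host" : "device");
    return 0;
}
EOF

# Compile for NVIDIA offloading; sm_75 is an assumption (RTX 2080 Ti).
if command -v clang >/dev/null 2>&1 &&
   clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \
         -Xopenmp-target -march=sm_75 \
         offload_check.c -o offload_check 2>offload_check.log; then
    ./offload_check || echo "offload_check failed at run time" >&2
else
    echo "Could not build offload_check; see offload_check.log if present." >&2
fi
```

If the program reports that the target region ran on the host, the runtime fell back to host execution, which suggests checking the driver, the CUDA installation, and the library search path.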