Building LLVM/Clang with OpenMP Offloading to NVIDIA GPUs
This guide describes how to build the Clang compiler with OpenMP support for offloading computational tasks to NVIDIA GPUs. A working Linux environment with GCC (8.3.0) and CMake (3.15.6) is assumed for the build process. LLVM/Clang 10.0.0 or later is recommended, because in our tests earlier versions showed bugs relevant to OpenMP GPU offloading.
Determine GPU(s) on Compute Node
First, check that the GPU(s) on the compute node are correctly detected by running the command nvidia-smi. As an example, the output below shows two NVIDIA RTX 2080 Ti GPUs on one compute node of the OCuLUS system at the Paderborn Center for Parallel Computing, Paderborn University, Germany.
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:03:00.0 Off |                  N/A |
| 31%   35C    P0    64W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:84:00.0 Off |                  N/A |
| 35%   34C    P0    35W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
As can be seen, the NVIDIA driver version is 440.33.01 and the CUDA version is 10.2. We are now ready to build LLVM/Clang with OpenMP support for GPU offloading.
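When this check is part of a job script, the two versions can be extracted from the banner line instead of being read by eye. The following is a minimal sketch, assuming the banner keeps the "Driver Version: ... CUDA Version: ..." layout shown above; the function name parse_smi_versions is hypothetical and not part of any NVIDIA tool.

```shell
# Hypothetical helper: read nvidia-smi output on stdin and print the
# driver and CUDA versions found in the first banner line that has both.
parse_smi_versions() {
  sed -n 's/.*Driver Version: *\([0-9.]*\).*CUDA Version: *\([0-9.]*\).*/driver=\1 cuda=\2/p' |
    head -n 1
}
```

It would be used as: nvidia-smi | parse_smi_versions. For the output above this prints driver=440.33.01 cuda=10.2; if no banner line is found, nothing is printed.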
Download LLVM/Clang (10.0.0 or later)
LLVM/Clang (10.0.0) can be obtained by running:
curl -Ls https://github.com/llvm/llvm-project/archive/llvmorg-10.0.0.tar.gz | tar zxf -
Alternatively, the latest source code on GitHub can be cloned by running:
git clone https://github.com/llvm/llvm-project.git
Build the Compiler
To support OpenMP GPU offloading, two build steps are required: first compile LLVM/Clang with GCC, then rebuild (bootstrap) LLVM/Clang with the newly built Clang.
Build LLVM/Clang with GCC
The following commands compile and install the Clang compiler, as well as some other libraries. See https://llvm.org/docs/ for an explanation of the CMake options.
cmake \
  -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;libcxx;libcxxabi;lld;openmp" \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" \
  -DCLANG_OPENMP_NVPTX_DEFAULT_ARCH=sm_61 \
  -DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=35,37,50,52,60,61,70,75 \
  -DCMAKE_C_COMPILER=gcc \
  -DCMAKE_CXX_COMPILER=g++ \
  -G "Unix Makefiles" the-llvm-project-directory/llvm
make -j 16
make install
Bootstrap LLVM/Clang
The following commands bootstrap Clang, that is, rebuild it using the Clang installed in the previous step. Note that GNU's libstdc++ (instead of LLVM's libc++) is used during linking.
cmake \
  -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;libcxx;libcxxabi;lld;openmp" \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" \
  -DCLANG_OPENMP_NVPTX_DEFAULT_ARCH=sm_61 \
  -DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=35,37,50,52,60,61,70,75 \
  -DCMAKE_C_COMPILER=clang \
  -DCMAKE_CXX_COMPILER=clang++ \
  -G "Unix Makefiles" the-llvm-project-directory/llvm
make -j 16
make install
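The two build steps differ only in the compilers passed to CMake, so they can be wrapped in a single function. The following is a sketch under stated assumptions, not the procedure used by the authors: the names run, build_stage, LLVM_SRC, PREFIX and DRY_RUN are hypothetical, the separate build directories and the CMAKE_INSTALL_PREFIX setting are additions that the commands above do not use, and the -S/-B options require CMake 3.13 or later.

```shell
# Assumed locations; adjust to your system.
LLVM_SRC="${LLVM_SRC:-the-llvm-project-directory}"
PREFIX="${PREFIX:-$HOME/opt/llvm}"   # assumption: a user-writable install prefix

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Configure, build and install one stage.
# $1 = build directory, $2 = C compiler, $3 = C++ compiler
build_stage() {
  build_dir=$1 cc=$2 cxx=$3
  run cmake -S "$LLVM_SRC/llvm" -B "$build_dir" -G "Unix Makefiles" \
    -DCMAKE_INSTALL_PREFIX="$PREFIX" \
    -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;libcxx;libcxxabi;lld;openmp" \
    -DCMAKE_BUILD_TYPE=Release \
    -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" \
    -DCLANG_OPENMP_NVPTX_DEFAULT_ARCH=sm_61 \
    -DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=35,37,50,52,60,61,70,75 \
    -DCMAKE_C_COMPILER="$cc" \
    -DCMAKE_CXX_COMPILER="$cxx" &&
  run make -C "$build_dir" -j 16 &&
  run make -C "$build_dir" install
}

# Example use (not executed here):
#   build_stage build-gcc gcc g++ &&
#   export PATH="$PREFIX/bin:$PATH" &&
#   build_stage build-clang clang clang++
```

Setting DRY_RUN=1 before calling build_stage prints the three commands of a stage without running them, which is a cheap way to review the options before starting a long build.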
Done
The Clang compiler with OpenMP GPU offloading support is now installed. Code samples for OpenMP GPU offloading and more information can be found at https://github.com/pc2/OMP-Offloading.
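A quick way to check the installation is to compile a tiny program with one target region and see where it ran. The following is a hedged sketch rather than part of the original procedure: the function and file names are hypothetical, and it assumes the newly installed clang is first in PATH. Because the build above sets sm_61 as the default architecture while the RTX 2080 Ti has compute capability 7.5, the architecture is passed explicitly.

```shell
# Hypothetical smoke test; file and function names are illustrative.

# Write a minimal OpenMP offloading program to the path given in $1.
write_smoke_test() {
  cat > "$1" <<'EOF'
#include <omp.h>
#include <stdio.h>

int main(void) {
  int on_host = 1;
  /* Run one region on the default device and record where it executed. */
  #pragma omp target map(from: on_host)
  { on_host = omp_is_initial_device(); }
  printf("devices: %d, target region ran on the %s\n",
         omp_get_num_devices(), on_host ? "host" : "GPU");
  return 0;
}
EOF
}

# Compile with offloading enabled and run the result.
# $1 = C source file, $2 = GPU architecture, e.g. sm_75
build_and_run_smoke_test() {
  exe=${1%.c}
  case $exe in */*) ;; *) exe=./$exe ;; esac   # make the binary path runnable
  clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \
    -Xopenmp-target -march="$2" "$1" -o "$exe" &&
  "$exe"
}
```

It would be used as: write_smoke_test smoke.c && build_and_run_smoke_test smoke.c sm_75. If the program reports that the region ran on the host, offloading fell back to the CPU. Depending on the install location, the directory containing the OpenMP runtime libraries may need to be added to LD_LIBRARY_PATH before running.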