Building LLVM/Clang with OpenMP Offloading to NEC SX-Aurora VE TSUBASA

From HPC Wiki

The current LLVM OpenMP runtime contains code for OpenMP offloading to NEC's SX-Aurora Vector Engine (VE). The RWTH Aachen HPC Group provides a patched LLVM source tree to build OpenMP programs that use the LLVM OpenMP runtime to offload to VE devices, using NEC's ncc as the backend compiler. These instructions guide you through building the Clang compiler from this patched source tree on Linux.

Alternatively, NEC provides releases of an offloading enabled Clang with its llvm-ve-rv package.


Building LLVM requires some software

  • First you'll need some standard tools like make and git
  • For the build process a compiler already needs to be installed. Most Linux systems default to the GNU Compiler Collection (gcc). Please ensure that you have at least version 5.1 or refer to some online tutorials on how to install one for your system. If you happen to have an older installation of Clang, any version greater than version 3.5 should be fine.
  • Additionally LLVM requires a (more or less) recent CMake, at least version 3.13.4. If your distribution doesn't provide an adequate version, see on how to get it.
  • For the libomptarget plugin for VE, the system needs libelf and its development headers

Additionally, your system needs NEC's VEOS and the NEC SDK (including NEC's C/C++ compiler, at least version 3.0.1 and libveo/aveo, at least version 0.9.8). See on how to install them.
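The version requirements listed above can be checked with a small script. The following is a minimal sketch, not part of the original instructions: the helper names `version_ge` and `report` are hypothetical, and the comparison assumes GNU `sort -V` is available, as it is on most Linux systems.

```shell
#!/bin/sh
# version_ge A B: succeeds if version string A is at least B.
# Hypothetical helper; relies on GNU sort's -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# report NAME FOUND MINIMUM: print whether the installed version
# (FOUND, possibly empty) meets the documented minimum.
report() {
    name=$1
    found=$2
    minimum=$3
    if [ -z "$found" ]; then
        echo "$name: not found"
    elif version_ge "$found" "$minimum"; then
        echo "$name $found: ok (>= $minimum)"
    else
        echo "$name $found: too old (need >= $minimum)"
    fi
}

# Query the installed tools; the variables stay empty if a tool is missing.
gcc_version=$(gcc -dumpversion 2>/dev/null)
cmake_version=$(cmake --version 2>/dev/null | head -n 1 | awk '{print $3}')

report gcc "$gcc_version" 5.1
report cmake "$cmake_version" 3.13.4
```

The script only reports; it does not check VEOS, the NEC SDK, or libelf, which have to be verified separately.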

Obtain Sources

All necessary components for compiling and offloading are contained in the llvm monorepo fork. To check out the latest version, simply use

 $ git clone
 $ git checkout aurora-offloading-prototype
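Note that the checkout has to run inside the cloned directory. The two steps can be combined in a small helper, sketched here under assumptions: the function name `fetch_sources` is hypothetical, and the repository URL and the target directory are left as parameters because they are not given above.

```shell
#!/bin/sh
# fetch_sources REPO_URL DEST: clone the fork into DEST and switch to
# the offloading branch named in the instructions.
# Hypothetical helper; REPO_URL must be supplied by the caller.
fetch_sources() {
    repo_url=$1
    dest=$2
    git clone "$repo_url" "$dest" &&
        git -C "$dest" checkout aurora-offloading-prototype
}

# Example call (both arguments are placeholders):
#   fetch_sources "$REPO_URL" llvm-project
```

Using `git -C` avoids changing the current directory of the calling shell; a plain `cd` into the checkout followed by `git checkout` would work just as well.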

Build the Compiler

Once you have obtained the sources, you can proceed to configure and build the compiler. Projects using CMake are usually built in a separate directory:

 $ mkdir build
 $ cd build

The next steps will be pretty IO-intensive, so it might be a good idea to put the build directory on a locally attached disk (or even an SSD).

Next, CMake needs to generate the Makefiles that will be used for compilation:

cmake -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_INSTALL_PREFIX=$(pwd)/../install \
    -DLLVM_ENABLE_PROJECTS="clang;openmp" \

Of course, you can use any other generator that CMake supports.

The first two flags are standard for CMake projects: CMAKE_BUILD_TYPE=Release turns on optimizations and disables debug information. CMAKE_INSTALL_PREFIX specifies where the final binaries and libraries will be installed. Be sure to choose a permanent location if you are building in a temporary directory.
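For reference, a complete configure step might look like the following sketch. It rests on assumptions that are not stated above: the source path `../llvm` is only correct if the build directory was created in the root of the monorepo checkout (where the top-level CMake project lives in the `llvm` subdirectory), and the "Unix Makefiles" generator is chosen because the text mentions Makefiles. The function name `configure_llvm` and the `CMAKE` override variable are hypothetical.

```shell
#!/bin/sh
# configure_llvm SRC_DIR PREFIX: run CMake with the flags described above.
# SRC_DIR is assumed to be the llvm/ subdirectory of the checkout.
# Set CMAKE=echo to print the arguments instead of running cmake.
configure_llvm() {
    src_dir=$1
    prefix=$2
    "${CMAKE:-cmake}" -G "Unix Makefiles" \
        -DCMAKE_BUILD_TYPE=Release \
        -DCMAKE_INSTALL_PREFIX="$prefix" \
        -DLLVM_ENABLE_PROJECTS="clang;openmp" \
        "$src_dir"
}

# Example call from inside the build directory (paths are assumptions):
#   configure_llvm ../llvm "$(pwd)/../install"
```

Passing the install prefix as an absolute path, as `$(pwd)/../install` does, keeps the install location independent of the directory from which `make install` is later run.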

If everything went right you should see something like the following towards the end of the output:

-- Found LIBOMPTARGET_DEP_VEO: /opt/nec/ve/veos/lib64/  
-- LIBOMPTARGET: Building offloading runtime library libomptarget.
-- LIBOMPTARGET: Not building aarch64 offloading plugin: machine not found in the system.
-- LIBOMPTARGET: Not building CUDA offloading plugin: CUDA not found in system.
-- LIBOMPTARGET: Not building PPC64 offloading plugin: machine not found in the system.
-- LIBOMPTARGET: Not building PPC64le offloading plugin: machine not found in the system.
-- LIBOMPTARGET: Building SX-Aurora VE offloading plugin.
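Because the CMake output is long, it can be easier to save it and search for the relevant line. The sketch below assumes the configure output was written to a log file, for example by piping it through `tee`; the function name `ve_plugin_enabled` and the file name `configure.log` are hypothetical.

```shell
#!/bin/sh
# ve_plugin_enabled LOG: succeeds if the saved CMake output in LOG
# reports that the SX-Aurora VE offloading plugin will be built.
ve_plugin_enabled() {
    grep -q 'LIBOMPTARGET: Building SX-Aurora VE offloading plugin' "$1"
}

# Example use (configure.log is a placeholder name):
#   cmake ... 2>&1 | tee configure.log
#   ve_plugin_enabled configure.log || echo "VE plugin will not be built" >&2
```

If the line is missing, the messages around the other LIBOMPTARGET lines usually indicate which dependency was not found.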

Now comes the time-consuming part:

 $ make -j8

Using the -j parameter (short for --jobs), you can allow make to run multiple commands concurrently. The number of cores in your machine is usually a reasonable choice and can speed up the compilation considerably.
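Instead of hard-coding the job count, it can be derived from the machine. This is a minimal sketch: the function name `detect_jobs` is hypothetical, `nproc` is assumed to come from GNU coreutils, and `getconf _NPROCESSORS_ONLN` serves as a fallback where `nproc` is missing.

```shell
#!/bin/sh
# detect_jobs: print the number of available processors,
# falling back to 1 if it cannot be determined.
detect_jobs() {
    nproc 2>/dev/null ||
        getconf _NPROCESSORS_ONLN 2>/dev/null ||
        echo 1
}

# Example use:
#   make -j"$(detect_jobs)"
```

On a shared machine or with limited memory, a smaller value than the full core count may be the better choice, since linking LLVM can be memory-intensive.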

Afterwards the built libraries and binaries need to be installed:

 $ make -j8 install


Following the instructions up to this point, you should now have a fully working Clang compiler with support for OpenMP offloading to VE devices. You can now compile code that offloads to VE devices with:

 $ clang -fopenmp -fopenmp-targets=aurora-nec-veort-unknown
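To try the compiler, a small test program with an OpenMP target region is enough. The sketch below writes such a program to a file; the function name `write_offload_test` and the file name `offload_test.c` are hypothetical, and the program itself is an illustrative example, not taken from the original instructions.

```shell
#!/bin/sh
# write_offload_test FILE: write a minimal C program with an OpenMP
# target region to FILE. Hypothetical helper for illustration only.
write_offload_test() {
    cat > "$1" <<'EOF'
#include <stdio.h>

int main(void) {
    long sum = 0;
    /* Offload the loop; sum is copied to the device and back. */
    #pragma omp target map(tofrom: sum)
    for (int i = 0; i < 1000; i++)
        sum += i;
    printf("sum = %ld\n", sum);
    return 0;
}
EOF
}

# Example call (file name is a placeholder):
#   write_offload_test offload_test.c
```

The generated file could then be compiled with the flags shown above, for example `clang -fopenmp -fopenmp-targets=aurora-nec-veort-unknown offload_test.c -o offload_test`, assuming the `clang` found first on your PATH is the one installed under the prefix chosen earlier.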