EasyBuild (Admin Guide)
EasyBuild (EB) is a Python framework that automates software installation and the creation of environment modules in an HPC environment. Most software is compiled directly from source to produce architecture-optimized, performant builds. This article briefly describes the first steps of setting up EasyBuild. Its main focus, however, is a sustainable directory structure and configuration that helps to keep an overview of the installed software while still allowing frequent updates to new versions. A more complete guide on how to set up EasyBuild can be found in the official documentation.
- /sw/easybuild/ is the main folder and accessible from all compute nodes
- /sw/easybuild/default will be the default path for your EB installation
- /sw/easybuild/stacks/skylake/2020a is a path for a specific CPU arch and software stack
- CentOS 7 (otherwise the list of dependencies for EB can vary)
- LMOD >= v8.3.4 (the current versions from the OpenHPC repos are not sufficient)
- Python 3 (You can also use Python 2, but at this point you really shouldn’t)
Initial installation and configuration
    # INSTALL SYSTEM DEPS
    yum install epel-release
    yum install python3 git gcc gcc-c++ libibverbs-devel patch openssl-devel

    # DOWNLOAD THE BOOTSTRAP EB SCRIPT AND BOOTSTRAP EB
    curl -O https://raw.githubusercontent.com/easybuilders/easybuild-framework/develop/easybuild/scripts/bootstrap_eb.py
    python3 bootstrap_eb.py /sw/easybuild/default

    # UPDATE $MODULEPATH, AND LOAD THE EasyBuild MODULE
    module use /sw/easybuild/default/modules/all   # add this to your bashrc
    module load EasyBuild

    # CREATE DIRS AND A FIRST CONFIGFILE FOR YOUR CPU ARCHITECTURE
    mkdir -p /sw/easybuild/configfiles   # directory for your configs
    mkdir -p /sw/easybuild/sources
    mkdir -p /sw/easybuild/stacks/skylake/2020a
    vim /sw/easybuild/configfiles/skylake-2020a.cfg   # Enter the values from the example below

    # CREATE AN ALIAS IN YOUR BASHRC FOR EVERY SPECIFIC CONFIGFILE, E.G.
    alias ebsky-2020a='eb --configfile=/sw/easybuild/configfiles/skylake-2020a.cfg'

    # YOU CAN LIST ALL AVAILABLE EASYCONFIG PARAMETERS WITH
    eb -a
    # OR YOU REDIRECT THE OUTPUT OF eb --confighelp TO GET AN ANNOTATED CONFIGFILE
    eb --confighelp >> myconfig.cfg

    # TO SHOW YOUR CURRENT EB CONFIGURATION USE
    eb --show-config
    ebsky-2020a --show-config

    # NOTE: If you install software with EB and just use 'eb' you will install it into your default folder!
Example usage of EasyBuild
    # Basic usage
    eb -h   # short help
    eb -H   # complete list of all options

    # Dry-run installation of the foss-2020a toolchain
    ebsky-2020a foss-2020a.eb -r -D   # -D == dry-run / -r == install deps
    ebsky-2020a foss-2020a.eb -r -M   # -M == show only missing dependencies

    # Real installation with all deps
    ebsky-2020a foss-2020a.eb -r
Inside /sw/easybuild/stacks/, a folder is created for every supported CPU architecture (broadwell, skylake, etc.). Each of these contains subfolders named after the EB toolchain releases (e.g. 2018b, 2019a), which hold the modules and the installed software. In addition, the folder easybuild_repo can be found here. The easybuild_repo folder is synced via a local git repository.
    /sw
    └── easybuild
        ├── configfiles
        ├── custom_easyconfigs
        ├── hooks      # --> Python script which can alter the build process
        ├── sources    # --> place to store all downloaded source-files; prevents downloading a source multiple times
        └── stacks
            ├── broadwell
            └── skylake
                ├── 2018b
                └── 2019a
                    ├── modules
                    ├── software
                    ├── ...
                    └── ...
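The skeleton of the tree above can also be created by a small script, which is handy when a new architecture or toolchain release is added. The following is only a sketch: the function name create_layout and the shape of its stacks argument are assumptions made for this example, and the modules and software subfolders are left out because EasyBuild creates them itself during the first installation.

```python
from pathlib import Path

# Top-level folders as shown in the tree of this guide.
TOP_LEVEL = ["configfiles", "custom_easyconfigs", "hooks", "sources", "stacks"]


def create_layout(root, stacks):
    """Create the EasyBuild folder skeleton below `root`.

    `stacks` maps a CPU architecture to a list of toolchain releases,
    e.g. {"skylake": ["2019a", "2020a"]}.
    Returns the list of stack prefixes (one per architecture and release).
    """
    root = Path(root)
    for name in TOP_LEVEL:
        (root / name).mkdir(parents=True, exist_ok=True)

    prefixes = []
    for arch, releases in stacks.items():
        for release in releases:
            prefix = root / "stacks" / arch / release
            # exist_ok=True makes the script safe to re-run later
            prefix.mkdir(parents=True, exist_ok=True)
            prefixes.append(prefix)
    return prefixes
```

Calling create_layout("/sw/easybuild", {"skylake": ["2020a"]}) would produce the same folders as the mkdir commands in the installation section.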
Further Configuration of EasyBuild
For every toolchain release there is a base configuration file inside the folder configfiles. Personal settings of individual admins can be placed in local config.cfg files in ~/.config/easybuild/. Config files are passed to eb via --configfile. Files which are listed within --configfile are treated first.
Example of a basic skylake-2020a.cfg:
    [config]
    prefix = /sw/easybuild/stacks/skylake/2020a
    module-naming-scheme = HierarchicalMNS
    sourcepath=/sw/easybuild/sources/
    robot-paths=/sw/easybuild/custom_easyconfigs:%(DEFAULT_ROBOT_PATHS)s
    group-writable-installdir=true
Using different config files for different architectures and toolchain releases helps to handle heterogeneous systems and to keep an overview of installed software.
Create a bash alias for every config file for ease of use.
alias ebsky-2020a='eb --configfile=/sw/easybuild/configfiles/skylake-2020a.cfg'
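With several architectures and releases, the config files and aliases differ only in two values, so they can be generated from one template. The sketch below assumes the settings of the example config shown in this guide; the helper names and the SHORT_NAMES table (including the abbreviation "bdw") are made up for illustration.

```python
# Template with the same settings as the example skylake-2020a.cfg.
# %(DEFAULT_ROBOT_PATHS)s is left untouched; it is resolved by EasyBuild.
CONFIG_TEMPLATE = """[config]
prefix = {root}/stacks/{arch}/{release}
module-naming-scheme = HierarchicalMNS
sourcepath = {root}/sources/
robot-paths = {root}/custom_easyconfigs:%(DEFAULT_ROBOT_PATHS)s
group-writable-installdir = true
"""

# Hypothetical abbreviations used in the alias names ("sky" as in ebsky-2020a).
SHORT_NAMES = {"skylake": "sky", "broadwell": "bdw"}


def render_config(arch, release, root="/sw/easybuild"):
    """Return the text of the config file for one architecture and release."""
    return CONFIG_TEMPLATE.format(root=root, arch=arch, release=release)


def render_alias(arch, release, root="/sw/easybuild"):
    """Return the matching bash alias line for the bashrc."""
    short = SHORT_NAMES.get(arch, arch)
    cfg = "{}/configfiles/{}-{}.cfg".format(root, arch, release)
    return "alias eb{}-{}='eb --configfile={}'".format(short, release, cfg)


print(render_alias("skylake", "2020a"))
```

The output of render_config would be written to configfiles/<arch>-<release>.cfg, and the alias line appended to the bashrc of each admin.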
Archive and self-written Easyconfig files
After every successful build, the Easyconfig file that was used is archived in stacks/$ARCH/$RELEASE/easybuild_repo/. Self-written Easyconfig files can be stored in the folder custom_easyconfigs and will be considered when searching for software. A guide on how to write your own Easyconfig file can be found in the official EasyBuild documentation.
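Easyconfig files use Python syntax, so a self-written one is essentially a list of variable assignments. The following minimal sketch is for a purely hypothetical software package called mytool; the URLs are placeholders, and checksums, dependencies and build options are omitted. By the usual naming convention it would be saved as custom_easyconfigs/mytool-1.0-foss-2020a.eb.

```python
# mytool-1.0-foss-2020a.eb (hypothetical software, for illustration only)

# Generic easyblock: configure / make / make install
easyblock = 'ConfigureMake'

name = 'mytool'
version = '1.0'

homepage = 'https://example.org/mytool'           # placeholder URL
description = "Hypothetical tool used to illustrate the easyconfig format."

# Build with the toolchain of the software stack used in this guide
toolchain = {'name': 'foss', 'version': '2020a'}

source_urls = ['https://example.org/downloads']   # placeholder URL
sources = ['%(name)s-%(version)s.tar.gz']         # templates are resolved by EasyBuild

# Files and directories that must exist after the installation
sanity_check_paths = {
    'files': ['bin/mytool'],
    'dirs': [],
}

moduleclass = 'tools'
```

A dry run such as ebsky-2020a mytool-1.0-foss-2020a.eb -r -D is a cheap way to check that the file is found and parsed before starting a real build.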
SLURM can be used as a job backend to compile multiple programs at the same time. Dependencies are resolved automatically and the jobs are submitted in the correct order. Use the flag --job when you run eb. You can add the following to your config files and adjust the values as needed.
    job-backend=Slurm
    job-cores=<NCORES>
    job-max-jobs=<NJOBS>
Hooks are small Python scripts which can directly influence the build process. They can be used, for example, to make site-specific adjustments to an Easyconfig file without having to create a completely new one each time.
An example hook script is given below. It adds some Slurm-specific configuration, adds flags to every OpenMPI build and points to a license file for Intel software installations:
    import os
    import sys

    from easybuild.tools.build_log import print_msg


    def start_hook(*args, **opts):
        if "--job" in sys.argv:
            # Check if env var was set, otherwise fall back to a default
            slurm_partition = os.getenv("SBATCH_PARTITION")
            if slurm_partition is not None:
                print_msg("[start-hook] SBATCH_PARTITION ENV VAR set: %s." % slurm_partition)
            else:
                slurm_partition = "normal"

            slurm_mem_per_node = os.getenv("SBATCH_MEM_PER_NODE")
            if slurm_mem_per_node is not None:
                print_msg("[start-hook] SBATCH_MEM_PER_NODE ENV VAR set: %s." % slurm_mem_per_node)
            else:
                slurm_mem_per_node = "36G"

            import easybuild.tools.job.slurm as slurm

            # Replace the SlurmJob class by a subclass with site-specific job settings
            class slurm_job(slurm.SlurmJob):
                def __init__(self, *args, **opts):
                    super(slurm_job, self).__init__(*args, **opts)
                    self.job_specs['partition'] = slurm_partition
                    self.job_specs['mem'] = slurm_mem_per_node
                    self.job_specs['time'] = '12:00:00'

            slurm.SlurmJob = slurm_job
            print_msg("[start-hook] using partition << %s >> " % slurm_partition)


    def pre_prepare_hook(self, *args, **kwargs):
        # SET PATH TO INTEL LICENSE FILE
        if self.name in ["icc", "ifort", "itac", "VTune"]:
            self.cfg['license_file'] = "/sw/licenses/USE_SERVER.lic"
            self.log.info("[pre-prepare hook] Setting path to license file: %s" % self.cfg['license_file'])
            print_msg("Intel license file: %s" % self.cfg['license_file'])


    def pre_configure_hook(self, *args, **kwargs):
        if self.name == 'OpenMPI':
            # Enable slurm and pmi support
            extra_opts = "--with-slurm --with-pmi"
            # Now add the options
            self.log.info("[pre-configure hook] Adding %s" % extra_opts)
            self.cfg.update('configopts', extra_opts)
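Because a step hook only touches a few attributes of the object it receives (name, cfg, log), it can be tried out without starting a real build. The sketch below repeats the pre_configure_hook from the example and calls it with stand-in objects; the Fake* classes are hypothetical helpers written for this illustration and are not part of EasyBuild, and the way FakeConfig.update appends the value with a space is an assumption about how the real method behaves.

```python
class FakeLog:
    """Stand-in for the EasyBuild logger (hypothetical helper)."""

    def info(self, msg, *args):
        pass


class FakeConfig:
    """Stand-in for the easyconfig object (hypothetical helper)."""

    def __init__(self, **params):
        self._params = dict(params)

    def __getitem__(self, key):
        return self._params[key]

    def __setitem__(self, key, value):
        self._params[key] = value

    def update(self, key, value):
        # Assumption: the new value is appended, separated by a space.
        self._params[key] = (self._params.get(key, "") + " " + value).strip()


class FakeEasyBlock:
    """Stand-in for the object passed to step hooks (hypothetical helper)."""

    def __init__(self, name, configopts=""):
        self.name = name
        self.cfg = FakeConfig(configopts=configopts)
        self.log = FakeLog()


def pre_configure_hook(self, *args, **kwargs):
    """Add Slurm and PMI support to every OpenMPI build."""
    if self.name == 'OpenMPI':
        extra_opts = "--with-slurm --with-pmi"
        self.log.info("[pre-configure hook] Adding %s" % extra_opts)
        self.cfg.update('configopts', extra_opts)


openmpi = FakeEasyBlock("OpenMPI", configopts="--enable-shared")
pre_configure_hook(openmpi)
print(openmpi.cfg["configopts"])   # --enable-shared --with-slurm --with-pmi

other = FakeEasyBlock("zlib", configopts="--enable-shared")
pre_configure_hook(other)
print(other.cfg["configopts"])     # --enable-shared (unchanged)
```

For real builds, the hooks file is passed to eb with the --hooks option, or set once with a hooks entry in the config file.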
About every 6 months EasyBuild releases new toolchains which combine a set of specific modules for compilers, MPI and numerical libraries (cf. common toolchains). The two most common are:
- intel + intelcuda –> icc/ifort, iMPI, MKL
- foss + fosscuda –> gcc, OpenMPI, OpenBLAS, LAPACK, ScaLAPACK, FFTW
Modules in this example are created automatically using the hierarchical module naming scheme. Other options are available as well (the default being EasyBuildMNS), including the option to create your own site-specific module naming scheme. See the EasyBuild documentation on module naming schemes for details.
A meta-module in this case is a module which makes a different module path available to the user while taking care of the correct path for the current architecture. An example of such a module is given below. These modules can e.g. reside in the folder /sw/easybuild/meta-modules/palma. The software in this example is called palma and the versions correspond to the EB toolchain releases, e.g. 2019b.lua, 2020a.lua etc. The CPU_ARCH environment variable which is used here is exported in a modules.sh script within /etc/profile.d/.
    help([==[

    Description
    ===========
    This is a meta module giving you access to the PALMA 2020a software stack.
    Software on PALMA is built using the EasyBuild Python Framework.
    Supported CPU Architectures: skylake

    More information
    ================
     - PALMA: https://confluence.uni-muenster.de/display/HPC
     - EasyBuild: https://easybuild.readthedocs.io
    ]==])

    whatis([==[This is a meta module giving you access to the PALMA 2020a software stack. Software on PALMA is built using the EasyBuild Python Framework.]==])
    whatis([==[PALMA: https://confluence.uni-muenster.de/display/HPC]==])
    whatis([==[EasyBuild: https://easybuild.readthedocs.io]==])

    local version = "2020a"
    local root = "/sw/easybuild/stacks"
    local cpu_arch = os.getenv("CPU_ARCH")
    local suffix = "/modules/all/Core"
    local hostname = subprocess("hostname -s")

    -- LmodMessage("cpu_arch = ", cpu_arch)
    -- LmodMessage("hostname = ", hostname)

    if (string.find(hostname, "^r13n[01-12].*")) then
        cpu_arch = cpu_arch .. "-IB"
    end

    -- ONLY SKYLAKE HAS THIS TOOLCHAIN AT THE MOMENT
    if (cpu_arch ~= "skylake") then
        -- THROWS AN ERROR MESSAGE AND EXITS
        LmodError(version, " IS NOT YET AVAILABLE ON THE ", cpu_arch, " ARCHITECTURE.")
    end

    conflict("palma")
    prepend_path("MODULEPATH", pathJoin(root, cpu_arch, version, suffix))
    -- add_property("lmod",)
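The core decision of the meta-module (check whether the release exists for the current architecture, then build the module path) is mirrored below in Python purely for illustration, for instance as a starting point for a script that checks or generates meta-modules. The SUPPORTED table and the function name are assumptions made for this sketch; the hostname-based "-IB" special case is left out.

```python
import posixpath

ROOT = "/sw/easybuild/stacks"
SUFFIX = "modules/all/Core"

# Hypothetical table: toolchain release -> architectures it was built for.
SUPPORTED = {"2020a": ["skylake"]}


def stack_modulepath(version, cpu_arch):
    """Return the path a meta-module would prepend to MODULEPATH.

    Raises ValueError if the release is not available for `cpu_arch`,
    which corresponds to the LmodError call in the Lua module.
    """
    if cpu_arch not in SUPPORTED.get(version, []):
        raise ValueError(
            "%s is not yet available on the %s architecture" % (version, cpu_arch)
        )
    return posixpath.join(ROOT, cpu_arch, version, SUFFIX)


print(stack_modulepath("2020a", "skylake"))
# /sw/easybuild/stacks/skylake/2020a/modules/all/Core
```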
Using the correct modules path
We created a modules.sh file inside of /etc/profile.d/ on the compute and login nodes instead of symlinking directly to lmod/init/profile. This deals with some issues Lmod encounters with Slurm. In case of an interactive Slurm session, it also has to reset $MODULEPATH before updating/reloading the currently loaded modules. Otherwise updating the modules will not unload the old path, since we might be on a node with a different architecture than the login node.
    #!/bin/sh
    # NOTE: In a slurm batch job (non-interactive, non-login) this file is not sourced.
    # Only the file where $BASH_ENV points to is sourced in this case.

    # CPU ARCH ENV VARIABLE FOR META MODULES
    if [ -f "/sys/devices/cpu/caps/pmu_name" ]; then
        export CPU_ARCH=`cat /sys/devices/cpu/caps/pmu_name`
    fi

    # Special treatment if we are in an interactive slurm session
    if [ ! -z "$SLURM_NODELIST" ]; then
        echo "YOU ARE NOW IN AN INTERACTIVE SLURM JOB"
        # Reset module path
        export MODULEPATH=/sw/easybuild/meta-modules/
        # Update the currently loaded modules to account for architecture specific paths
        module update
        # re-source the bash init to gain module autocomplete functionality in an interactive session
        . /opt/lmod/lmod/init/bash >/dev/null
        return
    fi

    # Enable bash module support
    . /opt/lmod/lmod/init/profile >/dev/null
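The same architecture detection can be reused on the admin side, for example to pick the matching EasyBuild config file on whatever node a build is started. The sketch below reads the same pmu_name file as modules.sh; the function names are assumptions for this example, and the config file naming follows the <arch>-<release>.cfg pattern used in this guide.

```python
from pathlib import Path

PMU_NAME_FILE = "/sys/devices/cpu/caps/pmu_name"   # same file modules.sh reads
CONFIG_DIR = "/sw/easybuild/configfiles"


def detect_cpu_arch(pmu_file=PMU_NAME_FILE):
    """Return the CPU architecture name, or None if the file does not exist."""
    path = Path(pmu_file)
    if not path.is_file():
        return None
    return path.read_text().strip()


def configfile_for(release, cpu_arch):
    """Return the config file path for a toolchain release and architecture."""
    return "{}/{}-{}.cfg".format(CONFIG_DIR, cpu_arch, release)


arch = detect_cpu_arch()
if arch is not None:
    print(configfile_for("2020a", arch))
```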
- There are still a lot of (low-level) modules for system libraries exposed to the users –> A solution can be using hidden modules
- Instead of using meta-modules, one can use bind mounts on every node, such that the architecture-specific folder points to a general EasyBuild install folder (symlinks are known to cause issues)
- An automated build process for all architectures
- ‘module spider’ output pointing to toolchain instead of compiler plus MPI library