This section of best practice is based on the wisdom distilled from the EasyBuild Workshop at the University of Birmingham, 22nd October 2018, hosted by Andrew Edmondson, Research Software Group Leader, Advanced Research Computing, with Kenneth Hoste, lead EasyBuild developer at the University of Ghent.
22 people from the UK academic HPC community attended the event.
Many thanks to Andrew for organising it, and to Kenneth and the University of Ghent for presenting, as well as the other volunteer presenters giving their experiences with EasyBuild.
The agenda for the meeting was:
10:00 Registration and coffee
10:15 EasyBuild introduction/history/future (Kenneth Hoste)
11:00 How it’s done at various universities
11:45 Questions and open discussion of common problems
13:00 Contributing back to the community (Kenneth Hoste)
14:00 Workshop time where we help each other solve problems
The presentations are attached here.
- Kenneth Hoste: EasyBuild_20181022_HPC-SIG_UK
- University of Birmingham: EasyBuild Workshop
- University of Sussex: UniofSussex-with-Kenneth-feedback
Questions and Answers
Managing package updates
Q: How do you manage package updates? Say I want to rebuild a package with some additional compiler options. Do you just build in place, or build a new version?
A: It is best to build a new version, customising the script to add additional qualifiers, so that the old version is retained for reproducibility. An in-place build can also be done.
Q: How do you manage package updates? Say, a move to abc-1.2.3-foss-2018b? Can you automate this along with the dependencies?
A: Yes. For example, if you have Python 3.7 you can try:
eb original-version.eb --try-software-version 3.7.1
This keeps the original dependencies, which may or may not be correct, so you can use the cached EB file to create a new EB file if required.
eb easybuild-file.eb --try-toolchain foss,2018.08
In this case new versions of the dependencies are created.
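The two commands above can be generated from a script when many packages need the same treatment. The following is a minimal sketch, not part of EasyBuild: the helper name eb_try_command is hypothetical, and it only assembles the command line without running it. It assumes that --try-toolchain takes a name,version pair and that --robot enables dependency resolution.

```python
import shlex


def eb_try_command(easyconfig, software_version=None, toolchain=None, robot=False):
    """Assemble (but do not run) an eb command line using the --try-* options."""
    cmd = ["eb", easyconfig]
    if software_version is not None:
        # Reuse the existing easyconfig with a different software version.
        cmd += ["--try-software-version", software_version]
    if toolchain is not None:
        # toolchain is a (name, version) pair, passed to eb as "name,version".
        name, version = toolchain
        cmd += ["--try-toolchain", "{},{}".format(name, version)]
    if robot:
        # Ask eb to resolve and build missing dependencies as well.
        cmd.append("--robot")
    return cmd


bump_version = eb_try_command("original-version.eb", software_version="3.7.1")
bump_toolchain = eb_try_command(
    "easybuild-file.eb", toolchain=("foss", "2018.08"), robot=True
)

for command in (bump_version, bump_toolchain):
    # Print a shell-safe rendering that can be reviewed before it is executed.
    print(" ".join(shlex.quote(part) for part in command))
```

Printing the commands rather than executing them leaves room to inspect what would be built, which is useful given that the kept dependencies may or may not be correct.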
There may be dependency hell issues doing this, e.g. when compiling a lot of packages into R. You can create a bare-bones version and then side-load other modules with EB, or let the users compile against R, but this can cause issues for users. For dependencies you can use module load or module req in higher level modules (e.g. module load numpy-1.18-python-2.7.12 loads
Ghent builds against the new foss toolchain every 6 months. easyupdate?
Note that Anaconda installs binaries pre-compiled for a generic x86 platform, which may not be efficient, as they won’t necessarily include AVX, AVX2 and other optimisations.
Linking to existing Intel installation
Q: How do I set up intel-2017a or similar to find my existing Intel installation?
A: This can be done via external modules; see the documentation. Also use the metadata file to specify the prefix, etc., and pass the module to EasyBuild. You will need to create a custom toolchain for this, marking your existing Intel modules as EXTERNAL_MODULE in the toolchain definition. You must also have icc/2017a, etc. as modules. For MKL you will have to build the FFTW wrappers for MKL as well as MKL itself.
There is also an
gcc-system works well for GCC; it is unknown whether the equivalent works for icc. There is also a similar thing for Intel MPI. However, system-mkl does not exist for MKL.
This is more difficult than downloading the ICC installation files.
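As an illustration of the metadata file mentioned in the answer, the sketch below writes one INI-style section per existing module with name, version and prefix keys. The module names follow the text; the version numbers and installation prefixes are placeholders that must be replaced with the values of the local installation, and the helper render_metadata is hypothetical.

```python
import configparser
import io

# Module names follow the text above; versions and prefixes are placeholders
# and must be replaced with the values of the local Intel installation.
EXISTING_INTEL_MODULES = {
    "icc/2017a": {"name": "icc", "version": "2017.1.132", "prefix": "/opt/intel/2017"},
    "ifort/2017a": {"name": "ifort", "version": "2017.1.132", "prefix": "/opt/intel/2017"},
    "imkl/2017a": {"name": "imkl", "version": "2017.1.132", "prefix": "/opt/intel/2017/mkl"},
}


def render_metadata(modules):
    """Render one INI section per external module, keyed by the module name."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key names exactly as written
    for module_name, fields in modules.items():
        parser[module_name] = fields
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


metadata_text = render_metadata(EXISTING_INTEL_MODULES)
print(metadata_text)
```

The resulting text would be saved to a file and passed to EasyBuild with --external-modules-metadata, while the build definitions list the same module names as dependencies marked with EXTERNAL_MODULE.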
Further information from post-workshop experience:
It has turned out to be relatively simple to achieve this by using an intel/2017a module that includes icc/2017a, ifort/2017a, imkl/2017a and impi/2017a, each of which is a simple module that loads the corresponding pre-existing Intel module on the system. To get FFTW to work, the FFTW wrappers need to be compiled and the resulting libraries moved to the MKL library location. Instructions are available from Intel on how to do this.
Q: How do I define a custom toolchain to link with stuff I’ve already built?
A: Use modules, and use EXTERNAL_MODULE, but this is prone to failure.
Q: Should I define a custom toolchain?
A: Not recommended.
Further information from post-workshop experience:
EasyBuild provides YYYYb toolchains for GCC, Intel, etc., which may not match what is installed, but a YYYY.XX toolchain can be defined to use a different version. This mostly works, but some dependencies are not easily built and the process can be time-consuming, as --try-toolchain needs to be used to convert the nearest equivalent YYYYb EasyBuild models to YYYY.XX. Also, end-user software built in this way will not have gone through the EasyBuild test process. It is much faster to install the elements of the required toolchain.
Q: My builds end up failing with reloc errors. How do I work around this in EasyBuild?
A: This usually happens when binutils is not in the EasyBuild definition, or when different binutils versions are mixed.
How to avoid so many modules
Q: EasyBuild builds so much. How do I avoid every build creating a version of GCC, etc.?
A: Use fewer toolchain versions, and swap to new versions of toolchains. Also modules can be built as ‘hidden’ to avoid end-users seeing all the underlying dependencies.
Multiple ‘Flavours’ of modules
Q: The build seems to build several different ‘flavours’ of a version of flex, such as a plain one, GCC-5.4.0-2.26, etc. Is this necessary?
A: This happens for some dependencies. The plain flex would be the one built with the system tools. GCCcore needs binutils, which needs flex, so you need a system compiler version of flex. At that point flex-2.6.4 isn’t used, but EB doesn’t then build this. You could install these dependencies as hidden, using --hide-deps, so they don’t show up via module avail unless hidden modules are explicitly requested. module spider shouldn’t show them either, unless used as module --show-hidden spider ...
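The effect of hiding dependencies on a module listing can be pictured with a small filter. This is only a sketch of the idea, not how the module tool is implemented: it assumes the common convention that a hidden module is one whose version part begins with a dot, and the module names used are illustrative.

```python
def is_hidden(module_name):
    """A module counts as hidden here when its version part starts with a dot."""
    version = module_name.rsplit("/", 1)[-1]
    return version.startswith(".")


def listed_modules(installed, show_hidden=False):
    """Return the modules a listing would show, optionally including hidden ones."""
    return [name for name in installed if show_hidden or not is_hidden(name)]


# Illustrative module names; the dot-prefixed versions stand for hidden dependencies.
installed = [
    "GCC/5.4.0-2.26",
    "binutils/.2.26",
    "flex/.2.6.4",
    "R/3.5.1-foss-2018b",
]

default_view = listed_modules(installed)
full_view = listed_modules(installed, show_hidden=True)

print(default_view)  # what end-users see by default
print(full_view)     # what a listing with hidden modules shown would include
```

End-users only see the first list, while administrators can still inspect the build dependencies by asking for hidden modules explicitly.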
Failing finding multiple versions of a package
Q: Sometimes builds fail with issues of finding multiple versions of a package. What might I be doing wrong?
A: There are few to no instances where this should occur, as it is checked for the common toolchains in the EB testing procedure. Using uncommon toolchains may be problematic. It can sometimes be an issue with module loads.
LLVM (clang) and IBM XL
Q: Is there a way of supporting LLVM-based toolchains? Or IBM XL, etc? Or do I need to create toolchains for these?
A: clang: yes, but not flang. These toolchains are rarely used. IBM XL, xlmpich2: it is not known how well they work.
Continuous Integration with Jenkins
Q: Has anyone integrated EasyBuild with Jenkins for continuous integration testing of applications and open-sourced it so I can cheat and use that off-the-shelf?
A: Yes, this is being done by the Swiss National Supercomputing Centre.
Integration with Python
Q: Is it possible to directly integrate EasyBuild with Python scripts to do wider tasks (regenerate web docs) without running easybuild on the command line from Python?
A: This is possible, but not very easy, e.g. from easybuild.tools.run import run_cmd. EasyBuild is not expected to be usable as a library. There is a trick to make this work. Open an issue to get this fixed.
This will be improved in the upcoming EasyBuild v3.8.0, see https://github.com/easybuilders/easybuild-framework/pull/2638. If you want to use easybuild as a library, you can after calling
Q: Are there slides available, please?
A: Yes, will be on the website. (See links on this page).
Q: The program is great when it works. However, when it fails debugging the problem can be a bit of a pain. Are there any good procedures?
A: EasyBuild doesn’t really help with error reporting; you have to look in the log files. There is work to be done on this. AMBER can be difficult to build.
Q: How can we contribute EasyBuild recipes back to the community, in the UK and more widely?
A: See presentation.
Q: What is the benefit of minimal toolchains?
A: Minimal toolchains (--minimal-toolchains) are only relevant when you have hierarchies of modules.
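The idea behind minimal toolchains can be sketched as choosing, for each dependency, the lowest level of a toolchain hierarchy at which a build recipe is available. The code below is a simplified illustration under stated assumptions, not EasyBuild's actual resolution logic: the hierarchy shown (GCCcore below GCC below gompi below foss) and the set of available recipes are assumed for the example, and pick_minimal_toolchain is a hypothetical helper.

```python
# Assumed hierarchy for illustration, ordered from most minimal to full toolchain.
HIERARCHY = ["GCCcore-7.3.0", "GCC-7.3.0-2.30", "gompi-2018b", "foss-2018b"]


def pick_minimal_toolchain(dependency, available, hierarchy=HIERARCHY):
    """Return the most minimal toolchain for which a recipe of the dependency exists."""
    for toolchain in hierarchy:
        if (dependency, toolchain) in available:
            return toolchain
    return None  # no recipe found at any level of the hierarchy


# Hypothetical set of (software, toolchain) recipes that exist on a site.
available_recipes = {
    ("zlib", "GCCcore-7.3.0"),
    ("zlib", "foss-2018b"),
    ("HDF5", "foss-2018b"),
}

zlib_choice = pick_minimal_toolchain("zlib", available_recipes)
hdf5_choice = pick_minimal_toolchain("HDF5", available_recipes)
missing_choice = pick_minimal_toolchain("netCDF", available_recipes)

print(zlib_choice, hdf5_choice, missing_choice)
```

In this picture a library such as zlib is installed once at the lowest level and shared by everything above it, which only pays off when the modules are organised in such a hierarchy.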