Capabilities of Singularity containers
About this material
This guide was developed as support material for the online seminar “Singularity Container Capabilities”, held on March 5, 2021. The event was made possible by the EuroHPC project “National Competence Centers in the framework of EuroHPC” (EuroCC). The project started on September 1, 2020, and Latvia is represented in it by the HPC Center of Riga Technical University together with the Institute of Numerical Modelling of the University of Latvia. The EuroCC project aims to create a European network of supercomputing competence centres. It involves 33 countries and will last for 2 years, with total funding of more than 56 million EUR.
The task of the competence centres established during the project will be to create a unified support structure to promote the use of supercomputing opportunities in higher education, research, public administration, and industry. The Competence Centres will bring together the competencies, experience, and computing resources available in all EU countries.
Introduction
When using high-performance computing (HPC) resources, it often becomes important to use non-standard software that is not available in the form of installed, ready-to-use software modules. In this situation, you can ask the administrator for help installing the software. But often this does not completely solve the problem:
- The desired software may not be compatible with the operating system used in the cluster.
- Many different software packages with specific versions need to be installed.
- The user may want to use an identical environment on different clusters/workstations.
In this situation, containers are a good fit: a container is a fully equipped, encapsulated software environment that lets you run your calculations on different systems. To use the required software environment, all you have to do is copy or download the desired container file.
Demonstration of basic functionality
Below is an example of the basic functionality of software containers on the LU INM HPC cluster. The example assumes that a container with Anaconda Python (a distribution bundling many Python tools that is typically used to process HPC data) is already stored in the user’s folder.
# See that base OS offers only old and incomplete Python version (2.x.x.)
python --version
# It is possible to use software module provided by the administrators
# See that the module indeed offers feature-rich Anaconda Python (3.x.x.)
module avail
module load anaconda3/anaconda-2020.11
python --version
# However, there is an alternative – use of containers!
# Detach previously loaded modules
module purge
# Load 'singularity' module
module load singularity/3.4.1
# Use of available container (for executing a single command)
# See that, indeed, container features Anaconda Python (3.x.x.)
singularity exec ~/my_containers/my_anaconda.simg python --version
# Use of available container (for interactive shell)
singularity shell ~/my_containers/my_anaconda.simg
python --version
# While being in the container, see that all the user files are accessible
ls ~
The example shows that a container provides a complete, specific software environment without being as cumbersome as a separate virtual machine, which is usually not possible or suitable on HPC clusters. Note that containers can be used in queue system scripts like any other command.
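As a minimal sketch of container use inside a job script: the helper below wraps `singularity exec` and stops early when the image file is missing, so a queued job fails with a clear message. The function name `run_in_container` and the `IMAGE` variable are illustrative assumptions, not part of Singularity. Scheduler directives are omitted because they depend on the queue system of the particular cluster.

```shell
#!/bin/bash
# Image path taken from the demonstration above; override it via the IMAGE variable.
IMAGE="${IMAGE:-$HOME/my_containers/my_anaconda.simg}"

# Hypothetical helper: run a command inside the container,
# failing early if the image file does not exist.
run_in_container() {
    if [ ! -f "$IMAGE" ]; then
        echo "Container image not found: $IMAGE" >&2
        return 1
    fi
    singularity exec "$IMAGE" "$@"
}

# Inside a job script, after 'module load singularity/3.4.1':
#   run_in_container python my_analysis.py   # 'my_analysis.py' is a placeholder name
```

The helper passes all of its arguments through unchanged, so any command that works with `singularity exec` can be used with it.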
Singularity and Docker comparison
In recent years, software containerization has gained popularity in the form of the widely used platform Docker – the community of this platform has created many different, freely available container images in the DockerHub repository. However, there is an alternative containerization solution that is more suitable and popular in the HPC environment: Singularity. This solution has several advantages that make it particularly suitable for HPC:
- Unlike Docker, the use of Singularity containers does not require administrator rights. This allows you to make full use of these containers in an HPC environment. (Administrator rights are still required to create containers, however.)
- Singularity containers offer ready-made solutions for parallel calculations (OpenMPI) as well as GPU calculations.
- Singularity is compatible with Docker containers, allowing you to take advantage of the points above while allowing the use of the DockerHub repository.
Based on these advantages, as well as the fact that Singularity is available on both the LU INM cluster and the RTU HPC Center cluster, this guide specifically addresses the use of Singularity containers.
Obtaining DockerHub containers
The previous demonstration with the Anaconda Python container was possible because it was obtained from the DockerHub repository. To obtain/build this container file, proceed as follows:
# Load 'singularity' module
module load singularity/3.4.1
# Obtain container using the link seen on DockerHub
# ‘pull’ will download the container within a few seconds
singularity pull ~/my_containers/my_anaconda.simg docker://continuumio/anaconda3
# Optionally – you can also use ‘build’ command
# This converts the image to the newest available image format
# In this case, build takes around 3 min
singularity build ~/my_containers/my_anaconda.simg docker://continuumio/anaconda3
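Because downloading or building can take a while, it may be convenient to skip the step when the image file already exists. The sketch below assumes a helper named `pull_if_missing` (a hypothetical name) that takes the target file and the source link as arguments and only calls `singularity pull` when needed.

```shell
# Hypothetical helper: download an image only when it is not already present.
# Arguments: <target image file> <source link>
pull_if_missing() {
    if [ -f "$1" ]; then
        echo "Already present: $1"
        return 0
    fi
    # Create the target folder if needed, then download the image
    mkdir -p "$(dirname "$1")"
    singularity pull "$1" "$2"
}

# Usage, after 'module load singularity/3.4.1':
#   pull_if_missing ~/my_containers/my_anaconda.simg docker://continuumio/anaconda3
```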
The DockerHub repository contains a number of different containers, which contain sets of software tools suitable for typical tasks in various fields – physics calculations, data processing, and IT solution tools (e.g., databases). However, there are several repositories besides DockerHub, where software containers are freely available.
Title | URL | singularity pull prefix |
Singularity Library | https://cloud.sylabs.io/library | singularity pull library:// |
Docker Hub | https://hub.docker.com | singularity pull docker:// |
Singularity Hub | https://singularity-hub.org | singularity pull shub:// |
NVIDIA GPU Cloud | https://ngc.nvidia.com | singularity pull docker://nvcr.io/ |
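The prefixes in the table can be captured in a small helper that turns a repository keyword and a container name into the link expected by `singularity pull`. This is only an illustrative sketch: the function name `container_uri` and the keywords `library`, `docker`, `shub` and `ngc` are assumptions chosen here, while the prefixes themselves come from the table.

```shell
# Hypothetical helper: build a pull link from a repository keyword and a container name.
container_uri() {
    case "$1" in
        library) echo "library://$2" ;;         # Singularity Library
        docker)  echo "docker://$2" ;;          # Docker Hub
        shub)    echo "shub://$2" ;;            # Singularity Hub
        ngc)     echo "docker://nvcr.io/$2" ;;  # NVIDIA GPU Cloud
        *)       echo "Unknown repository: $1" >&2; return 1 ;;
    esac
}

# Usage:
#   singularity pull ~/my_containers/my_anaconda.simg "$(container_uri docker continuumio/anaconda3)"
```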
Installing Singularity and creating containers from scratch
Despite the extensive repository resources, sometimes it is necessary to create your own, specific container. Because this operation requires administrator privileges, it would typically be performed on your personal computer (or a Linux virtual machine available to you).
The example below considers the creation of a container for using ESPResSo software, a specific tool for soft matter physics simulations. Installing this software requires several different additional tools, which can be time-consuming for the user and HPC cluster administrators. This situation makes the ESPResSo program a great candidate for a container demonstration.
Installing Singularity on your personal computer
To be able to build your own containers, you need to install Singularity on a Linux computer with administrator privileges. For those who do not use Linux on their personal computer, it is worth mentioning that any computer with Windows 10 Enterprise / Pro already has a built-in option to run virtual instances of Linux. The example below was implemented on an Ubuntu 20.04 LTS virtual machine. Following this Singularity installation guide, a number of dependencies were first installed:
sudo apt-get update && sudo apt-get install -y \
build-essential \
libssl-dev \
uuid-dev \
libgpgme11-dev \
squashfs-tools \
libseccomp-dev \
wget \
pkg-config \
git \
cryptsetup
To be able to compile and use Singularity, you need Go programming tools, which can be downloaded from golang.org. The installation is performed as follows:
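# Extracting the downloaded archive (writing to /usr/local requires sudo)
sudo tar -C /usr/local -xzf go1.16.linux-amd64.tar.gz
# Adding the ‘go’ folder to PATH
# (append this line to $HOME/.profile to make the change permanent)
export PATH=$PATH:/usr/local/go/bin
# If the line was added to $HOME/.profile, log in again or execute:
source $HOME/.profile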
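Before compiling Singularity, it can be useful to confirm that the build tools are actually reachable from the current shell. The sketch below is an assumption-based check, not part of the official installation guide: `missing_tools` is a hypothetical helper, and the list of commands is an illustrative subset of what the packages above and the Go archive provide.

```shell
# Hypothetical helper: print the names of the given commands that are not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Illustrative subset of commands provided by the packages installed above, plus 'go'
required="gcc make wget git pkg-config mksquashfs go"
missing=$(missing_tools $required)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All checked build prerequisites were found."
fi
```

If `go` is reported as missing right after the installation, the PATH change has most likely not been applied to the current shell yet.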
You can now continue with the installation of Singularity itself. The source code of the latest version can be downloaded from the project's GitHub site.
# Extracting archive and changing directory
tar -xzf singularity-3.7.1.tar.gz
cd singularity
# Install singularity by specifying location (here /usr/local is chosen)
./mconfig -b ./buildtree -p /usr/local
cd ./buildtree
make
sudo make install
# Check if installation was successful
singularity version
After completing the steps above, Singularity has been successfully installed and it is possible to use Singularity on your computer and prepare containers for use elsewhere, e.g., HPC clusters.
Create container using a recipe file
You can create your own containers using the container recipe file. Its contents indicate what software, what parameters and other functional aspects should be included within the container file that is about to be built. You can read about the blocks and parameters in the contents of these recipe files in this manual. However, it is extremely valuable to get acquainted with and base your work on the various examples of recipe files compiled in this GitHub repository.
Below is an example of a recipe file that creates a container with the aforementioned ESPResSo software. As you can see from the contents of this file, the procedure is similar to working on a typical computer: for the most part, the file consists of the %post section, whose contents are identical to the ESPResSo installation guide.
Bootstrap: docker
From: ubuntu:18.04
%help
Container with ESPResSo
ESPResSo means: Extensible Simulation Package for Research on Soft Matter
Website: http://espressomd.org/wordpress/
Main binary for using ESPResSo via this container is 'pypresso'.
%post
# Updating repositories and packages
apt-get update && apt-get -y upgrade
# Installing all dependencies (as shown on ESPResSo website)
apt-get -y install wget tar build-essential \
cmake cython3 python3-numpy libboost-all-dev \
openmpi-common fftw3-dev libhdf5-dev \
libhdf5-openmpi-dev python3-opengl libgsl-dev
# Installing ESPResSo (as shown on ESPResSo website)
cd ~
wget -c \
https://github.com/espressomd/espresso/releases/download/4.1.4/espresso-4.1.4.tar.gz
tar zxvf espresso-4.1.4.tar.gz
mkdir /usr/bin/espresso && cd /usr/bin/espresso
cmake ~/espresso
cmake --build .
# Cleaning up
apt-get clean
apt-get -y autoremove
%environment
# Use the plain C locale to avoid locale-related warnings inside the container
export LC_ALL=C
# Adding ESPResSo directory to PATH, so that the command 'pypresso' is available
export PATH=/usr/bin/espresso:$PATH
Once the recipe file has been created and saved as ESPResSo_container_recipe.def, you can build a container using the command below.
sudo singularity build ESPResSo_container.sif ESPResSo_container_recipe.def
In this case, the process took 5-10 minutes (downloading the basic environment and packages, compiling the specific software). The size of the container file created at the end is about 400 MB. Once a container file has been created, it can be used in the same way as previously shown, with exec and shell commands. The container file can be copied and used elsewhere, including an HPC cluster.
singularity exec ESPResSo_container.sif pypresso <pypresso script file>
Using containers with GPU-compute capabilities
Although containers are encapsulated software environments, their execution can be linked to the resources and hardware of the base operating system: a container can use the system's GPU cards and can be supplied with external software (such as CUDA libraries). Below is a brief example of using a TensorFlow container with GPU acceleration. TensorFlow is a popular tool for machine learning tasks.
# Load 'singularity' module
module load singularity/3.4.1
# Load CUDA library
# This module tells the system where to find the CUDA commands
# (‘/bin’ folder) and defines the CUDA_HOME variable
module load cuda/cuda-10.2
# Use the container from DockerHub with GPU support enabled (‘--nv’)
# and the CUDA library folder bound into the container (‘--bind ..’).
singularity shell --nv --bind ${CUDA_HOME} \
~/my_containers/tensorflow_2.3.1-gpu.sif
Similarly, any other external software can be passed to the container by using the bind parameter. Thus, containers can be made smaller: for example, it is not necessary to include Anaconda Python in a container if it is clearly known that it can be passed through with the bind parameter. In this case, only the specific/non-standard software should be included in the container.
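Since the bind option depends on the CUDA_HOME variable, a missing `module load` of the CUDA module would pass an empty value to `--bind`. A minimal sketch of a guard is shown below: `gpu_options` is a hypothetical helper that prints the options used in the example above, or fails when CUDA_HOME does not point to an existing folder. It assumes the CUDA path contains no spaces.

```shell
# Hypothetical helper: print the options for a GPU-enabled run, or fail if
# CUDA_HOME is unset or is not an existing folder (e.g., CUDA module not loaded).
gpu_options() {
    if [ -z "${CUDA_HOME:-}" ] || [ ! -d "$CUDA_HOME" ]; then
        echo "CUDA_HOME is not set to an existing folder; load the CUDA module first" >&2
        return 1
    fi
    echo "--nv --bind $CUDA_HOME"
}

# Usage, after 'module load singularity/3.4.1' and 'module load cuda/cuda-10.2':
#   opts=$(gpu_options) && singularity shell $opts ~/my_containers/tensorflow_2.3.1-gpu.sif
```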
The topic of this last section is analyzed in detail in the presentation shown during the seminar on March 5. It is available as a PDF file: