Installing software

How to install unavailable software?

Basic principles

  • On a cluster, it’s important that software is available and identical on all nodes, both login and compute nodes (see Workload scheduler). For self-installed software, it’s easier to install the software in one shared location than installing and maintaining the same software separately on every single node. You should therefore install your software on one of the network shares (eg, your $HOME folder or an umbrella or bulk folder) that are accessible from all nodes (see Storage).

  • As a regular Linux user you don’t have administrator rights. Yet, you can do your normal work, including installing software in a personal folder, without needing administrator rights. Consequently, you don’t need (nor are you allowed) to use the sudo or su commands that are often shown in manuals.

  • DAIC provides only 8GB of storage in the /home directories and the project spaces (/tudelft.net/...) are Windows-based leading to problems installing packages with pip due to file permission errors. However, /tudelft.net/... locations are mounted on all nodes. Therefore, the recommened way of using your own software and environments is to use containerization and to store your containers under /tudelft.net/staff-umbrella/.... Check out the Apptainer tutorial for guidance.

Managing environments

Conda/Mamba

Conda and Mamba are both package management and environment management tools used primarily in the data science and programming communities. Conda, developed by Anaconda, Inc., allows users to manage packages and create isolated environments for different projects, supporting multiple languages like Python and R. Mamba is a more recent alternative to Conda that offers faster performance and improved dependency solving using the same package repositories as Conda. Both tools help avoid dependency conflicts and simplify the management of software packages and environments. You can install it with:

Use module load conda

Miniconda is available as module.

$ module use /opt/insy/modulefiles  # If not already
$ module load miniconda 
$ which conda 
/opt/insy/miniconda/3.9/bin/conda

Creating a conda environment

To create a new environment you can run conda create:

$ conda create -n env
Collecting package metadata (current_repodata.json): done
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.10.1
  latest version: 24.3.0

Please update conda by running

    $ conda update -n base -c defaults conda

 ## Package Plan ##

  environment location: /home/nfs/username/.conda/envs/env

Creating a conde environment from a YAML file

Conda allows you to create environments from a YAML file that specifies the packages and their versions for the desired environment. This feature makes it easier to reproduce environments across different machines and share environment configurations with others.

$ conda env create -f environment.yml (-n new-name)

For how to create a environment.yml file see Exporting environments

Environment variables

You can set enviromnet variables to install packages and environments in other locations:

  • CONDA_PREFIX: This variable points to the active conda environment’s root directory. When an environment is active, CONDA_PREFIX contains the path to that environment’s root directory.

  • CONDA_ENVS_DIRS: This variable specifies the directories where conda environments are stored. You can set it to a list of directories (separated by colons on Unix-like systems and semicolons on Windows). Conda will search for and store environments in these directories.

  • CONDA_PKGS_DIRS: This variable specifies the directories where conda stores downloaded packages. Like CONDA_ENVS_DIRS, you can set it to a list of directories. Conda uses these directories as cache locations for package downloads and installations.

Examples

  • Set conda environments directory:
$ export CONDA_ENVS_DIRS="/tudelft.net/staff-umbrella/my-project/"

A caveat is that the /tudelft.net mounts are windows based and therefore have compatibility issues with pip. When you create your conda environments there you will not be able to use pip to install packages. It is therefore recommeneded to keep the conda environments minimal and in your home directory, and to use containerization for larger environments.

List existing environments

You can list environments with

$ conda env list

Activating environments

You can activate an existing environemnt with conda activate, for example to install more packages:

$ conda activate env  # Activate the newly created environment 

Modifying environments

Sometimes you need to add/remove/change packages and libraries in existing environments. First, activate the enviroment you want to change with conda activate and then run conda install package-name or conda remove package-name. You can also use pip to install packages inside a conda environment, but for that pip has to be installed inside the environment. To make sure pip is installed in your enviroment run conda install pip first.

(env) $ conda install pandas  # Add a new package to the active environment
Collecting package metadata (current_repodata.json): done
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.10.1
  latest version: 24.3.0

Please update conda by running

    $ conda update -n base -c defaults conda

## Package Plan ##

  environment location: /home/nfs/sdrwacker/.conda/envs/test

  added / updated specs:
    - pandas

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    blas-1.0                   |              mkl           6 KB
    bottleneck-1.3.7           |  py312ha883a20_0         140 KB
    bzip2-1.0.8                |       h5eee18b_5         262 KB
    expat-2.6.2                |       h6a678d5_0         177 KB
    intel-openmp-2023.1.0      |   hdb19cb5_46306        17.2 MB
    ld_impl_linux-64-2.38      |       h1181459_1         654 KB
    libffi-3.4.4               |       h6a678d5_0         142 KB
    libuuid-1.41.5             |       h5eee18b_0          27 KB
    mkl-2023.1.0               |   h213fc3f_46344       171.5 MB
    mkl-service-2.4.0          |  py312h5eee18b_1          66 KB
    mkl_fft-1.3.8              |  py312h5eee18b_0         204 KB
    mkl_random-1.2.4           |  py312hdb19cb5_0         284 KB
    ncurses-6.4                |       h6a678d5_0         914 KB
    numexpr-2.8.7              |  py312hf827012_0         149 KB
    numpy-1.26.4               |  py312hc5e2394_0          11 KB
    numpy-base-1.26.4          |  py312h0da6c21_0         7.7 MB
    openssl-3.0.13             |       h7f8727e_0         5.2 MB
    pandas-2.2.1               |  py312h526ad5a_0        15.4 MB
    pip-23.3.1                 |  py312h06a4308_0         2.8 MB
    python-3.12.3              |       h996f2a0_0        34.8 MB
    pytz-2023.3.post1          |  py312h06a4308_0         197 KB
    readline-8.2               |       h5eee18b_0         357 KB
    setuptools-68.2.2          |  py312h06a4308_0         1.2 MB
    six-1.16.0                 |     pyhd3eb1b0_1          18 KB
    sqlite-3.41.2              |       h5eee18b_0         1.2 MB
    tbb-2021.8.0               |       hdb19cb5_0         1.6 MB
    tk-8.6.12                  |       h1ccaba5_0         3.0 MB
    tzdata-2024a               |       h04d1e81_0         116 KB
    wheel-0.41.2               |  py312h06a4308_0         131 KB
    xz-5.4.6                   |       h5eee18b_0         651 KB
    zlib-1.2.13                |       h5eee18b_0         103 KB
    ------------------------------------------------------------
                                           Total:       266.1 MB

The following NEW packages will be INSTALLED:

  _libgcc_mutex      pkgs/main/linux-64::_libgcc_mutex-0.1-main
  _openmp_mutex      pkgs/main/linux-64::_openmp_mutex-5.1-1_gnu
  blas               pkgs/main/linux-64::blas-1.0-mkl
  bottleneck         pkgs/main/linux-64::bottleneck-1.3.7-py312ha883a20_0
  bzip2              pkgs/main/linux-64::bzip2-1.0.8-h5eee18b_5
  ca-certificates    pkgs/main/linux-64::ca-certificates-2024.3.11-h06a4308_0
  expat              pkgs/main/linux-64::expat-2.6.2-h6a678d5_0
  intel-openmp       pkgs/main/linux-64::intel-openmp-2023.1.0-hdb19cb5_46306
  ld_impl_linux-64   pkgs/main/linux-64::ld_impl_linux-64-2.38-h1181459_1
  libffi             pkgs/main/linux-64::libffi-3.4.4-h6a678d5_0
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-11.2.0-h1234567_1
  libgomp            pkgs/main/linux-64::libgomp-11.2.0-h1234567_1
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-11.2.0-h1234567_1
  libuuid            pkgs/main/linux-64::libuuid-1.41.5-h5eee18b_0
  mkl                pkgs/main/linux-64::mkl-2023.1.0-h213fc3f_46344
  mkl-service        pkgs/main/linux-64::mkl-service-2.4.0-py312h5eee18b_1
  mkl_fft            pkgs/main/linux-64::mkl_fft-1.3.8-py312h5eee18b_0
  mkl_random         pkgs/main/linux-64::mkl_random-1.2.4-py312hdb19cb5_0
  ncurses            pkgs/main/linux-64::ncurses-6.4-h6a678d5_0
  numexpr            pkgs/main/linux-64::numexpr-2.8.7-py312hf827012_0
  numpy              pkgs/main/linux-64::numpy-1.26.4-py312hc5e2394_0
  numpy-base         pkgs/main/linux-64::numpy-base-1.26.4-py312h0da6c21_0
  openssl            pkgs/main/linux-64::openssl-3.0.13-h7f8727e_0
  pandas             pkgs/main/linux-64::pandas-2.2.1-py312h526ad5a_0
  pip                pkgs/main/linux-64::pip-23.3.1-py312h06a4308_0
  python             pkgs/main/linux-64::python-3.12.3-h996f2a0_0
  python-dateutil    pkgs/main/noarch::python-dateutil-2.8.2-pyhd3eb1b0_0
  python-tzdata      pkgs/main/noarch::python-tzdata-2023.3-pyhd3eb1b0_0
  pytz               pkgs/main/linux-64::pytz-2023.3.post1-py312h06a4308_0
  readline           pkgs/main/linux-64::readline-8.2-h5eee18b_0
  setuptools         pkgs/main/linux-64::setuptools-68.2.2-py312h06a4308_0
  six                pkgs/main/noarch::six-1.16.0-pyhd3eb1b0_1
  sqlite             pkgs/main/linux-64::sqlite-3.41.2-h5eee18b_0
  tbb                pkgs/main/linux-64::tbb-2021.8.0-hdb19cb5_0
  tk                 pkgs/main/linux-64::tk-8.6.12-h1ccaba5_0
  tzdata             pkgs/main/noarch::tzdata-2024a-h04d1e81_0
  wheel              pkgs/main/linux-64::wheel-0.41.2-py312h06a4308_0
  xz                 pkgs/main/linux-64::xz-5.4.6-h5eee18b_0
  zlib               pkgs/main/linux-64::zlib-1.2.13-h5eee18b_0


Proceed ([y]/n)? y
....

Exporting environments

You can export versions of all installed packages and libaries inside a coda environment with conda env export. It is good practice to keep track of all versions that you have used for a particular experiment by exporting it into a YAML file typically called environment.yml:

$ conda env export --no-builds > environment.yml

Install your own mamba/conda

Sometimes the versions provided by module are outdated and users need their own installation of conda or mamba. A minimal version can be installed as demonstrated in the following:

$ alias install-miniforge='
    wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh \
    && bash Miniforge3-Linux-x86_64.sh -b \
    && rm -f Miniforge3-Linux-x86_64.sh \
    && eval "$($HOME/miniforge3/bin/conda shell.bash hook)" \
    && conda init \
    && conda install -n base -c conda-forge mamba'

$ cd ~ && install-miniforge

(base) $  # This shows that the 'base' environment is active.
(base) $ which python
~/miniforge3/bin/python

This will already occupy around 500MB of your home directory totalling ~20k files.

$ du -h miniforge3 --max-depth=0
486M	miniforge3

$ find miniforge3 -type f | wc -l
20719

Now, you can install your own versions of libraries and programs, or create entire environments as descibed above.

Using binaries

Some programs come as precompiled binaries or are written in a scripting language such as Perl, PHP, Python or shell script. Most of these programs don’t actually need to be “installed” since you can simply run these programs directly. In certain scenarios, you may need to make the program executable first using chmod +x:

$ ./my-executable        # attempting to run the binary `my-executable`
-bash: ./my-executable: Permission denied

$ chmod +x program       # making `my-executable` executable, since it fails due to permissions

$ ./my-executable        # checking `my-executable` works!
Hello world!

Installing from source

When a pre-made binary of your software is not available, you’ll have to install the software yourself from the source. You may need to set up your Installation environment before following this Installation recipe.

Installation environment

When you are installing software for the very first time, you need to set up your environment. If you have already done this before , you can skip this section and go directly to the Installation recipe section.

To set up your environment, first, add the following lines to your ~/.bash_profile or, alternatively, download this (bash_profile.txt) as shown in the subsequent commands:

bash_profile.txt

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
   . ~/.bashrc
fi

# User specific environment and startup settings
export PREFIX="$HOME/.local"
export ACLOCAL_PATH="$PREFIX/share/aclocal${ACLOCAL_PATH:+:$ACLOCAL_PATH}"
export CPATH="$PREFIX/include${CPATH:+:$CPATH}"
export LD_LIBRARY_PATH="$PREFIX/lib64:$PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LIBRARY_PATH="$PREFIX/lib64:$PREFIX/lib${LIBRARY_PATH:+:$LIBRARY_PATH}"
export MANPATH="$PREFIX/share/man${MANPATH:+:$MANPATH}"
export PATH="$HOME/bin:$PREFIX/bin:$PATH"
export PERL5LIB="$PREFIX/lib64/perl5:$PREFIX/share/perl5${PERL5LIB:+:$PERL5LIB}"
export PKG_CONFIG_PATH="$PREFIX/lib64/pkgconfig:$PREFIX/share/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"
export PYTHONPATH="$PREFIX/lib/python2.7/site-packages${PYTHONPATH:+:$PYTHONPATH}"

$ cp ~/.bash_profile ~/.bash_profile.bak # back up your file
$ curl -s https://wiki.tudelft.nl/pub/Research/InsyCluster/InstallingSoftware/bash_profile.txt >> ~/.bash_profile # download and append the lines above

Then, clean up any duplicate settings, and:

$ source ~/.bash_profile
$ mkdir -p "$PREFIX"

The line export PREFIX="$HOME/.local" sets your software installation directory to /home/nfs/<YourNetID>/.local (which is the default and accessible on all nodes). This is in your personal home directory where you have a space quota of 8GB. However, for software for your research project, you should instead use a project share, for example:

export PREFIX="/tudelft.net/staff-umbrella/project/software"

The other variables will let you use your self-installed programs. You are now ready to install your software!

Installation recipe

Software installation usually just requires you to follow the general installation recipe described below, but you always need to consult the documentation for your software.

  1. Place the source of the software in a folder under /tmp:
$ mkdir /tmp/$USER
$ cd /tmp/$USER

You can sometimes download the software directly from the internet:

$ wget http://host/path/software.tar.gz
$ tar -xzf software.tar.gz

Or, clone the software from a git repository:

$ git clone https://github.com/software

Then:

$ cd software
  1. If the software provides a configure script, run it:
$ ./configure --prefix="$PREFIX" 

If configure complains about missing software, you’ll either have to install that software, tell configure where it is (--with-feature _path_=) or disable the feature (--disable-feature).

If your software provides a CMakeLists.txt file, run cmake (note: the trailing two dots on the last line are needed exactly as shown):

$ mkdir -p build $ cd build $ cmake -DCMAKE_INSTALL_PREFIX="$PREFIX" .. 

Again, if cmake complains about missing software, you’ll either have to install that software or tell cmake where it is (-DCMAKE_SYSTEM_PREFIX_PATH="/usr/local;/usr;$PREFIX;path").

If neither is provided, consult the documentation for dependencies and configuration (specifically for the installation directory).

There is no point in continuing until all reported problems have been fixed.

  1. Compile the software:
$ make 

If compilation is aborted due to an error, Google the error for possible solutions. Again, there is no point in continuing until all reported problems have been fixed.

  1. Install the software. When you used configure or cmake, you can simply run:
$ make install 

When you used neither, you need to use:

$ make prefix="$PREFIX" install 
  1. Your software should now be ready to use, so check it:
$ cd $ _program_ 
  1. When the program works, clean up /tmp/netid:
$ rm -r /tmp/$USER