Python environments

Managing Python packages and environments on DAIC.

What you’ll learn

By the end of this tutorial, you’ll be able to:

  • Choose the right tool for your Python workflow
  • Create reproducible project environments with UV
  • Use Pixi for conda-forge packages
  • Set up global environments with Micromamba
  • Run Python jobs on the cluster
  • Troubleshoot common environment issues

Time: About 45 minutes

Prerequisites: Complete Bash Basics and Slurm Basics first.


Why environment management matters

On your laptop, you might install packages globally with pip install. This works until:

  • Project A needs torch 2.0 but Project B needs torch 1.13
  • You upgrade a package and break an old project
  • You can’t reproduce your results because you forgot which versions you used

On DAIC, these problems are amplified:

  • Quota limits: Your home directory is only 5 MB
  • Shared system: You can’t install packages system-wide
  • Reproducibility: Research requires knowing exactly what versions you used
  • Collaboration: Others need to run your code with the same dependencies

Environment management tools solve these problems by isolating each project’s dependencies.
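When you are unsure which versions a project is actually using, Python's standard library can list them. A minimal stdlib sketch (the helper name is ours; `importlib.metadata` itself is standard since Python 3.8):

```python
# List installed distributions visible to the current interpreter.
from importlib.metadata import distributions

def installed_versions(limit=5):
    """Return {name: version} for up to `limit` installed packages."""
    pkgs = {}
    for dist in distributions():
        pkgs[dist.metadata["Name"]] = dist.version
        if len(pkgs) >= limit:
            break
    return pkgs

if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}=={version}")
```

Recording this output alongside your results is the manual version of what the lockfile tools below do automatically.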

The tools

DAIC supports several Python environment tools. Here’s when to use each:

Tool         Best for                 Key feature
UV           Most projects            Fast, lockfiles, reproducible
Pixi         Conda-forge packages     Conda ecosystem, project-based
Micromamba   Shared environments      Traditional conda workflow
Modules      Pre-installed packages   Zero setup

This tutorial covers all four, starting with UV (recommended for most users).


Part 1: UV - The modern Python workflow

UV is a fast Python package manager written in Rust. It replaces pip, virtualenv, and pip-tools with a single tool that’s 10-100x faster.

Why UV?

  • Speed: Installs packages in seconds, not minutes
  • Lockfiles: uv.lock records exact versions for reproducibility
  • Project-based: Each project has its own isolated environment
  • No activation needed: uv run handles everything

Installing UV

First, ensure your shell is configured for DAIC storage (see Shell Setup):

$ curl -LsSf https://astral.sh/uv/install.sh | sh

Restart your shell or run:

$ source ~/.bashrc

Verify the installation:

$ uv --version
uv 0.6.x

Creating a project

Navigate to your project storage and create a new project:

$ cd /tudelft.net/staff-umbrella/<project>
$ uv init ml-experiment
$ cd ml-experiment
$ ls
README.md  hello.py  pyproject.toml

UV created three files:

  • pyproject.toml: Project metadata and dependencies
  • hello.py: A sample Python file
  • README.md: Project documentation

Look at the project configuration:

$ cat pyproject.toml
[project]
name = "ml-experiment"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.12"
dependencies = []

Adding dependencies

Add packages with uv add:

$ uv add torch numpy pandas matplotlib
Resolved 15 packages in 234ms
Installed 15 packages in 1.2s
 + numpy==2.2.1
 + pandas==2.2.3
 + torch==2.5.1
 ...

UV automatically:

  1. Creates a virtual environment in .venv/
  2. Installs packages
  3. Updates pyproject.toml
  4. Generates uv.lock with exact versions

Check what was added:

$ cat pyproject.toml
[project]
...
dependencies = [
    "matplotlib>=3.10.0",
    "numpy>=2.2.1",
    "pandas>=2.2.3",
    "torch>=2.5.1",
]

The uv.lock file contains exact versions and hashes for reproducibility:

$ head -20 uv.lock
version = 1
revision = 2
requires-python = ">=3.12"

[[package]]
name = "numpy"
version = "2.2.1"
source = { registry = "https://pypi.org/simple" }
...

Running code

Use uv run to execute Python code:

$ uv run python -c "import torch; print(torch.__version__)"
2.5.1

Create a training script:

$ cat > train.py << 'EOF'
import torch
import numpy as np

print(f"PyTorch version: {torch.__version__}")
print(f"NumPy version: {np.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

# Simple computation
x = torch.randn(1000, 1000)
y = torch.matmul(x, x.T)
print(f"Matrix multiplication result shape: {y.shape}")
EOF

$ uv run python train.py
PyTorch version: 2.5.1
NumPy version: 2.2.1
CUDA available: False
Matrix multiplication result shape: torch.Size([1000, 1000])

Using UV in Slurm jobs

Create a batch script that uses your UV project:

#!/bin/bash
#SBATCH --account=<your-account>
#SBATCH --partition=all
#SBATCH --time=1:00:00
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --gres=gpu:1
#SBATCH --output=train_%j.out

module purge
module load 2025/gpu cuda/12.9

cd /tudelft.net/staff-umbrella/<project>/ml-experiment

echo "Starting training at $(date)"
srun uv run python train.py
echo "Finished at $(date)"

Submit it:

$ sbatch train_job.sh
Submitted batch job 12345

Installing PyTorch with CUDA

For GPU support, specify the PyTorch index:

$ uv add torch --index https://download.pytorch.org/whl/cu124

Or add it to pyproject.toml:

[tool.uv]
index-url = "https://download.pytorch.org/whl/cu124"

Installing CLI tools

UV can install command-line tools globally (independent of projects):

$ uv tool install ruff
$ uv tool install black
$ uv tool install jupyter

$ ruff --version
ruff 0.9.1

$ uv tool list
black v24.10.0
jupyter v1.0.0
ruff v0.9.1

Syncing on another machine

When you clone a project with UV, restore the exact environment:

$ git clone <repo-url>
$ cd ml-experiment
$ uv sync
Resolved 15 packages in 12ms
Installed 15 packages in 0.8s

The lockfile ensures you get the exact same versions.

Exercise 1: Create a UV project

  1. Create a new UV project called data-analysis
  2. Add pandas, scikit-learn, and matplotlib
  3. Create a script that loads a sample dataset and prints its shape
  4. Run it with uv run

Part 2: Pixi - When you need conda packages

Pixi is a fast, project-based package manager compatible with conda-forge. Use it when:

  • You need packages only available on conda-forge (not PyPI)
  • You need non-Python dependencies (CUDA, compilers, system libraries)
  • You’re working with conda-based toolchains

Installing Pixi

$ curl -fsSL https://pixi.sh/install.sh | sh
$ source ~/.bashrc

$ pixi --version
pixi 0.40.x

Creating a Pixi project

$ cd /tudelft.net/staff-umbrella/<project>
$ pixi init bioinformatics-project
$ cd bioinformatics-project
$ ls
pixi.toml

Adding packages

Add packages from conda-forge:

$ pixi add python=3.11 numpy pandas
$ pixi add biopython samtools  # packages not on PyPI

Check the configuration:

$ cat pixi.toml
[project]
name = "bioinformatics-project"
channels = ["conda-forge"]
platforms = ["linux-64"]

[dependencies]
python = "3.11.*"
numpy = "*"
pandas = "*"
biopython = "*"
samtools = "*"

Running commands

$ pixi run python -c "import Bio; print(Bio.__version__)"
1.84

$ pixi run samtools --version
samtools 1.21

Activating the environment

For interactive work, activate the environment:

$ pixi shell
(bioinformatics-project) $ python
>>> import numpy as np
>>> np.__version__
'2.2.1'
>>> exit()
(bioinformatics-project) $ exit
$

Using Pixi in Slurm jobs

#!/bin/bash
#SBATCH --account=<your-account>
#SBATCH --partition=all
#SBATCH --time=2:00:00
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --output=analysis_%j.out

module purge
module load 2025/gpu cuda/12.9

cd /tudelft.net/staff-umbrella/<project>/bioinformatics-project

srun pixi run python analyze.py

Adding PyPI packages

Pixi can also install from PyPI:

$ pixi add --pypi transformers

Exercise 2: Create a Pixi project

  1. Create a Pixi project for genomics analysis
  2. Add python, biopython, and matplotlib
  3. Verify biopython is installed with pixi run python -c "from Bio import SeqIO"

Part 3: Micromamba - Global conda environments

Micromamba is a lightweight, standalone conda implementation. Use it when you need:

  • Traditional conda workflows
  • Environments shared across multiple projects
  • Compatibility with existing conda scripts

Installing Micromamba

$ "${SHELL}" <(curl -L micro.mamba.pm/install.sh)

When prompted for the installation location, use project storage:

Micromamba binary folder: /tudelft.net/staff-umbrella/<project>/micromamba/bin

Configure where environments are stored (so they land on project storage, not in your tiny home directory):

$ micromamba config append envs_dirs /tudelft.net/staff-umbrella/<project>/micromamba/envs

Creating environments

$ micromamba create -n pytorch-env python=3.11 pytorch numpy -c conda-forge -c pytorch
$ micromamba activate pytorch-env

(pytorch-env) $ python -c "import torch; print(torch.__version__)"
2.5.1

Managing environments

$ micromamba env list
  Name        Active  Path
  pytorch-env    *    /tudelft.net/.../micromamba/envs/pytorch-env

$ micromamba deactivate

Installing additional packages

$ micromamba activate pytorch-env
(pytorch-env) $ micromamba install pandas scikit-learn -c conda-forge

Using Micromamba in Slurm jobs

#!/bin/bash
#SBATCH --account=<your-account>
#SBATCH --partition=all
#SBATCH --time=4:00:00
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --gres=gpu:1
#SBATCH --output=train_%j.out

module purge
module load 2025/gpu cuda/12.9

# Initialize micromamba for this shell
eval "$(micromamba shell hook --shell bash)"
micromamba activate pytorch-env

cd /tudelft.net/staff-umbrella/<project>/ml-experiment
srun python train.py

Exporting environments

Share your environment with collaborators:

$ micromamba activate pytorch-env
(pytorch-env) $ micromamba env export > environment.yml

Recreate it elsewhere:

$ micromamba create -f environment.yml

Exercise 3: Create a Micromamba environment

  1. Create an environment called sci-env with Python 3.11, numpy, and scipy
  2. Activate it and verify scipy is installed
  3. Export the environment to environment.yml

Part 4: Using modules for pre-installed packages

DAIC provides pre-installed Python packages through the module system. This is the fastest way to get started if the packages you need are available.

Finding available packages

$ module avail py-

---------------------- /cm/shared/modulefiles/2025/cpu ----------------------
py-numpy/1.26.4    py-scikit-learn/1.5.2    py-pandas/2.2.3
py-torch/2.5.1     py-tensorflow/2.18.0     ...

Loading packages

$ module load 2025/gpu
$ module load py-torch/2.5.1
$ module load py-numpy/1.26.4

$ python -c "import torch; print(torch.__version__)"
2.5.1

Combining modules with virtual environments

Use modules as a base and add extra packages:

$ module load 2025/gpu
$ module load py-torch/2.5.1

$ python -m venv /tudelft.net/staff-umbrella/<project>/venvs/custom-env --system-site-packages
$ source /tudelft.net/staff-umbrella/<project>/venvs/custom-env/bin/activate

(custom-env) $ pip install transformers  # adds to module packages
(custom-env) $ python -c "import torch, transformers; print('Both work!')"
Both work!

The --system-site-packages flag gives access to module-installed packages.
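You can verify from inside Python whether you are in a virtual environment and where its packages come from. A small stdlib sketch (the `sys.prefix != sys.base_prefix` check is the standard way to detect a venv; the helper name is ours):

```python
import sys
import sysconfig

def environment_report():
    """Report whether we're inside a venv and where packages are resolved."""
    return {
        "in_venv": sys.prefix != sys.base_prefix,  # True inside any venv
        "venv_prefix": sys.prefix,
        "base_prefix": sys.base_prefix,            # interpreter the venv wraps
        "site_packages": sysconfig.get_paths()["purelib"],
    }

if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

With `--system-site-packages`, imports that miss the venv's own `site_packages` fall through to the base interpreter's, which is how the module-provided torch stays visible.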


Part 5: Real-world ML workflow

Let’s put it all together with a realistic machine learning workflow.

Project structure

ml-project/
├── pyproject.toml      # UV project config
├── uv.lock             # Locked dependencies
├── src/
│   └── train.py        # Training script
├── configs/
│   └── config.yaml     # Hyperparameters
├── jobs/
│   └── train.sh        # Slurm script
└── outputs/            # Results (gitignored)

Create the project

$ cd /tudelft.net/staff-umbrella/<project>
$ uv init ml-project
$ cd ml-project
$ mkdir -p src configs jobs outputs

Add dependencies

$ uv add torch torchvision --index https://download.pytorch.org/whl/cu124
$ uv add numpy pandas matplotlib pyyaml tqdm

Training script

$ cat > src/train.py << 'EOF'
#!/usr/bin/env python3
"""Simple training script demonstrating UV + Slurm workflow."""

import os
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from tqdm import tqdm

def main():
    # Check environment
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    print(f"Job ID: {os.environ.get('SLURM_JOB_ID', 'local')}")
    print(f"Device: {device}")

    if torch.cuda.is_available():
        print(f"GPU: {torch.cuda.get_device_name(0)}")

    # Simple synthetic data
    X = torch.randn(1000, 10)
    y = torch.randn(1000, 1)
    dataset = TensorDataset(X, y)
    loader = DataLoader(dataset, batch_size=32, shuffle=True)

    # Simple model
    model = nn.Sequential(
        nn.Linear(10, 64),
        nn.ReLU(),
        nn.Linear(64, 1)
    ).to(device)

    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    criterion = nn.MSELoss()

    # Training loop
    epochs = 10
    for epoch in range(epochs):
        total_loss = 0
        for batch_X, batch_y in tqdm(loader, desc=f"Epoch {epoch+1}/{epochs}"):
            batch_X, batch_y = batch_X.to(device), batch_y.to(device)

            optimizer.zero_grad()
            pred = model(batch_X)
            loss = criterion(pred, batch_y)
            loss.backward()
            optimizer.step()

            total_loss += loss.item()

        print(f"Epoch {epoch+1}, Loss: {total_loss/len(loader):.4f}")

    # Save model
    os.makedirs('outputs', exist_ok=True)
    torch.save(model.state_dict(), 'outputs/model.pt')
    print("Model saved to outputs/model.pt")

if __name__ == '__main__':
    main()
EOF

Slurm job script

$ cat > jobs/train.sh << 'EOF'
#!/bin/bash
#SBATCH --account=<your-account>
#SBATCH --partition=all
#SBATCH --time=1:00:00
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --gres=gpu:1
#SBATCH --output=outputs/train_%j.out
#SBATCH --error=outputs/train_%j.err

# Clean environment
module purge
module load 2025/gpu cuda/12.9

# Navigate to project
cd /tudelft.net/staff-umbrella/<project>/ml-project

echo "=========================================="
echo "Job started: $(date)"
echo "Job ID: $SLURM_JOB_ID"
echo "Node: $(hostname)"
echo "=========================================="

# Run training
srun uv run python src/train.py

echo "=========================================="
echo "Job finished: $(date)"
echo "=========================================="
EOF

Test locally, then submit

# Quick test on login node (CPU only)
$ uv run python src/train.py

# Submit to cluster for GPU training
$ sbatch jobs/train.sh
Submitted batch job 12345

# Monitor
$ squeue -u $USER
$ tail -f outputs/train_12345.out

Exercise 4: Complete ML workflow

  1. Create the project structure above
  2. Modify the training script to save loss history to a CSV file
  3. Submit a job and verify the output files are created
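For step 2, one way to record the loss history is with the standard csv module. A sketch with placeholder loss values (in the real script you would append the per-epoch averages computed in the training loop):

```python
import csv
import os

def save_loss_history(losses, path="outputs/loss_history.csv"):
    """Write one (epoch, loss) row per epoch to a CSV file."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["epoch", "loss"])
        for epoch, loss in enumerate(losses, start=1):
            writer.writerow([epoch, f"{loss:.4f}"])

# Placeholder values; a real run collects these during training.
save_loss_history([0.92, 0.41, 0.27])
```

Writing outputs under `outputs/` keeps them in the gitignored directory from the project structure above.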

Troubleshooting

“No space left on device”

Your home directory is full (5 MB limit).

Solution: Move caches to project storage. Add to ~/.bashrc:

export UV_CACHE_DIR=/tudelft.net/staff-umbrella/<project>/.cache/uv
export PIXI_HOME=/tudelft.net/staff-umbrella/<project>/.pixi

“Module not found” in Slurm job

The package works locally but fails in the job.

Causes:

  1. Forgot to use uv run or activate environment
  2. Different working directory
  3. Missing module load

Solution: Always use absolute paths and uv run:

cd /tudelft.net/staff-umbrella/<project>/ml-project
srun uv run python src/train.py

CUDA version mismatch

PyTorch can’t find CUDA or wrong version.

Solution: Match PyTorch CUDA version to the host driver. Check driver version:

$ nvidia-smi | grep "Driver Version"
Driver Version: 550.54.15    CUDA Version: 12.4

Then install matching PyTorch:

$ uv add torch --index https://download.pytorch.org/whl/cu124  # for CUDA 12.4

Slow package installation

Package resolution takes forever.

Cause: Network issues, PyPI server problems, or a corrupted local cache.

Solution: UV and Pixi are already far faster than pip and conda, so persistent slowness usually points to the network or the cache. Try bypassing the cache:

$ uv add package --no-cache  # Skip the cache in case it is corrupted

Environment not reproducible

Different results on different machines.

Solution: Always commit lockfiles:

$ git add uv.lock pyproject.toml  # For UV
$ git add pixi.lock pixi.toml     # For Pixi

Exercise 5: Restore from lockfile

  1. Create a UV project and add packages
  2. Delete .venv/ to simulate a fresh clone
  3. Run uv sync to restore the exact environment
  4. Verify packages work

Summary

You’ve learned to manage Python environments on DAIC:

Tool         When to use              Key commands
UV           Most projects            uv init, uv add, uv run
Pixi         Conda-forge packages     pixi init, pixi add, pixi run
Micromamba   Global environments      micromamba create, micromamba activate
Modules      Pre-installed packages   module load py-torch/2.5.1

Key takeaways

  1. Use UV for most projects - it’s fast and handles lockfiles automatically
  2. Store everything in project storage - never in /home (5 MB limit)
  3. Commit lockfiles - uv.lock or pixi.lock for reproducibility
  4. Test locally before submitting - catch errors early
  5. Match CUDA versions - module CUDA version must match PyTorch build

Quick reference

# UV workflow
$ uv init myproject && cd myproject
$ uv add torch numpy pandas
$ uv run python train.py

# Pixi workflow
$ pixi init myproject && cd myproject
$ pixi add python pytorch numpy
$ pixi run python train.py

# Micromamba workflow
$ micromamba create -n myenv python=3.11 pytorch
$ micromamba activate myenv
$ python train.py

Next steps