First GPU Job

Submit your first GPU job on DAIC.

Before you begin

Complete First Job first; it covers the batch job basics that this page builds on.

Submit a GPU job

1. Create a test script

gpu_test.py

import torch

print(f"CUDA available: {torch.cuda.is_available()}")
print(f"GPU count: {torch.cuda.device_count()}")
if torch.cuda.is_available():
    print(f"GPU name: {torch.cuda.get_device_name(0)}")
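If PyTorch reports no GPU, it can help to check what the scheduler exposed to the job before suspecting the PyTorch installation. The sketch below reads the CUDA_VISIBLE_DEVICES environment variable, which Slurm typically sets for jobs that request GPUs through --gres. Whether DAIC sets it is an assumption here, and the file name check_visible_gpus.py and the function visible_gpus are hypothetical names used only for illustration.

```python
# check_visible_gpus.py (hypothetical file name)
import os


def visible_gpus(env=None):
    """Return the GPU identifiers listed in CUDA_VISIBLE_DEVICES.

    Assumption: the scheduler exports CUDA_VISIBLE_DEVICES for GPU jobs.
    An unset or empty variable yields an empty list.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    # Entries are comma-separated; drop blanks left by stray commas.
    return [item.strip() for item in raw.split(",") if item.strip()]


if __name__ == "__main__":
    gpus = visible_gpus()
    print(f"Visible GPUs: {', '.join(gpus) if gpus else 'none'}")
```

An empty result on a login node is expected, since no GPU has been allocated there. Inside a GPU job, an empty result suggests the GPU request did not reach the scheduler.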

2. Create the batch script

gpu_job.sh

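#!/bin/bash
#SBATCH --account=<your-account>   # replace with your own account
#SBATCH --partition=all
#SBATCH --time=0:10:00             # wall-time limit: 10 minutes
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --gres=gpu:1               # one GPU of any type
#SBATCH --output=gpu_%j.out        # %j expands to the job ID

# Start from a clean environment, then load the GPU stack and PyTorch
module purge
module load 2025/gpu
module load py-torch/2.5.1

srun python gpu_test.py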

3. Submit and check output

sbatch gpu_job.sh
> Submitted batch job 301

cat gpu_301.out
> CUDA available: True
> GPU count: 1
> GPU name: NVIDIA L40
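When submission is scripted, the output file name can be derived from the line that sbatch prints. The sketch below assumes the "Submitted batch job <id>" message shown above and the gpu_%j.out pattern from the batch script, where %j stands for the job ID. The function name output_file_for is hypothetical.

```python
import re


def output_file_for(sbatch_stdout, pattern="gpu_%j.out"):
    """Derive the job's output file name from sbatch's submission message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_stdout)
    if match is None:
        raise ValueError(f"no job ID found in: {sbatch_stdout!r}")
    # Substitute the job ID for %j, mirroring the --output pattern.
    return pattern.replace("%j", match.group(1))


if __name__ == "__main__":
    print(output_file_for("Submitted batch job 301"))  # gpu_301.out
```

The file only appears once the job has started running, so a missing file right after submission usually means the job is still queued.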

Request specific GPU types

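To request a specific GPU type, add the type name between gpu and the count in --gres. Use one of the following lines in your batch script in place of --gres=gpu:1: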

#SBATCH --gres=gpu:l40:1      # NVIDIA L40
#SBATCH --gres=gpu:a40:1      # NVIDIA A40
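To confirm that a job received the GPU type it asked for, the requested type can be compared with the device name that gpu_test.py prints. This is a minimal sketch that assumes the type name used in --gres appears as a separate word in the device name, as in the "NVIDIA L40" output above. The function name matches_requested_type is hypothetical, and the matching rule may need adjusting for other GPU models.

```python
def matches_requested_type(gpu_name, requested):
    """Return True if the requested type is a whole word in the device name.

    Assumption: names look like "NVIDIA L40", so "l40" matches "NVIDIA L40"
    but not a differently named model such as "NVIDIA L40S".
    """
    words = gpu_name.lower().split()
    return requested.lower() in words


if __name__ == "__main__":
    print(matches_requested_type("NVIDIA L40", "l40"))  # True
    print(matches_requested_type("NVIDIA A40", "l40"))  # False
```

Whole-word matching is used instead of a substring test so that similarly named models are not confused with each other.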

Next steps

  • Use Containers for custom GPU environments
  • Learn about Modules for software management