Job submission

How to submit jobs to slurm?

Slurm job’s terminology: job, job step, task and CPUs

A Slurm job (submitted via sbatch) can consist of multiple steps that run in series. Each step (started via srun) can run multiple tasks (i.e. programs) in parallel, and each task gets its own set of CPUs. As an example, consider the workflow and corresponding breakdown shown in fig 2.

Slurm job’s terminology

In this example, note:

  • When you explicitly request 1 CPU per task (--cpus-per-task=1), you should also explicitly specify the number of tasks (--ntasks). Otherwise, srun may start the task twice in parallel, because CPUs are allocated in multiples of 2.
  • The default Slurm allocation is a single task on a single CPU (i.e. --ntasks=1 --cpus-per-task=1). Thus, it is not necessary to request these explicitly to run a single task on a single CPU.
  • When using multiple tasks, request memory per CPU with --mem-per-cpu.
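
The relationship between a job, its steps and its tasks can be sketched as a small batch script. The snippet below is a minimal illustration, not the workflow from the figure: it writes a two-step job script to a hypothetical file named job.sbatch. The file name, the memory and time values, and the commands run in each step are assumptions chosen for the example.

```shell
#!/bin/sh
# Write an illustrative two-step job script to job.sbatch (hypothetical file name).
cat > job.sbatch <<'EOF'
#!/bin/sh
#SBATCH --ntasks=2            # the job may run up to 2 tasks at once
#SBATCH --cpus-per-task=1     # each task gets 1 CPU
#SBATCH --mem-per-cpu=1G      # memory per allocated CPU (illustrative value)
#SBATCH --time=00:05:00       # illustrative time limit

# Step 1: runs 2 tasks in parallel (task count taken from the job allocation)
srun hostname
# Step 2: starts only after step 1 has finished, and runs a single task
srun --ntasks=1 echo "second step done"
EOF
echo "wrote job.sbatch; submit it with: sbatch job.sbatch"
```

Running the snippet only creates the file; nothing is submitted until you call sbatch on it. Each srun line in the generated script is one job step, and the steps run one after the other because the script executes them in order.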

Priorities and waiting times

How to submit jobs to slurm?

Quality of Service (QoS)

How to submit jobs to slurm?

Partitions

How are partitions managed on DAIC?

Interactive jobs

How to submit jobs to slurm?

Submitting jobs

How to submit jobs to slurm?

Monitoring jobs

How to cancel/stop scheduled or running jobs?

Cancelling jobs

How to cancel/stop scheduled or running jobs?

Using graphic cards

How to use the compute node’s GPU?

Job arrays

How to submit jobs to slurm?

Job chains

How to submit jobs to slurm?

Reservations

How to submit jobs to slurm?

Kerberos

How to submit jobs to slurm?