Skip to content

Slurm

SLURM (Simple Linux Utility for Resource Management) is a cluster management and job scheduling system used in high-performance computing (HPC) clusters. This manual covers the basic commands for submitting and managing jobs in our SLURM environment.

Basic Commands

  • sinfo: Displays cluster nodes status
  • sbatch: Run a job in batch mode
  • srun: Executes an interactive job or a job step within a script/sbatch job
  • squeue: Shows active and queued jobs
  • scancel: Cancel a job
  • sacct: Shows job history

Accounting

When you launch jobs with Slurm (sbatch, salloc, srun) they will be charged to your default billing account aka Slurm account. You can check which Slurm account your project is associated to like this:

sacctmgr show user $USER withassoc format=User,Account%30,DefaultAccount%30

Example output
[user@c3.uc3m.es ~]# sacctmgr show user $USER withassoc format=User,Account%30,DefaultAccount%30
    User                        Account                       Def Acct 
---------- ------------------------------ ------------------------------ 
user                   cuentadepruebas                cuentadepruebas 

Common pitfalls

If the name of the Slurm account is too long it could be displayed with a + sign at the end, like this:

[user@c3.uc3m.es ~]$ sacctmgr show user $USER withassoc format=User,Account,DefaultAccount
      User    Account   Def Acct 
---------- ---------- ---------- 
   user    cuentadep+ cuentadep+ 
In this example, cuentadep+ is not a valid Slurm account! If you try to launch jobs with cuentadep+ you will get an error. The correct account is cuentadepruebas (the full name).

To ensure that you get the full account name, specify a longer length for the account, e.g. 40 characters (Account%40) until you don’t see a + sign at the end:

sacctmgr show user $USER withassoc format=User,Account%40
sacctmgr show user $USER withassoc format=User,Account%<number of characters>

Multiple Slurm Accounts

If you are part of more than one project you will have one user login with access to multiple Slurm accounts. Let us take Bob for an example: he is a researcher working on 2 different projects, Project-Apples and Project-Oranges. Bob has a single user login (bob) with access to 2 Slurm accounts: project_apples and project_oranges. When Bob works on Project-Apples he should use the corresponding project_apples Slurm account, like this:

srun -A project_apples --pty bash

When you launch jobs please make sure to assign them to the correct Slurm account with the -A or --account options.

Ease of use

If you only have one Slurm account we recommend that you add these lines at the end of your ~/.bashrc. Every time you log in this will export an environment variable that contains your account name, so that you can reference it easily. Just replace <slurm_account> with your full Slurm account name.

# set up Slurm Account
export SLURM_BILLING_ACCOUNT=<slurm_account>

Billing for Exclusive Jobs

If you submit jobs with the --exclusive flag, you will be billed for all node resources (all CPU cores and all GPUs), even if you specify a limited number of cores/GPUs.

Launch Jobs

There are several ways to launch jobs in Slurm. Here we cover the two main approaches.

Interactive Jobs

Interactive mode is useful for tests, development or if you just need to work directly on a compute node.

Example 1. Simple interactive session: Requests default resources and opens a bash shell in the allocated compute node

srun --pty bash

Example 2. Specify resources: This requests 1 node, 4 processes (cores), 1 hour runtime and 32GB of RAM

srun --pty --nodes=1 --ntasks=4 --time=01:00:00 --mem=32G bash
For more advanced customization please check the official Slurm documentation for srun.

Batch Jobs

Ideal for production workloads or long jobs. We have a script that contains all job instructions.

Example sbatch script (my_job.sbatch).

#!/bin/bash

#SBATCH --job-name=my_job
#SBATCH --output=output_%j.out
#SBATCH --error=error_%j.err
#SBATCH --ntasks=4
#SBATCH --time=02:00:00
#SBATCH --mem=32G

# load modules (optional)
module load python/3.10

# run the program
python my_script.py

For more advanced customization please check the official Slurm documentation for sbatch.

Monitor Jobs

Display all running/queued jobs

squeue

Display only your jobs

squeue -u $USER

Check job details

scontrol show job <job_id>

Cancel job

scancel <job_id>

List job history

sacct -u $USER

Additional Documentation

If you need help with something that is not covered in this guide please check the official Slurm Documentation.