Frequently Asked Questions
Getting Started on Arctur-2
This tutorial will guide you through your first steps on Arctur-2.
Before proceeding:
- make sure you have an account (if not, follow this procedure), and an SSH client.
- ensure you operate from a Linux / Mac environment. Most commands below assume you are running them in a Terminal in this context. If you're running Windows, you can use PuTTY and similar tools, yet it's probably better to familiarize yourself "natively" with a Linux-based environment by setting up a Linux Virtual Machine (consider VirtualBox for that).
Arctur-2 system overview.
Firstly, we will take a quick look how Arctur-2 is organized and go over the general specifications.
Currently, we have two different partitions:
- compute
- gpu
The 'compute' partition is made up of 14 nodes, named node01 to node14. This is the default partition, and your jobs will run on it unless you specify otherwise.
Each of these nodes has 2 Intel Xeon E5-2690v4 processors, together providing 28 cores clocked at 2.60 GHz. Every node also has 512GB of fast DDR4 RAM and 480GB of local SSD storage.
The 'gpu' partition consists of 8 nodes (gpu01 to gpu08). The only difference from the compute nodes is that each of them also has 4 NVIDIA Tesla M60 GPUs. The Tesla M60 is a very powerful card: it is made up of two physical NVIDIA Maxwell GPUs with a combined 16GB of memory. Many applications perceive the card as two separate GPUs, so you effectively have 8 GPUs per node.
Your home folder is located on a shared NFS filesystem, backed by SSD-cached HDDs. The default limit is 100GB per user, but if you need more, feel free to contact support.
All the nodes and the filesystem are interconnected with a 2x25GbE connection.
SLURM basics
Before we dive into SLURM, it would be good to consult the official quickstart guide: https://slurm.schedmd.com/quickstart.html
We are going to explain and show how to use SLURM and some of our custom tools.
SLURM is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters. It is used on Arctur-2.
- It allocates exclusive access to the resources (compute nodes) to users during a job or reservation so that they can perform their work
- It provides a framework for starting, executing and monitoring work
- It arbitrates contention for resources by managing a queue of pending work
- It permits scheduling jobs for users on the cluster resources
Commonly used SLURM commands
sacct is used to report job or job step accounting information about active or completed jobs.
salloc is used to allocate resources for a job in real time. Typically this is used to allocate resources and spawn a shell. The shell is then used to execute srun commands to launch parallel tasks.
srun is used to run a parallel job; it will also first create a resource allocation if necessary.
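For example, a minimal salloc-based workflow might look like this (the options shown are only illustrative and assume a free node is available):
$ salloc -N 1     # allocate one node and spawn a shell
$ srun hostname   # launch a task on the allocated node
$ exit            # leave the shell and release the allocation
$ sacct           # afterwards, show accounting information for your recent jobs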
There are two types of jobs:
- interactive: you get a shell on the first reserved node
- passive: a classical batch job where the script passed as an argument to sbatch is executed
We will now go through the basic SLURM commands.
Connect to Arctur-2. You can request resources in interactive mode like this:
$ srun --pty bash
You should now be directly connected to the node you reserved, with an interactive shell. Keep in mind that only you have access to the node, and that it is billed for as long as your job is running. Now exit the reservation:
$ exit # or CTRL-D
When you run exit, you are disconnected and your reservation is terminated (billing stops).
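For illustration, a complete interactive session could look something like this (the node name in the output is just an example):
$ srun --pty bash
$ hostname
node05
$ exit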
Currently, there are no time limits enforced on reservations or jobs.
To run a passive job, use srun or sbatch.
One example is the following:
$ srun -N2 hostname
The command 'hostname' prints the hostname of the host it is running on, so running it on two nodes shows you which nodes were allocated to you; this should be a very short job.
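The output should simply be the two hostnames, something along these lines (the actual names depend on which nodes SLURM allocates to you):
node03
node04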
Be sure to check out all the optional arguments srun can take by typing 'man srun' or by looking at the official documentation at https://slurm.schedmd.com/srun.html
Using srun like this prints the job output in your terminal session, and you can't really do anything else in that session until the job is done. A better approach for submitting jobs is to use sbatch.
The command sbatch takes a batch script as an argument and submits the job. In the script you specify all options such as the partition you want resources from, the number of nodes and similar.
An example of a simple batch script which runs a command (for example the command 'hostname') on 2 compute nodes is this:
#!/bin/bash -l
#SBATCH --account=EXAMPLE
#SBATCH --partition=compute
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1    # can be up to 24
#SBATCH --time=00:20:00
#SBATCH --job-name=my_job

srun hostname
Using your favourite text editor, save this as myjob.sh and use sbatch to run it:
$ sbatch myjob.sh
You will get a job ID back from sbatch, which you can use to control your job (we cover this later). Unless specified otherwise, the output will be stored in a text file in the same folder as the script.
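A full submission round-trip might then look like this (the job ID and the node names in the output are just placeholders; by default SLURM names the output file slurm-<jobID>.out):
$ sbatch myjob.sh
Submitted batch job 12345
$ cat slurm-12345.out
node01
node02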
Running jobs on different partitions
Just define the partition name in the appropriate place in the job submission script like this:
#SBATCH --partition=gpu
As shown before, available partitions are: compute and gpu.
In order to run a job on the GPUs, you also need to add the following line, specifying the number of GPUs that you will need:
#SBATCH --gres=gpu:8
This will give you an allocation of 8 GPUs, the maximum number available per node.
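Putting the pieces together, a minimal GPU job script could look like the sketch below (the account name is the same placeholder as above, and nvidia-smi is used purely as a quick test command):
#!/bin/bash -l
#SBATCH --account=EXAMPLE
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --gres=gpu:8
#SBATCH --time=00:20:00
#SBATCH --job-name=my_gpu_job

srun nvidia-smi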
Job management
To check the state of the cluster (idle and allocated nodes) run:
$ sinfo
This is useful to see the state of the resources and how many are immediately available to you. All idle nodes are ready for use. If you need more nodes than are currently available (because other jobs are running on the system), just submit your job and it will wait in the queue until the requested resources become available.
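For reference, sinfo output looks roughly like this (node counts and states are of course illustrative; the asterisk marks the default partition):
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
compute*     up   infinite     10   idle node[01-10]
compute*     up   infinite      4  alloc node[11-14]
gpu          up   infinite      8   idle gpu[01-08]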
Sometimes we also run our own internal low-priority jobs on the cluster. They run in a low-priority queue and will be suspended when you start your jobs. Unfortunately, with 'sinfo' you won't be able to determine how many nodes are running low-priority jobs. For that, we have developed another tool called 'savail'. Try it and check whether there are any nodes running low-priority jobs:
$ savail
You can check the status of your (and only your) running jobs using the squeue command:
$ squeue
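Example output (the job ID, user and node names are placeholders):
JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
12345   compute   my_job    user1  R       5:23      2 node[01-02]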
Then you can delete your job by running the command:
$ scancel JOBID
You can see the system-level utilization (memory, I/O, energy) of a *running* job using:
$ sstat -j JOBID
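For example, to look at the CPU and memory usage of a running job (12345 is a placeholder job ID; the fields after --format are standard sstat fields):
$ sstat -j 12345 --format=JobID,AveCPU,MaxRSS,MaxDiskWrite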
In all remaining reservation examples in this section, remember to delete the reserved jobs afterwards (using scancel or CTRL-C).
Pausing and resuming jobs
To stop a waiting job from being scheduled and later to allow it to be scheduled:
$ scontrol hold JOBID
$ scontrol release JOBID
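For example, to hold a pending job and release it again later (12345 is a placeholder job ID; while the job is held, squeue should list it with a reason such as JobHeldUser):
$ scontrol hold 12345
$ scontrol release 12345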
To pause a running job and then resume it:
$ scontrol suspend JOBID
$ scontrol resume JOBID
For obvious reasons, non-root users only have a subset of all SLURM commands available to them (most commands will still work, but will only display data for the user running them).