Here’s a simplified workflow for booking a node.
Booking a node can be suitable if you want to test and verify your code in a parallel environment, or when the code is not time-consuming but needs frequent adjustment.
Compute nodes can be booked from the queue system for interactive use. This means that you can run your program much as you would run it in a terminal on a local computer.
Booking an interactive node can be useful when you want to test, verify or debug your code in a parallel environment. It’s also suitable when the program is not time-consuming but needs frequent adjustment. For a large-scale program we recommend Queueing jobs instead, since an interactive booking with a long run time can wait a long time in the queue.
The command to book an interactive node is salloc:
salloc --nodes=2 -t 1:00:00 -A <project>
The command above books two nodes for one hour. The <project> given to -A should be a time allocation that you are a member of. The -A flag must be specified; otherwise the command fails with an error:
Job submit/allocate failed: Invalid partition or qos specification
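One way to avoid this error is to assemble the salloc command in a small helper that refuses to continue when no project is given. The following is only a sketch: the function name build_salloc and the project name my-project are hypothetical, and the helper prints the command rather than running it, so it can be tried without access to the queue system.

```shell
# Hypothetical helper: print a salloc command line, or fail if the
# project (the value for -A) is missing. Nothing is submitted.
build_salloc() {
    project="${1:-}"
    nodes="${2:-1}"            # assumed fallback of 1 node for this sketch
    walltime="${3:-1:00:00}"   # assumed fallback of one hour for this sketch
    if [ -z "$project" ]; then
        echo "error: no project given; salloc needs -A <project>" >&2
        return 1
    fi
    echo "salloc --nodes=$nodes -t $walltime -A $project"
}

# "my-project" is a placeholder, not a real time allocation.
build_salloc my-project 2 1:00:00
```

Once the printed line looks right, it can be copied to the prompt and run as is.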
You can check the time allocations you are a member of with
Depending on how busy the supercomputer is, it might take a while before the interactive node is booked. The terminal will appear to hang while the request waits in the queue. When a node is ready you will receive a message like
salloc: Granted job allocation <jobid>
When a node is booked, the program must be run with cluster-specific commands. The commands for our current clusters at PDC are detailed below.
On some of our clusters you can also log in directly to the interactive compute node from a separate window. After logging in you can run your software as usual. The name of the booked node can be found using
If you’re running a specific software, please see Accessing software.
Keep in mind that after salloc you are still on the login node! If you execute a program without the cluster-specific commands, the program will run on the login node.
Keep in mind that the node stays booked until you close the terminal in which you typed salloc, type the exit command, or the time runs out.
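To see whether the current shell belongs to a booking, you can look at the SLURM_JOB_ID environment variable, which salloc sets in the shell it starts. The sketch below wraps that check in a function; the name allocation_status is made up for this example.

```shell
# Hypothetical helper: report whether this shell was started by salloc.
# Even when an allocation is active, this shell still runs on the login
# node, so programs must be launched with the cluster-specific commands.
allocation_status() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        echo "allocation ${SLURM_JOB_ID} active"
    else
        echo "no allocation active"
    fi
}

allocation_status
```

If the function reports an active allocation that you no longer need, typing exit releases the node.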
Running on Beskow
By default, salloc on Beskow books 4 nodes for 1 hour.
The maximum duration for a job on Beskow is one day. Any value for -t greater than 24:00:00 will be rejected.
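The one-day limit can be checked before a booking is requested. The sketch below converts a run time to seconds and compares it with 24 hours. The function names are hypothetical, and only the hours:minutes:seconds form used on this page is handled, not the other time formats that -t accepts.

```shell
# Hypothetical helper: convert H:MM:SS to seconds (only this format).
walltime_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Succeeds when the run time is at most one day (86400 seconds).
within_beskow_limit() {
    [ "$(walltime_seconds "$1")" -le 86400 ]
}

for t in 1:00:00 24:00:00 30:00:00; do
    if within_beskow_limit "$t"; then
        echo "$t ok"
    else
        echo "$t exceeds the one-day limit"
    fi
done
```

Using awk for the arithmetic avoids the problem that shell arithmetic can misread fields with leading zeros, such as 08, as octal numbers.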
On Beskow a job can be started using srun. Commands like aprun and mpirun are not available on the system.
srun -n 64 ./program
Check the manual page of srun for more details about flags and options with
Running on Tegner
By default, salloc on Tegner books 1 node for (loosely speaking) an infinite amount of time. Since the queueing system prioritises jobs with shorter run times, this usually means that you will never get an interactive node. Therefore always specify the
On Tegner, jobs can be started using mpirun.
module add intelmpi/5.0.3
mpirun -np 48 ./program
On this cluster you can also log in directly to the compute node when running interactively, as described above.