
Run Programs

Compile and run programs on Ferlin.

Different types of nodes

When you connect to the cluster you are placed on one of the available login nodes. On a login node you can compile programs, prepare input files and perform all the other tasks that could be done on any ordinary Unix workstation.

Note, however, that the login node is shared among all users logged into the system at the time, so if one user starts a compute- or I/O-intensive program, every user on the login node will suffer from the load. For this reason limits on memory use and CPU time may be set on this node. To find the current settings, use the command limit if your shell is tcsh, or ulimit -a if your shell is bash. The name of the login node is ferlin.pdc.kth.se.
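For example, to display the current limits in your own shell:

ulimit -a   # bash: shows the limits, e.g. on memory use and cpu time
limit       # tcsh: the equivalent command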

Before running a program on Ferlin you have to allocate a "resource" to run on, i.e. you need to "book" one or more nodes in the system. You can book two types of resources:

Interactive nodes

Interactive nodes are shared in much the same way as the login node. This means that at any one moment more than one user may be running on each interactive node, and when you allocate several interactive nodes for a parallel program you will find that several instances of your job run on each node; in other words, the interactive nodes are used as pseudo-nodes. A five-way parallel job run on interactive nodes may, for example, end up as three instances on interactive node no. 1 and two instances on interactive node no. 2. Interactive nodes are intended for short tests.

Dedicated nodes

Dedicated nodes, or batch nodes, are used as unique nodes in a parallel execution. On a dedicated node you are guaranteed to be the only user. Commonly you will run up to 8 instances of your parallel program on each dedicated node you have allocated, which makes it possible to devote 100% of the computing power of each node to your program. This is the type of node you should use for your production jobs.

Modules

Software is added to your executable path through the use of modules.

module add i-compilers
This module adds the default version of the Intel compilers to your executable path. You need to have valid Kerberos tickets when loading this module.
module add mpi
This module adds the default MPI implementation to your executable path.
module add easy
This module adds the commands necessary to interact with the queuing system to your executable path.
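For example, a typical session for MPI work might load all three modules at once and then verify what is loaded (module list and module avail are standard commands of the module system and are assumed to be available here):

module add i-compilers mpi easy
module list    # show the modules currently loaded in this shell
module avail   # list all modules installed on the system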

Running a serial (non-MPI) program

A very simple serial Fortran90 example is the example1.f code. As stated above, programs should normally not be run on the login node but on one of the interactive or dedicated nodes.

Running a serial program on interactive nodes

To find the names of the available interactive nodes do

module add easy
spusage | grep interactive 

on the login node.

To compile and execute the program on an interactive node, log in to the node and do the following:

module add i-compilers
ifort -FR -o example1 -O2 example1.f
./example1
 Number of iterations:       14000   Result:    13989.2242459837

The option -FR tells the compiler that the file is free-format Fortran90. By default the compiler assumes that files with the extension .f are fixed format and that files with the extension .f90 are free format.
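An alternative to passing -FR is to give the source file the extension .f90, which the compiler recognizes as free format automatically:

cp example1.f example1.f90
ifort -o example1 -O2 example1.f90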

Running a serial program on dedicated nodes

To compile and submit the program do the following on the login node:

module add i-compilers easy
ifort -FR -o example1 -O2 example1.f
esubmit -n1 -t5 ./example1
spq -u $USER
  Q          JID  USER     STATE    CAC           RESOURCE TIME               
  - 072113320718  jrydh    run      staff.debug         1A 2008-07-21/15:38:08

Note that the job does not appear in the queue immediately; it takes a couple of seconds. The output of a program submitted this way is sent in two e-mails to your PDC username @pdc.kth.se: the first tells you that you have been allocated a node, and the second contains the output of the job. While waiting for your submitted job to start, you can monitor its state with the command:

watch -n10 spq -u $USER

You will see a list of your jobs. The list will automatically refresh every 10 seconds. Stop monitoring your jobs by pressing Control-C on the keyboard.

 

Running an MPI program on interactive nodes

Running an MPI program using spattach -i

The example1 code used above is easily parallelized; an example is shown in example2.f. First, check that you have forwardable Kerberos tickets:

module add heimdal
klist -Tf
Credentials cache: FILE:/tmp/krb5cc_21641
        Principal: username@NADA.KTH.SE
  Issued           Expires        Flags    Principal
Jul 21 15:20:56  Jul 23 11:13:10  Ff     krbtgt/NADA.KTH.SE@NADA.KTH.SE
[...]

Check that the "F" (forwardable) flag is set in the output above; see the Kerberos instructions for details.
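If the flag is missing you can obtain a new forwardable ticket before attaching. A minimal sketch, assuming the kinit provided by the heimdal module accepts -f for forwardable tickets; username@NADA.KTH.SE is only a placeholder for your own principal:

kinit -f username@NADA.KTH.SE   # request a forwardable ticket
klist -Tf                       # confirm that the F flag is now present

Compile the program, connect to 3 virtual nodes and execute the program: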

module add i-compilers mpi easy
spattach -i -p3
mpif77 -FR -o example2 -O2 example2.f
mpirun -np $SP_PROCS -machinefile $SP_HOSTFILE ./example2
 Host number:           0  (a02c31n11.pdc.kth.se)  Number of iterations: 4666   Result: 4644.16774464921
 Host number:           2  (a02c31n03.pdc.kth.se)  Number of iterations: 4666   Result: 4644.16774464921
 Host number:           1  (a02c31n14.pdc.kth.se)  Number of iterations: 4668   Result: 4646.47446958455
 --------------------------------
 Host number:           0                 Total number of iterations: 14000  Result: 13934.8099588830
exit

Note: the actual output may be somewhat more garbled. The result differs slightly from the serial example since the sequence of random numbers is different in the two cases.

 

Running an MPI program on dedicated nodes

Running an MPI program using spattach

The procedure for a non-interactive spattach is almost identical to the one described above for an interactive (shared) spattach. The main differences are the following:

  • You are guaranteed to be the only user of the nodes.
  • You will have to wait for your nodes since dedicated nodes are allocated via the queuing system (EASY).
  • The node hours allocated will be charged to a CAC (time allocation). The time measured is wall-clock time. If you belong to only one CAC you need not specify it. To see which CACs you belong to, run cac members $USER (see the example after this list).
  • You must specify the time period for which you will use the nodes. This is required for the queuing system to be able to find a time slot in the machine that fits your request.
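Before attaching it can be convenient to check your allocations and the current node availability. A small sketch using commands mentioned in this document (spfree is listed among the EASY commands further down and is assumed here to report free nodes):

cac members $USER   # list the CACs (time allocations) you belong to
spfree              # show currently free nodes (assumed behaviour)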

As with an interactive spattach, you need to have the i-compilers module loaded when running MPI programs.

To connect to 3 dedicated nodes and execute the program do:

module add mpi easy
spattach -p3 -t15 -c MyUserCAC
mpif77 -FR -o example2 -O2 example2.f
mpirun -np $SP_PROCS -machinefile $SP_HOSTFILE ./example2
 Host number:           0  (a02c31n13.pdc.kth.se)  Number of iterations: 4668   Result:    4646.47446958455
 Host number:           1  (a02c31n14.pdc.kth.se)  Number of iterations: 4666   Result:    4644.16774464921
 Host number:           2  (a03c01n03.pdc.kth.se)  Number of iterations: 4666   Result:    4644.16774464921
 --------------------------------
 Host number:           0   Total number of iterations:       14000   Result: 13934.8099588830

An alternative to the above is to use a script file to start the MPI program. A small example script, myjob.sh, could look like this:

#!/bin/bash
# Start an MPI program on the nodes allocated by EASY.
module add mpi
# Number of MPI processes to start on each allocated node.
processes_per_node=8
# SP_PROCS is set to the number of allocated nodes; compute the total process count.
total_processes=`expr $processes_per_node \* $SP_PROCS`
# The first argument is the program to run; any remaining arguments are passed to it.
PRG="$1"
shift
ARGS="$*"
mpirun -np $total_processes -machinefile $SP_HOSTFILE $PRG $ARGS

You then run your job with:

./myjob.sh ./example2
 Host number:           6  (a02c31n03.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:           1  (a02c31n11.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:          10  (a02c31n11.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:          19  (a02c31n11.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:           5  (a02c31n14.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:           8  (a02c31n14.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:          17  (a02c31n14.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:          23  (a02c31n14.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:          20  (a02c31n14.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:           0  (a02c31n03.pdc.kth.se)  Number of iterations:         591   Result:    611.256109745599
 Host number:          14  (a02c31n14.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:           9  (a02c31n03.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:          11  (a02c31n14.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:          21  (a02c31n03.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:           7  (a02c31n11.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:          13  (a02c31n11.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:          22  (a02c31n11.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:           4  (a02c31n11.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:          12  (a02c31n03.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:          18  (a02c31n03.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:          15  (a02c31n03.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:           3  (a02c31n03.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:           2  (a02c31n14.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 Host number:          16  (a02c31n11.pdc.kth.se)  Number of iterations:         583   Result:    602.779363737555
 --------------------------------
 Host number:           0   Total number of iterations:       14000   Result:    14475.1814757094
exit

As you can see, the random sequence is the same on all nodes except the master node. In a real-life application this is usually not what you want; typically you either generate the whole sequence on one node or use a well-behaved parallel random number generator.

An important note about the procedure above is that the shell opened by spattach runs on the node where you executed spattach, which is typically the login node. The environment in this shell is, however, set up so that parallel (MPI) programs execute on the nodes that have been allocated to you.
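A quick way to see this for yourself inside an spattach session is to compare where an ordinary command runs with where an MPI-launched command runs. A small sketch, using hostname simply as a convenient test command:

hostname                                                  # prints the name of the login node
mpirun -np $SP_PROCS -machinefile $SP_HOSTFILE hostname   # prints the names of the allocated nodes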

In the example above we used all 8 cores on each node. Be aware that the memory bandwidth of a node can be saturated even when far fewer cores are used.
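If your program is limited by memory bandwidth it may therefore be worth testing with fewer processes per node. A sketch following the same pattern as myjob.sh above, assuming that the machinefile lists each allocated node once so that mpirun distributes the processes over the nodes:

mpirun -np `expr 4 \* $SP_PROCS` -machinefile $SP_HOSTFILE ./example2   # only 4 processes per node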

It is possible to reserve dedicated nodes in advance for interactive use:

esubmit -n6 -t5 -T 2008-07-21/15:26:20
spattach -j JID

Note that this only guarantees that you will not receive the nodes before the requested time. Whether you receive them at the requested time or later depends on your place in the queue.

Running an MPI program in batch mode

The difference between an spattach onto dedicated nodes and an esubmit is that an esubmit job never reads any keyboard input; instead, all its input (if there is any) is assumed to come from a batch script. In our simple example there is no input at all, and running the program in batch mode is as easy as the following example.

To run the example program in batch mode:

module add mpi easy
mpif90 -FR -o example2 -O2 example2.f
esubmit -n3 -t15 -c MyUserCAC ./myjob.sh ./example2
Wait for about 20-30 seconds
spq -u $USER

You must edit your script myjob.sh to make sure that the number of processes per node is correct, since the example script above has no option for controlling it. Ferlin has two sockets per node with 4 cores each, so it is not recommended to run more than eight processes per node. Change myjob.sh to fit your problem!
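One way to avoid editing the script for every run is to let it read the per-node process count from an environment variable, with 8 as the default. This is only a sketch of such a variant; the variable name PROCS_PER_NODE and the file name myjob2.sh are made up for this example:

#!/bin/bash
# myjob2.sh -- like myjob.sh, but the number of processes per node is
# taken from the environment variable PROCS_PER_NODE (default 8).
module add mpi
processes_per_node=${PROCS_PER_NODE:-8}
total_processes=`expr $processes_per_node \* $SP_PROCS`
PRG="$1"
shift
ARGS="$*"
mpirun -np $total_processes -machinefile $SP_HOSTFILE $PRG $ARGS

Within an spattach session you could then run, for example, PROCS_PER_NODE=4 ./myjob2.sh ./example2 to use only four processes per node.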

 

You will receive two e-mails from the EASY scheduling system for each job: one when your program starts and one at the end of your run, containing the output of your execution. It is vital that you have arranged for e-mail forwarding to your home institution so that you actually see these e-mails (see the Arranging for e-mail forwarding document for instructions).

It is also important that, at the time you submit your job, you have forwardable Kerberos tickets valid for the entire lifetime of the job (i.e. queuing time + execution time); see the Kerberos instructions. The major differences between using esubmit and a dedicated spattach are:

  • The programs run with esubmit will be spawned from the first host among your allocated hosts and not from the login node.
  • Programs run with esubmit have no interactive user: they cannot read input from a terminal and the like. The program does, however, have access to the STDIN device, so input can be supplied from a file via the submitted script (see the sketch after this list).
  • When using esubmit the queuing system will automatically launch your submitted script (or program). You will not need to be present to start your analysis as you need to with spattach.
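For example, a submitted script could redirect a data file to the program's standard input. A minimal sketch; the names run_with_input.sh, input.dat and ./myprogram are made up for this illustration:

#!/bin/bash
# run_with_input.sh -- start an MPI program with its input read from input.dat
module add mpi
total_processes=`expr 8 \* $SP_PROCS`
mpirun -np $total_processes -machinefile $SP_HOSTFILE ./myprogram < input.dat

Submit it in the same way as myjob.sh, e.g. esubmit -n3 -t15 -c MyUserCAC ./run_with_input.sh.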

More information about EASY

More information on using the commands to interact with the queuing system EASY is available on-line:

spattach -h
esubmit -h 
esubmit -h -v
spq -h
spstatus -h
sprelease -h
spfree -h
spsummary -h
spjobsummary -h

Batch script files

Script usable for spattach -i, spattach and esubmit

Often many things need to be done before actually executing your program. Perhaps you want to specify the communication procedure, set up input files, or move files into scratch directories before your program starts. Likewise, you may want to do some clean-up after your program has finished.

This is done with a batch script file that is submitted to EASY, the scheduling system. Submit it with:

esubmit -n3 -t15 -c MyUserCAC ./myjob.sh ./someprogram [arg1 arg2 ...]

For most programs this script will be enough. In some cases, however, you may need to write your own expanded version of it. If you do, we recommend that you first test it on the interactive nodes:

spattach -i -p N
./myjob.sh mpiprogram [arg1 arg2 ...]

If it does what you want, exit and then do a dedicated attach (spattach -pN) and test your script as described below. Once you get the prompt back from spattach:

  1. Check the JID of your session with spq -u $USER.
  2. Log in to the first of the nodes you have been allotted.
  3. Do an spattach -j JID there.
  4. Execute your script and check that it does the right thing for your application.
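Put together, the test procedure might look like this; JID is a placeholder for the job id reported by spq:

spq -u $USER              # note the JID of your dedicated session
# log in to the first of the allocated nodes, then on that node:
spattach -j JID           # attach to your session, with JID taken from spq above
./myjob.sh ./someprogram  # run your script and check the result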