Run Programs
Table of Contents
- Introduction
- Modules - Managing your environment
- Running a serial program
- Running an MPI program on interactive nodes
- Running an MPI program on dedicated nodes
- Finding more information
- Batch script files
Introduction
This document describes how to compile and run both a serial and a parallel program on Lucidor, the IPF (Itanium Processor Family) cluster at PDC. Lucidor can be viewed as a cluster of workstations, with the difference that the network connecting the individual workstations is significantly faster than on a normal workstation cluster.
When you connect to the cluster, you are connected to one of the available login nodes (workstations). On this node you can compile your programs, prepare input files, and carry out all the other tasks that can be done on any kind of Unix workstation. Note, however, that the login node is shared among all users logged into the system at the time, so if one user starts a compute-intensive and/or I/O-intensive program, everyone on the login node suffers from the load. Hence there may be limits on memoryuse and cputime set on this node. (To find the current settings, use the command limit if your shell is tcsh, or the command ulimit -a if your shell is bash.) The name of the login node is blumino.pdc.kth.se.
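As a small illustration of checking the limits mentioned above, the following sketch reports the CPU-time limit of the current shell. It assumes a POSIX-style shell with the ulimit builtin (bash qualifies); the function name report_cpu_limit is made up for this example and is not a PDC command.

```shell
# Print the CPU-time limit of the current shell in a readable form.
# report_cpu_limit is a hypothetical helper name, not part of the PDC tools.
report_cpu_limit() {
    rcl_limit=$(ulimit -t)          # CPU time in seconds, or "unlimited"
    if [ "$rcl_limit" = "unlimited" ]; then
        echo "cputime: unlimited"
    else
        echo "cputime: $rcl_limit seconds"
    fi
}

report_cpu_limit
```

Running ulimit -a, as described above, lists all limits at once; the helper only singles out the one most likely to stop a long-running test on the login node.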
Before running a program on Lucidor you have to allocate a "resource" in the system to run on, i.e. you need to "book" one or more nodes. You can book two types of resources:
- Interactive nodes
- Interactive nodes are shared in much the same way as the login node. At any one moment more than one user can be running on each interactive node, and when you allocate several interactive nodes to run a parallel program you will find that several instances of your job run on each interactive node. That is, the interactive nodes are used as pseudo-nodes: a five-way parallel job run on interactive nodes may actually run as three instances on interactive node no. 1 and two instances on interactive node no. 2. Interactive nodes are intended for short tests.
- Dedicated nodes
- Dedicated nodes, or batch nodes, are used as unique nodes in the parallel execution. On a dedicated node you are guaranteed to be the only user. Commonly you will run one or two instances of your parallel program on each dedicated node you have allocated, which makes it possible to use 100% of the computing power of each node for your program. (There are 4 CPUs per node. Note, however, that they share the same memory bus.)
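The pseudo-node idea can be made concrete with a little arithmetic. The sketch below spreads a number of job instances as evenly as possible over a number of interactive nodes, which reproduces the "3 instances on node 1, 2 instances on node 2" example for a five-way job. The even spread is an assumption made for illustration; how the system actually places instances is not specified here, and distribute_procs is a hypothetical name.

```shell
# distribute_procs NPROCS NNODES
# Hypothetical helper: spread NPROCS instances as evenly as possible
# over NNODES nodes, giving the first nodes one extra instance each.
distribute_procs() {
    dp_nprocs=$1
    dp_nnodes=$2
    dp_base=$(( dp_nprocs / dp_nnodes ))    # instances every node gets
    dp_extra=$(( dp_nprocs % dp_nnodes ))   # nodes that get one more
    dp_node=1
    while [ "$dp_node" -le "$dp_nnodes" ]; do
        if [ "$dp_node" -le "$dp_extra" ]; then
            echo "node $dp_node: $(( dp_base + 1 ))"
        else
            echo "node $dp_node: $dp_base"
        fi
        dp_node=$(( dp_node + 1 ))
    done
}

distribute_procs 5 2
# prints:
# node 1: 3
# node 2: 2
```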
Modules - Managing your environment
Software is added to your executable path through the use of modules.
- module add i-compilers
- This module adds the current version of the Intel compilers to your executable path. You need to have valid tickets when loading this module.
- module add mpi
- This module adds the current default MPI implementation to your executable path. As of 2008-06-16, it is mpichmx/1.2.7..5-intel.
- module add easy
- This module adds the commands necessary to interact with the queuing system to your executable path.
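If you want these modules available in every session, one option is to load them from a login file. The fragment below is a minimal sketch of that idea, assuming a Bourne-style login file and that the module command is defined as a shell function or program at that point; the wrapper name load_pdc_modules is invented for this example.

```shell
# Hypothetical login-file fragment: load the three modules described
# above, but only if the module command is actually available.
load_pdc_modules() {
    if command -v module >/dev/null 2>&1; then
        module add i-compilers mpi easy
    else
        echo "module command not found; skipping module setup" >&2
        return 1
    fi
}

# Do not abort the login file when modules are unavailable.
load_pdc_modules || true
```

Guarding the call keeps the same login file usable on machines where the module system is not installed. Remember that i-compilers needs valid tickets at load time.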
Running a serial (non MPI) program
A very simple serial Fortran 90 example is the code in example1.f. As stated above, programs should normally not be run on the login node but rather on one of the interactive nodes.
Running a serial program on interactive nodes
- To compile and execute the program on an interactive node do the following:
- > module add i-compilers
> ifort -FR -o example1 -O2 example1.f
> ./example1
Number of iterations: 14000 Result: 13994.0905473407
- To find the names of the available interactive nodes do
- > module add easy
> spusage | grep interactive
Running a serial program on dedicated nodes
- To compile and submit the program do the following on the login node:
- > module add i-compilers easy
> ifort -FR -o example1 -O2 example1.f
(Note that if you do not have module add i-compilers in a suitable login file, you need to submit a script that first loads the module i-compilers and then runs the program.)
> esubmit -n1 -t5 ./example1
> spq
Q JID          USER STATE CAC     RESOURCE TIME
- 061609400454 ulfa run   -       1A       2003-06-16 21:40:00
1 061612293392 ulfa wait  ta.ulfa 1A       0h05
Note that the job does not appear in the queue immediately. It takes a couple of seconds. The output of a program submitted this way appears in an e-mail. You can monitor the state of your jobs with
- > watch -n10 spq -u $USER
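Instead of watching the whole queue listing, you may want to pick out just the jobs in a given state. The sketch below filters spq-style output read from standard input; it assumes the column layout of the sample listing above (header line first, JID in column 2, STATE in column 4), which may differ from what your version of spq prints. The function name jobs_in_state is hypothetical.

```shell
# jobs_in_state STATE
# Hypothetical helper: read spq-style output on stdin and print the JID
# of every job whose STATE column equals the given state.
# Assumes: one header line, JID in column 2, STATE in column 4.
jobs_in_state() {
    awk -v state="$1" 'NR > 1 && $4 == state { print $2 }'
}

# Demonstration on the sample listing from this document.
jobs_in_state wait <<'EOF'
Q JID          USER STATE CAC     RESOURCE TIME
- 061609400454 ulfa run   -       1A       2003-06-16 21:40:00
1 061612293392 ulfa wait  ta.ulfa 1A       0h05
EOF
# prints: 061612293392
```

On the cluster the same filter could be fed from the real command, for example spq -u $USER | jobs_in_state wait.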
Running an MPI program on interactive nodes
Running an MPI program using spattach -i
The example1 code used above is easily parallelized. An example is shown in example2.f.
- To compile the program, connect to 3 virtual nodes and execute the program do:
- > klist -Tf
Credentials cache: FILE:/tmp/krb5cc_22557
Principal: smeds@NADA.KTH.SE
Issued           Expires          Flags  Principal
Aug 17 17:48:48  Aug 18 03:48:48  FI     krbtgt/NADA.KTH.SE@NADA.KTH.SE
[...]
Check that you have the "F" flag set above. If not, execute the command kinit -f.
> module add i-compilers mpi easy
> spattach -i -p3
> mpif90 -FR -o example2 -O2 example2.f
> mpirun -nolocal -np $SP_PROCS -machinefile $SP_HOSTFILE ./example2
Host number: 0 (h01n07-e.pdc.kth.se) Number of iterations: 4668 Result: 4615.8663
Host number: 2 (h01n07-e.pdc.kth.se) Number of iterations: 4666 Result: 4613.1696
Host number: 1 (h01n07-e.pdc.kth.se) Number of iterations: 4666 Result: 4613.1696
--------------------------------
Host number: 0 Total number of iterations: 14000 Result: 13842.2056029682
> exit
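The iteration counts in the output above (4668, 4666 and 4666, adding up to 14000) are consistent with an even split in which process 0 also takes the remainder. The sketch below reproduces that arithmetic. It is inferred from the printed numbers only, not from the source of example2.f, so treat the rule as an assumption; iterations_for_rank is a hypothetical name.

```shell
# iterations_for_rank TOTAL NPROCS RANK
# Hypothetical helper: even split of TOTAL iterations over NPROCS
# processes, with the remainder assigned to rank 0 (assumed rule).
iterations_for_rank() {
    ifr_total=$1
    ifr_nprocs=$2
    ifr_rank=$3
    ifr_base=$(( ifr_total / ifr_nprocs ))
    ifr_rest=$(( ifr_total % ifr_nprocs ))
    if [ "$ifr_rank" -eq 0 ]; then
        echo $(( ifr_base + ifr_rest ))
    else
        echo "$ifr_base"
    fi
}

for rank in 0 1 2; do
    echo "rank $rank: $(iterations_for_rank 14000 3 "$rank")"
done
# prints:
# rank 0: 4668
# rank 1: 4666
# rank 2: 4666
```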
Batch script files
An example of a batch script file for an MPICH program is /afs/pdc.kth.se/misc/pdc/mpich/mpich.lxl, used above. The file is located in AFS on PDC systems. This script spawns num MPI processes per node. The name of the MPI program and its arguments are given on the submit line.
- Submit by
- > esubmit -n3 -t15 -c MyUserCAC ./mpich.lxl [-p num] mpiprogram [arg1 arg2 ...]
For most programs this script will be enough. However, in some cases you may need to make your own expanded version of it. If you do, we recommend that you test it on the interactive nodes first. To test a run on N nodes with num processes per node, do:
- > spattach -i -p N
> ./my_mpich.lxl [-p num] mpiprogram [arg1 arg2 ...]
If it does what you want you may then finally do a dedicated attach (spattach -pN) and test your script as shown below. Once you get the prompt back from spattach:
- Check the JID of your session with spq -u $USER
- Log in on the first of the nodes you have been allotted.
- Do an spattach -j JID
- Execute your script and see if it seems to do the right thing for your application.
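If you write your own version of the script, it has to accept the same command line as shown earlier: an optional -p num followed by the MPI program and its arguments. The sketch below shows one way to parse that command line with getopts and report what would be started, without launching anything. It is not the contents of mpich.lxl; the function name lxl_dry_run and the default of one process per node are assumptions made for this example.

```shell
# lxl_dry_run [-p num] mpiprogram [arg1 arg2 ...]
# Hypothetical dry-run sketch: parse the command line of a script such as
# my_mpich.lxl and print what would be started. Nothing is launched.
lxl_dry_run() {
    ldr_num=1                       # assumed default: one process per node
    OPTIND=1                        # reset so the function can be reused
    while getopts "p:" ldr_opt; do
        case "$ldr_opt" in
            p) ldr_num=$OPTARG ;;
            *) echo "usage: lxl_dry_run [-p num] mpiprogram [args]" >&2
               return 2 ;;
        esac
    done
    shift $(( OPTIND - 1 ))         # drop the parsed options
    if [ "$#" -lt 1 ]; then
        echo "usage: lxl_dry_run [-p num] mpiprogram [args]" >&2
        return 2
    fi
    echo "processes per node: $ldr_num"
    echo "program and arguments: $*"
}

lxl_dry_run -p 2 ./example2 arg1 arg2
# prints:
# processes per node: 2
# program and arguments: ./example2 arg1 arg2
```

Because getopts stops at the first non-option word, any options that follow the program name are passed through to the MPI program untouched.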

