Job Submission
Queuing system
On Lindgren we use Moab as the scheduler and Torque as the resource manager (together often called the batch system). The modules needed for Torque should be loaded automatically at login.
Commands
- Documentation
man pbs
man qstat
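and similarly for other commands.
- Submit a job
qsub ./jobscript.pbs
- Delete a job
qdel jobid
- Kill a running application launched with aprun (by its application ID)
apkill 1268102
- Queue status
qstat -u userName, qstat -Qf, showq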
These commands give information on the current status of the queue and basic information on the jobs in the queue.
showbf
This command shows the nodes that are currently free.
xtnodestat, apstat
These are Cray-specific commands that show how many nodes each job is using and where each job is running within the Cray.
checkjob <job id number>
This gives detailed information on a job, including which time allocation it is charged to.
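The submit and inspect commands above are often combined in a small script. The following is a minimal sketch, assuming that qsub prints a job identifier of the form number.server (the usual Torque behaviour, but check the output on your system). The helper name job_number and the sample identifier are hypothetical.

```shell
# Hypothetical helper: keep only the numeric part of a Torque job identifier.
# Assumes the identifier looks like "<number>.<server>".
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Sample identifier for illustration only; on the system you would use:
#   jobid=$(qsub ./jobscript.pbs)
jobid="123456.sdb"

# Print the follow-up commands rather than running them, so the sketch
# can be tried on any machine.
echo "checkjob $(job_number "$jobid")"
echo "qdel $jobid"
```

The commands are echoed instead of executed so that nothing is submitted or deleted by accident while testing the script.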
Queue priority
Learn more about how jobs are prioritized at Lindgren and about the projinfo command, which shows how much of your time allocations on Lindgren you have used.
Running interactively
There are eight compute nodes available for interactive use. The login node should NEVER be used to run interactive jobs; always run them on the interactive nodes.
Please run only processes of at most one hour in duration on the interactive nodes. For longer jobs, use the dedicated nodes.
Note that MPI jobs will fail if you try to run them on the login node, with an error like:
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(408): Initialization failed
MPID_Init(123).......: channel initialization failed
MPID_Init(461).......: PMI2 init failed: 1
To launch a job on an interactive node use:
aprun
The aprun command accepts the -t option, which limits the time your job may run (given in seconds), e.g.
aprun -n 24 -t 3600 program.x
which limits the program to 3600 seconds (1 hour) of CPU time.
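Because -t takes seconds while time limits are usually written as hh:mm:ss, a small conversion helper can be convenient. This is only a sketch: the function name to_seconds is hypothetical, and program.x is the placeholder from the example above.

```shell
# Hypothetical helper: convert hh:mm:ss to seconds for use with aprun -t.
to_seconds() {
    # awk treats the fields as decimal numbers, so "08" is read as 8.
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Print the resulting command instead of running it.
echo "aprun -n 24 -t $(to_seconds 01:00:00) program.x"
```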
Try the command:
xtnodestat | grep inter
to find the number of available interactive nodes.
The number of free nodes is shown in a line such as:
Available compute nodes: 6 interactive, 52 batch
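If you want to use this number in a script, the summary line can be parsed. The sketch below assumes the line has exactly the form shown above; the function name free_interactive_nodes is hypothetical.

```shell
# Hypothetical helper: read xtnodestat output on stdin and print the
# number of free interactive nodes. Assumes a summary line of the form
# "Available compute nodes: 6 interactive, 52 batch".
free_interactive_nodes() {
    sed -n 's/.*Available compute nodes: *\([0-9][0-9]*\) interactive.*/\1/p'
}

# Sample line for illustration; on the system you would use:
#   xtnodestat | free_interactive_nodes
sample="Available compute nodes: 6 interactive, 52 batch"
printf '%s\n' "$sample" | free_interactive_nodes
```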
If there are not enough free interactive nodes to fulfill your request, your job will be held in a simple queue. The aprun command only returns when the job has completed. To see the status of your interactive job you can use the command:
apstat
Running on dedicated nodes
Most jobs should be run on dedicated nodes through the queue system. Jobs must be run from a directory within the Lustre file system, in general:
/cfs/klemming/scratch/y/yourUsername
Also remember that the wall-clock limit for jobs on Lindgren is 24 hours and that jobs shorter than one hour have priority.
Dedicated nodes are accessed through the queue system with the command:
qsub ./jobscript.pbs
where jobscript.pbs is a PBS script as described below.
Job scripts
In a job script several PBS options can be set with #PBS directives:
- #PBS -N job_name - the job name is used to determine the name of job output and error files
- #PBS -l walltime=hh:mm:ss - maximum job elapsed time. It should be given whenever possible, as this allows PBS to determine the best scheduling strategy. The current maximum is 24 hours.
- #PBS -l mppwidth=n - number of processes (MPI ranks) that will be spawned for the given job. Should always be set to a multiple of 24 (the number of cores on one node).
- #PBS -e error_file.e - job error file
- #PBS -o output_file.o - job output file
Note that the directive #PBS -M name@machine.name, which would email you when the job starts and finishes, is NOT currently enabled and will not work.
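Since mppwidth should be a multiple of 24, it can help to round a desired core count up to whole nodes before writing the directive. A minimal sketch, with a hypothetical helper name round_up_width:

```shell
CORES_PER_NODE=24   # cores on one Lindgren node, as stated above

# Hypothetical helper: round a core count up to a whole number of nodes.
round_up_width() {
    echo $(( ($1 + $CORES_PER_NODE - 1) / $CORES_PER_NODE * $CORES_PER_NODE ))
}

# Example: a program that needs 100 cores should request 5 full nodes.
width=$(round_up_width 100)
echo "#PBS -l mppwidth=$width"
```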
Example 1
An example of a job script for an MPI program (using the mppwidth parameter to define the number of MPI processes).
# The name of the script is myjob
#PBS -N myjob

# Only 1 hour wall-clock time will be given to this job
#PBS -l walltime=1:00:00

# Number of cores to be allocated is 288.
# always ask for complete nodes (i.e. mppwidth should normally
# be a multiple of 24)
#PBS -l mppwidth=288

#PBS -e error_file.e
#PBS -o output_file.o

# Change to the work directory
cd $PBS_O_WORKDIR

# Run the executable named myexe
# and write the output into my_output_file
aprun -n 288 ./myexe > my_output_file 2>&1
Example 2
An example for a hybrid MPI+OpenMP program (using the mpp* parameters to define the number of MPI processes). This example places 4 MPI processes on each compute node, each running 6 OpenMP threads, so that all 24 cores of a node are used.
# The name of the script is myjob
#PBS -N myjob

# Only 1 hour wall-clock time will be given to this job
#PBS -l walltime=1:00:00

# Number of MPI tasks.
# always ask for complete nodes
# in this case as we have 4 tasks per node mppwidth should
# be a multiple of 4
#PBS -l mppwidth=1024

# Number of MPI tasks per node
#PBS -l mppnppn=4

# Number of cores hosting OpenMP threads
#PBS -l mppdepth=6

#PBS -e error_file.e
#PBS -o output_file.o

# Change to the work directory
cd $PBS_O_WORKDIR

export OMP_NUM_THREADS=6

# Run the executable named myexe
# and write the output into my_output_file
aprun -n 1024 -N 4 -d 6 ./myexe > my_output_file 2>&1
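The arithmetic behind the mpp* values can be checked before submitting. The sketch below uses hypothetical helper names (nodes_needed, cores_used_per_node) and the values from this example; the product of tasks per node and threads per task should not exceed the 24 cores of a node.

```shell
# Hypothetical helpers for checking a hybrid MPI+OpenMP layout.

# Nodes needed for a given number of MPI tasks and tasks per node.
nodes_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Cores used on each node: tasks per node times threads per task.
cores_used_per_node() {
    echo $(( $1 * $2 ))
}

# Values from the example: mppwidth, mppnppn, mppdepth.
width=1024
nppn=4
depth=6
echo "nodes: $(nodes_needed $width $nppn)"
echo "cores used per node: $(cores_used_per_node $nppn $depth)"
```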
Example 3
If you want to run an application such as GROMACS, or other code for which there is a module, you need to add an extra line to the job script that enables modules within the batch system before the module is loaded:
# The name of the script is myjob
#PBS -N myjob

# Only 1 hour wall-clock time will be given to this job
#PBS -l walltime=1:00:00

# Number of MPI tasks.
# always ask for complete nodes (i.e. mppwidth should normally
# be a multiple of 24)
#PBS -l mppwidth=960

#PBS -e error_file.e
#PBS -o output_file.o

# Change to the work directory
cd $PBS_O_WORKDIR

# enable modules within the batch system
. /opt/modules/default/etc/modules.sh

# load the gromacs module
module add gromacs

# Run and write the output into my_output_file
aprun -n 960 mdrun -s job.tpr > my_output_file
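If the module is not loaded correctly, the job only fails once aprun cannot find the executable. One possible safeguard is to check for the command right after loading the module. This is a sketch and not part of the original example; the helper name require_cmd is hypothetical.

```shell
# Hypothetical helper: report an error if a command is not on PATH.
require_cmd() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "error: $1 not found; was the module loaded?" >&2
        return 1
    }
}

# In the job script this would follow "module add gromacs":
#   require_cmd mdrun || exit 1
# Illustration with a command that exists on any system:
require_cmd sh && echo "sh found"
```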


