Job Submission

Queuing system

On Lindgren we use Moab as the scheduler and Torque as the resource manager (together often called the batch system). The modules needed for Torque should be loaded automatically when you log in.

Commands

  • Documentation
    man pbs
    man qstat
    and similarly for other commands.
  • Submit a job
    qsub ./jobscript.pbs
  • Remove a job from the queue (not possible when the job is running, see next point)
    qdel jobid
  • Remove a running job with "Apid" 126810 (retrieve the apid from apstat)
    apkill 126810
  • Information about the jobs running and in the queue
    qstat -u userName
    qstat -Qf
    showq

    These commands give information on the current status of the queue and basic information on the jobs in the queue.

    showbf

    This command shows the nodes that are currently free.

    xtnodestat
    apstat

    These are Cray-specific commands that show how many nodes each job is using and where each job is running within the Cray.

    checkjob <job id number>

    This gives detailed information on a job, including which time allocation it is charged to.
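The commands above can be combined into a small submit-and-inspect helper. This is only a sketch: the function name submit_and_check is hypothetical, and it assumes that qsub prints a job identifier of the form number.server and that checkjob accepts the numeric part.

```shell
# Hypothetical wrapper: submit a job script, then show its queue status.
submit_and_check() {
    script=$1
    if ! command -v qsub >/dev/null 2>&1; then
        echo "qsub not found; run this on a Lindgren login node" >&2
        return 1
    fi
    # qsub prints the identifier of the new job on success.
    jobid=$(qsub "$script") || return 1
    echo "submitted $jobid"
    # List the jobs that belong to the current user.
    qstat -u "$USER"
    # Assumption: checkjob wants only the numeric part of the identifier.
    checkjob "${jobid%%.*}"
}

# Example call (only meaningful where Torque and Moab are installed):
# submit_and_check ./jobscript.pbs
```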

Queue priority

Learn more about how jobs are prioritized on Lindgren and about the projinfo command, which shows how much of your time allocations on Lindgren you have used.

Running interactively

There are eight compute nodes available for interactive use. The login node should NEVER be used to run interactive jobs; always run them on the interactive nodes.

Processes on these interactive nodes should run for at most one hour. For longer jobs, please use the dedicated nodes.

Note that MPI jobs will fail if you try to run them on the login node, with an error like:

Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(408): Initialization failed
MPID_Init(123).......: channel initialization failed
MPID_Init(461).......:  PMI2 init failed: 1

To launch a job on an interactive node use:

aprun

The aprun command accepts the -t option, which limits the time (given in seconds) that your job may run, e.g.

aprun -n 24 -t 3600 program.x

which will limit the program to running for 1 hour of CPU time.
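Since the limit is written in seconds, a small conversion helper can make the aprun line easier to read. The following is a minimal sketch; the function name to_seconds is hypothetical and not part of aprun.

```shell
# Hypothetical helper: convert hh:mm:ss into the seconds expected by aprun -t.
to_seconds() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

limit=$(to_seconds 1:00:00)

# Print the command line instead of running it, so the sketch works anywhere.
echo "aprun -n 24 -t $limit program.x"
```

Run as written, this prints the same aprun line as in the example above.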

Try the command:

xtnodestat | grep inter

to find the number of available interactive nodes.

The number of free nodes is shown in a line like:

Available compute nodes:          6 interactive,         52 batch
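If you want to use that number in a script, the count of free interactive nodes can be extracted from the line. The sketch below assumes the output format is exactly as in the sample line; the function name free_interactive_nodes is hypothetical.

```shell
# Hypothetical helper: read xtnodestat output on stdin and print the number
# that precedes the word "interactive".
free_interactive_nodes() {
    sed -n 's/.* \([0-9][0-9]*\) interactive.*/\1/p'
}

# Sample line copied from the output above. On Lindgren the input would
# instead come from: xtnodestat | grep inter
sample='Available compute nodes:          6 interactive,         52 batch'

free=$(printf '%s\n' "$sample" | free_interactive_nodes)
echo "free interactive nodes: $free"
```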

If there are not enough free interactive nodes to fulfill your request then your job will be held in a simple queue. The aprun command will only finish when the job has completed. To see the status of your interactive job you can use the command

apstat

Running on dedicated nodes

Most jobs should be run on dedicated nodes through the queue system. Jobs must be run from somewhere within Lustre, in general:

/cfs/klemming/scratch/y/yourUsername

Also remember that the wallclock limit for jobs is 24 hours on Lindgren and that jobs that are less than one hour long have priority.
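A simple guard can catch submissions from outside Lustre before they reach the queue. This is a sketch only: the function name in_lustre is hypothetical, and it assumes that every Lustre directory on Lindgren lives under /cfs/klemming, as the path above suggests.

```shell
# Hypothetical check: succeed only for paths below /cfs/klemming.
# Assumption: all of Lindgren's Lustre file system is mounted there.
in_lustre() {
    case "$1" in
        /cfs/klemming/*) return 0 ;;
        *) return 1 ;;
    esac
}

if in_lustre "$PWD"; then
    echo "ok to submit from $PWD"
else
    echo "change to a directory in Lustre before submitting"
fi
```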

Dedicated nodes are accessed through the queue system with the command:

qsub ./jobscript.pbs

where jobscript.pbs is a PBS script as described below.

Job scripts

In a job script several PBS options can be set with #PBS directives:

  • #PBS -N job_name - the job name is used to determine the name of job output and error files
  • #PBS -l walltime=hh:mm:ss - maximum elapsed time of the job. It should be given whenever possible, since this allows PBS to determine the best scheduling strategy. The current maximum is 24 hours.
  • #PBS -l mppwidth=n - number of processes (MPI ranks) that will be spawned for the job. Should normally be set to a multiple of 24 (the number of cores on one node).
  • #PBS -e error_file.e - job error file
  • #PBS -o output_file.o - job output file

Note that the #PBS -M name@machine.name option, which should email you when the job starts and finishes, is NOT currently enabled and so will not work.
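To arrive at a suitable mppwidth, the number of ranks you want can be rounded up to whole nodes. The following is a minimal sketch; the names CORES_PER_NODE and round_up_to_nodes are hypothetical.

```shell
CORES_PER_NODE=24   # cores on one Lindgren compute node

# Hypothetical helper: round a wanted number of MPI ranks up to the next
# multiple of CORES_PER_NODE, so that only complete nodes are requested.
round_up_to_nodes() {
    wanted=$1
    nodes=$(( (wanted + CORES_PER_NODE - 1) / CORES_PER_NODE ))
    echo $(( nodes * CORES_PER_NODE ))
}

mppwidth=$(round_up_to_nodes 280)
echo "#PBS -l mppwidth=$mppwidth"
```

For 280 wanted ranks this gives 12 nodes, that is mppwidth=288.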

Example 1

An example of a job script for an MPI program (using the mppwidth parameter to define the number of MPI processes).

    # The name of the job is myjob
    #PBS -N myjob
    
    # Only 1 hour wall-clock time will be given to this job
    #PBS -l walltime=1:00:00
    
    # Number of cores to be allocated is 288.
    # always ask for complete nodes (i.e. mppwidth should normally
    # be a multiple of 24)
    #PBS -l mppwidth=288
    
    #PBS -e error_file.e
    #PBS -o output_file.o
    
    # Change to the work directory
    cd $PBS_O_WORKDIR
    
    # Run the executable named myexe 
    # and write the output into my_output_file
    aprun -n 288 ./myexe > my_output_file 2>&1

Example 2

An example for a hybrid MPI+OpenMP program (using the mpp* parameters to define the layout of MPI processes and threads). This example places 4 MPI processes with 6 OpenMP threads each on every compute node, so that all 24 cores of a node are used.

    # The name of the job is myjob
    #PBS -N myjob
    
    # Only 1 hour wall-clock time will be given to this job
    #PBS -l walltime=1:00:00
    
    # Number of MPI tasks.
    # always ask for complete nodes 
    # in this case as we have 4 tasks per node mppwidth should
    # be a multiple of 4
    #PBS -l mppwidth=1024
    
    # Number of MPI tasks per node
    #PBS -l mppnppn=4
    
    # Number of cores per MPI task, hosting its OpenMP threads
    #PBS -l mppdepth=6
    
    #PBS -e error_file.e
    #PBS -o output_file.o
    
    # Change to the work directory
    cd $PBS_O_WORKDIR
    export OMP_NUM_THREADS=6
    
    # Run the executable named myexe 
    # and write the output into my_output_file
    aprun -n 1024 -N 4 -d 6 ./myexe > my_output_file 2>&1
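The arithmetic behind the values in Example 2 can be checked with a few lines of shell. This is a sketch for illustration only; the variable names simply mirror the PBS parameters.

```shell
mppwidth=1024   # MPI tasks in total
mppnppn=4       # MPI tasks per node
mppdepth=6      # OpenMP threads (cores) per MPI task

nodes=$(( mppwidth / mppnppn ))            # nodes needed for all tasks
cores_per_node=$(( mppnppn * mppdepth ))   # should equal 24 on Lindgren
total_cores=$(( nodes * cores_per_node ))

echo "nodes=$nodes cores_per_node=$cores_per_node total_cores=$total_cores"
```

With these values the job occupies 256 complete nodes, each with all 24 cores in use.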

Example 3

If you want to run an application such as gromacs, or another code for which there is a module, you need to add extra lines to the job script that enable the module system and load the module before calling aprun.

    # The name of the job is myjob
    #PBS -N myjob
    
    # Only 1 hour wall-clock time will be given to this job
    #PBS -l walltime=1:00:00
    
    # Number of MPI tasks.
    # always ask for complete nodes (i.e. mppwidth should normally
    # be a multiple of 24 )
    #PBS -l mppwidth=960
    
    #PBS -e error_file.e
    #PBS -o output_file.o
    
    # Change to the work directory
    cd $PBS_O_WORKDIR
    # Enable modules within the batch system
    . /opt/modules/default/etc/modules.sh
    
    # Load the gromacs module
    module add gromacs
    
    # Run and write the output into my_output_file
    aprun -n 960 mdrun -s job.tpr > my_output_file