Software Development

Compilers and libraries on Beskow

This cluster uses compiler wrappers so that the source code is compiled with the compiler module currently loaded. Note that the login node and the compute nodes of Beskow have different Intel processors: Xeon Sandy Bridge processors on the login node and Xeon Haswell/Broadwell processors on the compute nodes. When compiling on Beskow, you should always use the wrappers ftn/cc/CC, irrespective of your choice of compiler. These scripts automatically call the correct compiler (depending on which PrgEnv module you have loaded). They also automatically link to the MPI libraries and other libraries provided by Cray (e.g. to link to LAPACK, just load the cray-libsci module and the wrappers will handle the rest). Some installation scripts do not detect these wrappers, and in that case it is good practice to set the following environment variables

export CXX=CC
export CC=cc
export FC=ftn

prior to executing the installation script. One thing to keep in mind is that the Cray libraries have been optimized for the compute nodes, so the compiled executables may not be able to run on the login node.
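
After setting these variables, a configure-based installation can then proceed as usual. A minimal sketch, where the project and install directory are placeholders:

# the exported CC/CXX/FC make the build system pick up the Cray wrappers
./configure --prefix=<install dir>
make
make install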

Commands

Commands for using these compiler wrappers are as follows:

Language   Command
C          cc [flags] source.c
C++        CC [flags] source.cpp
Fortran    ftn [flags] source.f90

Type of compilers and versions

By default the Cray compiler is loaded into your environment. In order to use another compiler you have to swap compiler modules:

module swap PrgEnv-cray PrgEnv-other

Compiler   Module
Cray       PrgEnv-cray
Intel      PrgEnv-intel
GNU        PrgEnv-gnu
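
For example, to switch to the GNU compilers and check which compiler the wrappers now call (a short sketch):

module swap PrgEnv-cray PrgEnv-gnu
# the wrappers now invoke the GNU compilers
cc --version
ftn --version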

There are several versions available for each compiler. PDC recommends using the latest version of the compiler and the Cray Developer Toolkit (cdt), for example:

# check available intel versions
module avail intel
# using intel v18.0.0.128
module swap intel intel/18.0.0.128
# using cdt v17.10
module add cdt/17.10

Libraries

By using the compiler wrappers cc, CC or ftn, MPI libraries and linear algebra libraries like BLAS, LAPACK, SCALAPACK are automatically linked. The linear algebra libraries are included in the module cray-libsci, which is automatically loaded together with the PrgEnv-cray/intel/gnu module. Additional libraries like fftw can be loaded into your environment via modules.

Example:

module load fftw

See https://www.pdc.kth.se/software#libraries for more libraries that can be added.

Other libraries that can be linked by just loading the module, without adding any extra flags, include

cray-netcdf cray-hdf5 cray-petsc ...

i.e., the modules whose names start with cray-, provided by Cray.
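
As a brief sketch (the source file name is hypothetical), a program using HDF5 can then be built simply by loading the corresponding module before calling the wrapper:

module load cray-hdf5
# no extra include or link flags are needed; the wrapper adds them
ftn -o h5demo.x h5demo.f90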

Compiling dynamically

By default on the Cray all programs are built using static libraries. This is recommended for most programs. For some programs, however, it is necessary to build using dynamic libraries. In this case the following procedure should be used.

Set the following environment variable

export CRAYPE_LINK_TYPE=dynamic

then compile your code as normal. At runtime you also need to make sure you have access to the shared libraries.

export CRAY_ROOTFS=DSL
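
Putting the two steps together, a minimal sketch (with a hypothetical source file) could look like:

# at compile time: request dynamic linking
export CRAYPE_LINK_TYPE=dynamic
cc -o myprog.x myprog.c
# ldd lists the shared libraries the executable now depends on
ldd myprog.x
# at run time, e.g. in your batch script:
export CRAY_ROOTFS=DSL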

Linking to hugepages

When compiling your own code, it is generally recommended to link to the hugepages library in order to increase the amount of physical memory mapped within one virtual page. For many programs this can increase the performance since the pressure on the TLB cache is reduced. Furthermore, it can reduce network congestion on the cluster when multiple programs are running simultaneously and sharing communication bandwidth. Fortunately, adding hugepages to statically linked programs is simple. At compile time, one needs to load a hugepages module by, e.g.,

module load craype-hugepages16M

This will automatically link the hugepages library to your program (the size does not matter for linking). Then, at run time (in your submit script),

module load craype-hugepages<size>

The parameter <size> defines the pagesize you want to use. Available sizes range from 2MB to 512MB. It is recommended to perform short timing tests to see which size of hugepages maximizes the performance.
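
For example, a batch script for a program built with hugepages could load the module before launching the executable (a sketch; the job parameters, page size and executable name are placeholders):

#!/bin/bash -l
#SBATCH -J hugepages_job
#SBATCH -t 00:10:00
#SBATCH -N 1
# choose the page size to use at run time
module load craype-hugepages16M
aprun -n 32 ./myprog.x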

Compiling Large Programs using Make

The login node on Beskow is quite weak, with only 16 cores and 32 GB of RAM, so large programs can also be compiled on the interactive nodes. To be able to access make on the interactive nodes you first need to set the environment variable CRAY_ROOTFS

export CRAY_ROOTFS=DSL

Remember to keep all files associated with your software in the Lustre file system; it can also be useful to define a scratch area

export TMPDIR=/cfs/klemming/scratch/<1st letter username>/<username>

Then book an interactive node for your compilation

salloc -n 32 -t 1:00:00

The node can then be used to compile your code

aprun -n 1 -d 32 make -j 32

Examples

Compiling serial and MPI code:

# Fortran
ftn [flags] source.f90
# C
cc [flags] source.c
# C++
CC [flags] source.cpp

Compiling OpenMP code:

# Intel
ftn -openmp source.f90
cc -openmp source.c
CC -openmp source.cpp
# Cray (OpenMP is enabled by default; use -h omp / -h noomp to control it explicitly)
ftn -h omp source.f90
cc -h omp source.c
CC -h omp source.cpp
# GNU
ftn -fopenmp source.f90
cc -fopenmp source.c
CC -fopenmp source.cpp
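
When running an OpenMP (or hybrid MPI+OpenMP) program on the compute nodes, the thread count is typically set in the batch script and passed to aprun. A rough sketch, where the thread count and executable name are placeholders:

# run one MPI rank with 8 OpenMP threads
export OMP_NUM_THREADS=8
aprun -n 1 -d 8 ./myprog.x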

Compilers and libraries on Tegner

This cluster does not use compiler wrappers, so you must load the appropriate modules and link to libraries yourself. Many versions of the different compilers exist, and the command differs depending on the language and the type of compiler.
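
One way to find out which paths a library module provides is to inspect it with module show and then add the include and link flags yourself. A rough sketch, using FFTW as an example with placeholder directories:

# show the paths and environment variables a module sets
module show fftw
# then add the include and link flags explicitly when compiling
gcc -I<include dir> -L<library dir> -o myprog myprog.c -lfftw3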

Examples

Compiling serial code:

# GNU
gfortran -o hello hello.f
gcc -o hello hello.c
g++ -o hello hello.cpp
# Intel
module add i-compilers
ifort -FR -o hello hello.f
icc -o hello hello.c
icpc -o hello hello.cpp
# Portland
module add pgi
pgf90 -fast -o hello hello.f
pgcc -fast -o hello hello.c
pgc++ -fast -o hello hello.cpp

Compiling OpenMP/MPI code:

# GNU+OpenMPI
module add gcc/5.1 openmpi/1.8-gcc-5.1
mpif90 -ffree-form -fopenmp -o hello_mpi hello_mpi.f
mpicc -fopenmp -o hello_mpi hello_mpi.c
mpic++ -fopenmp -o hello_mpi hello_mpi.cpp
# Intel+IntelMPI
module add i-compilers intelmpi
mpiifort -openmp -o hello_mpi hello_mpi.f90
mpiicc -openmp -o hello_mpi hello_mpi.c
mpiicpc -openmp -o hello_mpi hello_mpi.cpp
# Portland
module add pgi
pgf90 -mp -fast -o hello hello.f
pgcc -mp -fast -o hello hello.c
pgc++ -mp -fast -o hello hello.cpp

Compiling CUDA code:

# CUDA
module add cuda/8.0
nvcc -arch=sm_37 -O2  hello.cu -o hello.x

Allinea Forge

Allinea tools can be used for debugging and performance analysis. More information at https://www.pdc.kth.se/software/software/allinea-forge/index_general.html

Downloadable example for compiling and submitting

This is a simple example showing how you can compile and submit parallel jobs on our clusters. For instructions please read the included README file. The example can be downloaded from https://github.com/PDC-support/introduction-to-pdc/tree/master/example