Software Development¶
Compilers and libraries on Beskow¶
This cluster uses compiler wrappers so that the source code is compiled with the compiler module currently loaded. Note that the login node and the compute nodes of Beskow have different Intel processors: Xeon Sandy Bridge processors on the login node and Xeon Haswell/Broadwell processors on the compute nodes. When compiling on Beskow, you should always use the wrappers ftn/cc/CC
irrespective of your choice of compiler. These scripts automatically call the correct compiler (depending on which PrgEnv module you have loaded). They also automatically link to the MPI libraries and other libraries provided by Cray (e.g. to link to LAPACK, just load the cray-libsci module and the scripts will handle the rest). Some installation scripts do not detect these wrappers, in which case it is good practice to set the following environment variables
export CXX=CC
export CC=cc
export FC=ftn
prior to executing the installation script. Keep in mind that the Cray libraries have been optimized for the compute nodes, so the compiled executables may not run on the login node.
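For instance, a minimal sketch of building a typical configure-based package after setting the variables above (the package name and install prefix are only illustrative):
# hypothetical configure-based package; it will now pick up the Cray wrappers via CC/CXX/FC
./configure --prefix=$HOME/software/mypackage
make
make install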
Commands¶
Commands for using these compiler wrappers are as follows:
Language | Command
---|---
C | cc [flags] source.c
C++ | CC [flags] source.cpp
Fortran | ftn [flags] source.f90
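For example, a minimal sketch of compiling a small C program and running it on a compute node (the source file name, core count and time limit are only illustrative):
# compile with the wrapper for the currently loaded PrgEnv module
cc -O2 -o hello.x hello.c
# book an interactive allocation and launch on the compute nodes
salloc -n 32 -t 0:10:00
aprun -n 32 ./hello.x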
Type of compilers and versions¶
By default the Cray compiler is loaded into your environment. In order to use another compiler you have to swap compiler modules:
module swap PrgEnv-cray PrgEnv-other
Compiler | Module
---|---
Cray | PrgEnv-cray
Intel | PrgEnv-intel
GNU | PrgEnv-gnu
There are a few versions available for each compiler. PDC recommends using the latest version of the compiler and of the Cray Developer Toolkit (cdt), for example:
# check available intel versions
module avail intel
# using intel v18.0.0.128
module swap intel intel/18.0.0.128
# using cdt v17.10
module add cdt/17.10
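Likewise, a minimal sketch of switching to the GNU programming environment and checking which compiler the wrappers now invoke (the exact version output depends on the modules installed):
# swap to the GNU programming environment
module swap PrgEnv-cray PrgEnv-gnu
# the wrappers now call the GNU compilers
cc --version
ftn --version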
Libraries¶
By using the compiler wrappers cc, CC or ftn, MPI libraries and linear algebra libraries like BLAS, LAPACK and ScaLAPACK are automatically linked. The linear algebra libraries are included in the module cray-libsci, which is automatically loaded together with the PrgEnv-cray/intel/gnu module. Additional libraries like fftw can be loaded into your environment via modules.
Example:
module load fftw
See https://www.pdc.kth.se/software#libraries for more libraries that can be added.
Other libraries that can be linked by just loading the module, without adding any extra flags, include
cray-netcdf cray-hdf5 cray-petsc ...
i.e., the modules with a cray- prefix provided by Cray.
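As an illustration of this pattern, a minimal sketch of compiling a code that uses NetCDF (the source file name is hypothetical; no extra include or link flags are needed because the wrapper adds them):
# load the Cray-provided NetCDF module
module load cray-netcdf
# include paths and libraries are added automatically by the wrapper
ftn -o my_netcdf_prog.x my_netcdf_prog.f90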
Compiling dynamically¶
By default, all programs on the Cray are built using static libraries, which is recommended for most programs. For some programs, however, it is necessary to build using dynamic libraries. In this case the following procedure should be used.
Set the following environment variable
export CRAYPE_LINK_TYPE=dynamic
then compile your code as normal. At runtime you also need to make sure you have access to the shared libraries.
export CRAY_ROOTFS=DSL
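Putting the two steps together, a minimal sketch (the source file name and core count are only illustrative):
# at build time: link against shared libraries
export CRAYPE_LINK_TYPE=dynamic
cc -o my_prog.x my_prog.c
# at run time (e.g. in the submit script): make the shared libraries available
export CRAY_ROOTFS=DSL
aprun -n 32 ./my_prog.x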
Linking to hugepages¶
When compiling your own code, it is generally recommended to link to the hugepages library in order to increase the amount of physical memory mapped within one virtual page. For many programs this can increase the performance since the pressure on the TLB cache is reduced. Furthermore, it can reduce network congestion on the cluster when multiple programs are running simultaneously and sharing communication bandwidth. Fortunately, adding hugepages to statically linked programs is simple. At compile time, one needs to load a hugepages module by, e.g.,
module load craype-hugepages16M
This will automatically link the hugepages library to your program (the size does not matter for linking). Then, at run time (in your submit script),
module load craype-hugepages<size>
The parameter <size> defines the page size you want to use. Available sizes range from 2MB to 512MB. It is recommended to perform short timing tests to see which hugepage size maximizes performance.
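For example, a minimal sketch of a submit script fragment selecting 16M hugepages at run time (the executable name and core count are hypothetical):
# in the submit script, before launching the program
module load craype-hugepages16M
aprun -n 64 ./my_prog.x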
Compiling Large Programs using Make¶
The login node on Beskow is quite weak, with only 16 cores and 32 GB of RAM, so large programs can also be compiled on the interactive nodes. To be able to access make on the interactive nodes you first need to set the environment variable CRAY_ROOTFS
export CRAY_ROOTFS=DSL
Remember to keep all files associated with your software in the Lustre file system. It can also be useful to define a scratch area
export TMPDIR=/cfs/klemming/scratch/<1st letter username>/<username>
Then book an interactive node for your compilation
salloc -n 32 -t 1:00:00
The node can then be used to compile your code
aprun -n 1 -d 32 make -j 32
Examples¶
Compiling serial and MPI code:
# Fortran
ftn [flags] source.f90
# C
cc [flags] source.c
# C++
CC [flags] source.cpp
Compiling OpenMP code:
# Intel
ftn -openmp source.f90
cc -openmp source.c
CC -openmp source.cpp
# Cray
ftn -openmp source.f90
cc -openmp source.c
CC -openmp source.cpp
# GNU
ftn -fopenmp source.f90
cc -fopenmp source.c
CC -fopenmp source.cpp
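To run OpenMP (or hybrid MPI+OpenMP) binaries on the compute nodes, the thread count is passed to aprun via the -d flag; a minimal sketch, assuming 4 MPI ranks with 8 threads each (the executable name and counts are only illustrative):
export OMP_NUM_THREADS=8
# -n gives the number of MPI ranks, -d the number of threads per rank
aprun -n 4 -d 8 ./my_hybrid_prog.x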
Compilers and libraries on Tegner¶
This cluster does not use compiler wrappers, so you must load the appropriate modules and link to libraries yourself. Many versions of the different compilers exist, and the compile command differs depending on the language and the type of compiler.
Examples¶
Compiling serial code:
# GNU
gfortran -o hello hello.f
gcc -o hello hello.c
g++ -o hello hello.cpp
# Intel
module add i-compilers
ifort -FR -o hello hello.f
icc -o hello hello.c
icpc -o hello hello.cpp
# Portland
module add pgi
pgf90 -fast -o hello hello.f
pgcc -fast -o hello hello.c
pgc++ -fast -o hello hello.cpp
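Since there are no wrappers on Tegner, libraries have to be linked explicitly; a minimal sketch, assuming LAPACK and BLAS are visible in the default library search path (actual module names and link flags on Tegner may differ):
# hypothetical example: linking LAPACK and BLAS by hand
gfortran -o my_prog my_prog.f90 -llapack -lblas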
Compiling OpenMP/MPI code:
# GNU+OpenMPI
module add gcc/5.1 openmpi/1.8-gcc-5.1
mpif90 -ffree-form -fopenmp -o hello_mpi hello_mpi.f
mpicc -fopenmp -o hello_mpi hello_mpi.c
mpic++ -fopenmp -o hello_mpi hello_mpi.cpp
# Intel+IntelMPI
module add i-compilers intelmpi
mpiifort -openmp -o hello_mpi hello_mpi.f90
mpiicc -openmp -o hello_mpi hello_mpi.c
mpiicpc -openmp -o hello_mpi hello_mpi.cpp
# Portland
module add pgi
pgf90 -mp -fast -o hello hello.f
pgcc -mp -fast -o hello hello.c
pgc++ -mp -fast -o hello hello.cpp
Compiling CUDA code:
# CUDA
module add cuda/8.0
nvcc -arch=sm_37 -O2 hello.cu -o hello.x
Allinea Forge¶
Allinea tools can be used for debugging and performance analysis. More information is available at https://www.pdc.kth.se/software/software/allinea-forge/index_general.html
Downloadable example for compiling and submitting¶
This is a simple example showing how you can compile and submit parallel jobs on our clusters. For instructions please read the included README file. The example can be downloaded from https://github.com/PDC-support/introduction-to-pdc/tree/master/example