
Brief Introduction to Profiling

A very brief introduction to profiling and optimization of parallel (MPI) codes.

Rule one of code optimization

Rule one of code optimization is "Don't do it yourself unless you have to". In other words, use existing libraries (BLAS, LAPACK, ARPACK, FFTW, netCDF, METIS, ScaLAPACK etc.) whenever possible. If no suitable package exists, we recommend the following procedure for optimizing your code.

Procedure

  1. Optimize a serial version of the code. Use these tools:
    • Profilers (gprof etc.) to find the bottlenecks of your code.
    • Performance counters to find out the performance (GFlop/s) of your code. You can also get cache hit rates etc. to investigate why performance is poor.
    Optimization is an iterative process. Use the tools mentioned above in order to improve the efficiency of your code. Remember to check the correctness of your code after each improvement.

    You may also experiment with different compiler optimization options.

  2. Evaluate the parallel performance of your code with a trace tool (Vampir, Jumpshot) or with mpiP, a lightweight profiling library for MPI applications.
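The "change, re-check, re-measure" loop from step 1 can be sketched as a small harness. This is only an illustration in Python, not a PDC tool: the two sum-of-squares functions are hypothetical stand-ins for a baseline and a tuned version of your own kernel, and `compare` is a name invented for this example.

```python
import math
import time


def reference_sum_of_squares(n):
    """Baseline version (hypothetical kernel): sum of i*i for i in 0..n-1."""
    total = 0.0
    for i in range(n):
        total += i * i
    return total


def optimized_sum_of_squares(n):
    """Candidate 'optimized' version using the closed form (n-1)n(2n-1)/6."""
    return (n - 1) * n * (2 * n - 1) / 6.0


def compare(reference, candidate, arg, rel_tol=1e-9):
    """Run both versions once; return (results_agree, speedup)."""
    t0 = time.perf_counter()
    expected = reference(arg)
    t1 = time.perf_counter()
    actual = candidate(arg)
    t2 = time.perf_counter()

    # Floating-point results rarely match bit for bit after a rewrite,
    # so compare within a relative tolerance instead of using ==.
    agree = math.isclose(expected, actual, rel_tol=rel_tol)

    ref_time = t1 - t0
    cand_time = t2 - t1
    speedup = ref_time / cand_time if cand_time > 0 else float("inf")
    return agree, speedup


if __name__ == "__main__":
    agree, speedup = compare(
        reference_sum_of_squares, optimized_sum_of_squares, 100_000
    )
    print(f"results agree: {agree}, speedup: {speedup:.1f}x")
```

For a real code the same idea applies with the old and new binaries: keep a reference output from the unoptimized version and compare against it after every change, before looking at the timings.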

Further information

See the PDC software page for a list of available tools. gprof is available on all PDC computers. Note, however, that it works poorly on Lucidor (this remark refers to older versions of the compiler); we recommend using hpcprof from HPCToolkit there instead.

On Lucidor and Lenngren you can do

module show perftools
to get a list of installed and recommended performance tools. They include papiex, HPCToolkit, mpiP and Jumpshot.

A good book on the subject is:
High Performance Computing, 2nd edition; by Kevin Dowd & Charles Severance; O'Reilly, 1998.

You should also consider attending the PDC Introduction to High-Performance Computing summer school to learn more about writing efficient HPC codes.

Technical details

How to use the tools mentioned above differs from computer to computer. However, here are some general guidelines.
  • The profiler gprof:
    1. Compile with function profiling turned on. This option is usually -pg. (On the Intel compilers it is -qp.)
    2. Execute the code. A file called gmon.out will appear.
    3. Create a profile and redirect the result into a file with:
      gprof ./myprog gmon.out > gprof.txt
  • Performance counters. Unless PAPI is installed on the computer in question, you need to figure out the computer-specific way to do this. If papiex is installed, you can collect the counts needed to compute GFlop/s (floating-point operations and total cycles) with
    papiex -e PAPI_FP_OPS -e PAPI_TOT_CYC -- ./myprog
    
  • Trace tools and mpiP. Typically you need to relink your MPI code so that it is instrumented with wrappers around the MPI calls. With Jumpshot you link with -mpilog.
When using these tools on a PDC computer, please remember to consult the computer-specific pages by following links from the software page.
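The two counters in the papiex example above (PAPI_FP_OPS and PAPI_TOT_CYC) are raw totals, so a small calculation is needed to turn them into GFlop/s. The sketch below shows that arithmetic in Python. It assumes that the processor ran at a constant, known clock rate for the whole run; the function names and the numbers in the usage example are made up for illustration and are not papiex output.

```python
def flops_per_cycle(fp_ops, total_cycles):
    """Floating-point operations completed per clock cycle."""
    if total_cycles <= 0:
        raise ValueError("cycle count must be positive")
    return fp_ops / total_cycles


def gflops(fp_ops, total_cycles, clock_hz):
    """Convert counter totals into GFlop/s.

    Assumes a constant clock rate, so that
    elapsed seconds = total_cycles / clock_hz.
    """
    if clock_hz <= 0:
        raise ValueError("clock rate must be positive")
    if total_cycles <= 0:
        raise ValueError("cycle count must be positive")
    seconds = total_cycles / clock_hz
    return fp_ops / seconds / 1e9


if __name__ == "__main__":
    # Illustrative values only: 3e9 floating-point operations counted
    # over 6e9 cycles on a hypothetical 2 GHz processor.
    fp_ops = 3_000_000_000
    cycles = 6_000_000_000
    clock_hz = 2.0e9
    print(f"flops per cycle: {flops_per_cycle(fp_ops, cycles):.2f}")
    print(f"GFlop/s:         {gflops(fp_ops, cycles, clock_hz):.2f}")
```

Comparing the result with the theoretical peak of the processor gives a rough idea of how much room for improvement remains.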