MPI Performance Tools
Measuring parallel performance
It is surprisingly hard to predict how a parallel program will behave for a given data set or set of parameters. Therefore, instead of relying only on the source code to understand the program's behavior, it is often useful to investigate the actual parallel execution in some way. There are three main kinds of tools for this:
- Profilers record the amount of time spent in different parts of a program. A profile is often generated using some sampling technique.
- Counters record the frequency of different events in a code.
- Event traces record every occurrence of certain events.
All parallel performance data is inherently multidimensional: execution times and communication costs on several processors, for different problem sizes, and much more can be useful to know. The raw data generated by profilers, counters and event traces is seldom human-readable and has to be reduced before it can be visualized. The amount of data generated, especially from event traces, can be huge, whereas a profile is a more limited performance tool in this sense.
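The difference between the three kinds of data can be illustrated with a minimal C sketch. Everything in it (the Recorder type, the recorder_* functions, the fixed trace capacity) is hypothetical and not part of extrae or Paraver. Note also that the profile here is built from explicit enter/exit timestamps, whereas real profilers often use sampling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration only: these names are not part of extrae or Paraver. */

#define MAX_TRACE_EVENTS 64

typedef enum { REGION_ENTER, REGION_EXIT } EventKind;

typedef struct {
    double time;     /* timestamp in seconds */
    EventKind kind;  /* what happened at that time */
} TraceEvent;

typedef struct {
    double seconds_in_region;            /* profile: accumulated time */
    long enter_count;                    /* counter: how often the event happened */
    TraceEvent trace[MAX_TRACE_EVENTS];  /* event trace: every single occurrence */
    size_t trace_len;
    double last_enter;                   /* timestamp of the pending enter */
} Recorder;

void recorder_init(Recorder *r) {
    r->seconds_in_region = 0.0;
    r->enter_count = 0;
    r->trace_len = 0;
    r->last_enter = 0.0;
}

/* Append one record to the event trace. */
static void recorder_log(Recorder *r, double now, EventKind kind) {
    assert(r->trace_len < MAX_TRACE_EVENTS);
    r->trace[r->trace_len].time = now;
    r->trace[r->trace_len].kind = kind;
    r->trace_len++;
}

void recorder_enter(Recorder *r, double now) {
    r->enter_count++;
    r->last_enter = now;
    recorder_log(r, now, REGION_ENTER);
}

void recorder_exit(Recorder *r, double now) {
    r->seconds_in_region += now - r->last_enter;
    recorder_log(r, now, REGION_EXIT);
}
```

After two enter/exit pairs the profile holds a single accumulated time and the counter a single number, while the trace already holds four records. This is why event traces grow so much faster than profiles.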
Paraver
Paraver is a parallel program visualization and analysis tool. The information provided by Paraver can be used to decide whether a program needs to be improved and, if so, where in the code the main effort should be put. Paraver reads a trace file with the file ending .prv. This file has to be generated after running your MPI program. The trace is created using extrae, an instrumentation package that uses the LD_PRELOAD mechanism to intercept calls to the MPI library. The traces contain events on entry to and exit from the MPI calls (and user calls), and hardware counter events (cache misses, instructions and cycles) can also be recorded at these points. A .pcf file with symbolic information for the numerically encoded records in the trace is created alongside the .prv file. What performance data extrae records is governed by an .xml file. When you have run your program using extrae, a file called TRACE.mpits and a folder called set-0 will have been created. In the directory where these are located you can execute:
mpi2prv -f TRACE.mpits -o yourTrace.prv
to generate the file yourTrace.prv, which can be read with Paraver.
The code
The program you will be working with is a naive code for multiplying two matrices. The program is parallelised with MPI. It is written and compiled on purpose so that it gives bad performance. Using the Paraver tool you should be able to suggest some changes that would improve the program's behaviour. Do not cheat by reading the source code! Instead, rely on the force to give you the insight (and for today, the force will be the performance tool).
Assignments
Run with extrae
Download the MPI program MPImatrixmatrix.c and the Makefile. Then compile the program and submit the script run.sh.
- Log in at summer-cray.pdc.kth.se according to the general instructions.
- Get the proper environment for doing the lab using the command:
module add extrae papi
- Copy the files Makefile and MPImatrixmatrix.c or MPImatrixmatrix.f to the parallel file system.
- Compile the code using the Makefile with the command:
make c
or
make fortran
- Submit your job from the parallel file system using:
aprun -n 16 ./MPImatrixmatrix_c
or
aprun -n 16 ./MPImatrixmatrix_fortran
- Convert the created TRACE.mpits:
mpi2prv -f TRACE.mpits -o MPImatrixmatrix.prv
- Copy the entire folder to your AFS home directory.
Start Paraver on your local computer
When you have put the generated trace in your AFS home directory, you can look at it using Paraver on your local computer.
- On your local machine: Go to the directory in your AFS home directory in which you put the trace that you generated on summer-cray.
- Set up your environment to use Paraver:
module add paraver
- Run Paraver:
wxparaver
- If successful, you should then get the graphical main window for Paraver. Open your Paraver trace by selecting File, Load Trace. Browse to find your MPImatrixmatrix.prv file.
Analyse the program
Now, using Paraver, try to answer the following questions:
- Does the trace look the same every time you run the program?
- Which MPI calls use the most time?
- Does the profile of your code indicate any performance problem?
- What is the communication pattern of the code? For instance look at:
- Who sends messages to whom?
- Are all the nodes used efficiently? If not, why?
- What do you think would happen if this code was run on more nodes?
- Can you think of another way to send data between the nodes?
- Modify the program to address the issues that you have found, and analyze your result with Paraver.
Just as important as answering these questions is trying to understand what the graphs that Paraver gives you actually show. Please play around with Paraver, and if you want to you can also try running it on other MPI codes you have. Some step-by-step instructions on how to use Paraver are found below.
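One way to think about the question "who sends messages to whom?" is as a communication matrix with one row per sender and one column per receiver. The C sketch below builds such a matrix from a list of message records. The Message type and both function names are hypothetical, and the records are assumed to come from your own notes on what the trace shows, not from any Paraver export format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message record: rank src sent `bytes` bytes to rank dst. */
typedef struct {
    int src;
    int dst;
    long bytes;
} Message;

/* Fill matrix[src * nprocs + dst] with the total bytes sent from src to dst.
 * The caller provides a matrix with room for nprocs * nprocs entries. */
void comm_matrix(const Message *msgs, size_t n, int nprocs, long *matrix) {
    for (int i = 0; i < nprocs * nprocs; i++) {
        matrix[i] = 0;
    }
    for (size_t i = 0; i < n; i++) {
        assert(msgs[i].src >= 0 && msgs[i].src < nprocs);
        assert(msgs[i].dst >= 0 && msgs[i].dst < nprocs);
        matrix[msgs[i].src * nprocs + msgs[i].dst] += msgs[i].bytes;
    }
}

/* Number of distinct ranks that `rank` sent at least one byte to. */
int send_partners(const long *matrix, int nprocs, int rank) {
    int partners = 0;
    for (int dst = 0; dst < nprocs; dst++) {
        if (matrix[rank * nprocs + dst] > 0) {
            partners++;
        }
    }
    return partners;
}
```

A matrix in which a single row holds nearly all the traffic means that one process is doing almost all the sending, which is worth comparing against what the timeline shows.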
How to display data in Paraver
Single Timeline Window
In the Paraver main window you can press the button "New single timeline Window" (if you keep the mouse still over the button, the text will show) to open a new window that displays a timeline and the MPI processes that have run.
- Double click on one of the threads.
- Click on the color tab to see what the colours mean.
- Zoom in by drawing a square in the picture while holding down the left mouse button.
- You can zoom in on only a selection of the processes by holding down the Ctrl-key while zooming.
- By right clicking in the window you can get more options, for instance how to zoom out.
Paraver windows and configuration files
You can have several windows open at once in Paraver. What you will see in each is decided by configuration files that you can load (and by which data you have generated when you ran your program). If you are particularly happy with some setting you made in a window, you can create and save your own configuration files. This is available by choosing File, Save Configuration or File, Load Configuration, respectively.
In:
/afs/pdc.kth.se/pdc/vol/paraver/3.99/PDCparaver/cfgs/
several configurations are already available for you to use. You can use these directly from the above path, or you can copy them to your own directory if you prefer.
Instantaneous parallelism
From the main window, load the configuration file instantaneous_parallelism.cfg:
- Choose File, Load Configuration
- Browse for the file:
/afs/pdc.kth.se/pdc/vol/paraver/3.99/PDCparaver/cfgs/General/views/instantaneous_parallelism.cfg
- In the new window that appears, click on the line to see details (for instance how many threads are run)
In this window you can see the number of processes doing computations at any point in time.
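The quantity shown in this view can be sketched in C as a count of the computation bursts that cover a given instant. The Burst type and parallelism_at function are hypothetical names for illustration. The sketch assumes that bursts belonging to the same process never overlap, so counting bursts is the same as counting processes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: process `rank` computes during [begin, end). */
typedef struct {
    int rank;
    double begin;
    double end;
} Burst;

/* Count how many bursts cover time t. Assuming the bursts of one rank
 * never overlap, this is the number of processes computing at time t. */
int parallelism_at(const Burst *bursts, size_t n, double t) {
    int count = 0;
    for (size_t i = 0; i < n; i++) {
        assert(bursts[i].begin <= bursts[i].end);
        if (bursts[i].begin <= t && t < bursts[i].end) {
            count++;
        }
    }
    return count;
}
```

Evaluating this function at many points along the time axis gives the kind of curve the window displays. Long stretches where the value is far below the number of processes are the places to zoom in on.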
Useful duration
From the main window, load the configuration file useful_duration.cfg:
- Choose File, Load Configuration
- Browse for the file:
/afs/pdc.kth.se/pdc/vol/paraver/3.99/PDCparaver/cfgs/General/views/useful_duration.cfg
- Right click on a thread, choose Fit Semantic Scale, Fit Both. The color in the graph will then represent the useful time for each thread.
In this window you can see a timeline where the color represents the duration of a computation burst between an exit from MPI and the next entry. The function has the value 0 (black) in the regions where the processes are in MPI. This view gives a good impression of where the major computational phases are, and of their balance across processors.
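The definition of a computation burst can be written down as a small C sketch. It assumes that the MPI calls of one process are available as a time-ordered list of entry and exit timestamps. MpiInterval and useful_durations are hypothetical names, not something Paraver or extrae provides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one MPI call made by a single process. */
typedef struct {
    double enter_time;  /* time the process entered the MPI call */
    double exit_time;   /* time the process returned from it */
} MpiInterval;

/* For a time-ordered list of MPI calls, store in out[i] the length of the
 * computation burst between the exit of call i and the entry of call i+1.
 * out must have room for n - 1 values. Returns the number of bursts. */
size_t useful_durations(const MpiInterval *calls, size_t n, double *out) {
    if (n < 2) {
        return 0;
    }
    for (size_t i = 0; i + 1 < n; i++) {
        assert(calls[i].exit_time <= calls[i + 1].enter_time);
        out[i] = calls[i + 1].enter_time - calls[i].exit_time;
    }
    return n - 1;
}
```

Comparing the burst lengths of different processes over the same phase gives a rough picture of how well the computation is balanced.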
Focusing on time sections
- Go back to the window instantaneous_parallelism and focus (zoom in) on some part you find interesting.
- Right click and select copy.
- Now, go back to the single timeline window (i.e. the first window you opened, possibly named "New window")
- In this window, right click and select Paste, Time.
MPI calls
From the main window, load the configuration file MPI_call.cfg:
- Choose File, Load Configuration
- Browse for the file:
/afs/pdc.kth.se/pdc/vol/paraver/3.99/PDCparaver/cfgs/mpi/views/MPI_call.cfg
The window displays a timeline of which MPI call is being executed at each point in time by each process. The horizontal axis represents time, from the start of the application. For every thread, the colors represent an MPI call, or black when doing user-level computation outside of MPI.
MPI call profile
From the main window, load the configuration file mpi_stats.cfg:
- Choose File, Load Configuration
- Browse for the file:
/afs/pdc.kth.se/pdc/vol/paraver/3.99/PDCparaver/cfgs/mpi/analysis/mpi_stats.cfg
The table shows one row per process and one column per MPI call. The first column corresponds to the time spent outside MPI. Each entry in the table states the percentage of time the corresponding thread has been inside the specific call.
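How one row of such a table relates to raw times can be sketched in C. The sketch assumes that the accumulated seconds for one process are available as an array whose first entry is the time spent outside MPI, followed by one entry per MPI call. profile_percentages is a hypothetical helper, not a Paraver function.

```c
#include <assert.h>
#include <stddef.h>

/* Convert the accumulated times of one process into percentages of that
 * process's total time. seconds[0] is assumed to be the time outside MPI,
 * and seconds[1..n-1] the time inside each MPI call. */
void profile_percentages(const double *seconds, size_t n, double *percent) {
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        assert(seconds[i] >= 0.0);
        total += seconds[i];
    }
    for (size_t i = 0; i < n; i++) {
        /* Avoid dividing by zero for a process with no recorded time. */
        percent[i] = (total > 0.0) ? 100.0 * seconds[i] / total : 0.0;
    }
}
```

For example, a process that spends 6 seconds computing, 3 seconds in one MPI call and 1 second in another gets the row 60, 30, 10. A small first column means the process spends most of its time inside MPI.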
Useful links
- The Slides from the lecture on Parallel Performance
- The Paraver web page
- The TAU web page
- The MpiP web page


