Programming Parallel Computers 2020

Help with debugging

perf quick start

Here we use a CP3b implementation as an example. To get a number of useful statistics on the performance of your implementation, try running, for example, this command:

perf stat -d ./cp-benchmark 6000 6000 10

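See the perf manual for more information on usage. To get more meaningful results, it is often a good idea to switch off hyper-threading. With hyper-threading, two hardware threads share the execution resources of one physical core, so the per-thread numbers depend on what the other thread happens to be doing. With it switched off, for example, the number of instructions per cycle per thread is much easier to interpret.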
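As a rough sketch of how the instructions-per-cycle figure is derived, the helper below divides the instruction count by the cycle count, reading the comma-separated output that `perf stat -x,` produces (value in the first field, event name in the third). The function name `ipc_from_csv` and the sample numbers are made up for illustration; they are not real measurements, and event names may look different on some machines.

```shell
# Compute instructions per cycle from CSV lines in the format written by
# `perf stat -x,` (field 1: counter value, field 3: event name).
# Reads from standard input. The name ipc_from_csv is hypothetical.
ipc_from_csv() {
    awk -F, '
        $3 ~ /^instructions/ { insn = $1 }   # also matches e.g. instructions:u
        $3 ~ /^cycles/       { cyc  = $1 }   # also matches e.g. cycles:u
        END { if (cyc > 0) printf "%.2f\n", insn / cyc }
    '
}

# Made-up sample input, for illustration only.
printf '%s\n' \
    '2000000,,cycles,1000,100.00,,' \
    '3000000,,instructions,1000,100.00,1.50,insn per cycle' \
    | ipc_from_csv    # prints 1.50
```

With real data, one possible usage is to write the counters to a file with `perf stat -x, -e cycles,instructions -o stat.csv ./cp-benchmark 6000 6000 10` and then run `ipc_from_csv < stat.csv`.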

In the perf output, numbers such as the instructions per cycle and the cache and branch miss rates are often very helpful.

perf record & report

To identify the most critical parts of your code, you can also try, for example:

perf record ./cp-benchmark 6000 6000 10

This will create a data file (by default perf.data in the current directory) that you can then study with a simple text-based user interface:

perf report

Select the relevant function, select “Annotate”, and it should take you directly to the most performance-critical part in the assembly code.
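If a non-interactive overview is more convenient, the same data can be printed as plain text with `perf report --stdio`. The sketch below filters such output down to the symbols whose overhead is at least a given percentage. The function name `hot_symbols`, the default threshold of 5, and the sample lines (including the symbol names) are assumptions made for illustration, not output from a real run.

```shell
# List symbols with overhead >= MIN percent, reading text in the format of
# `perf report --stdio` from standard input. The name hot_symbols is hypothetical.
hot_symbols() {
    min="${1:-5}"
    awk -v min="$min" '
        /^#/ || NF == 0 { next }              # skip header comments and blank lines
        $1 ~ /%$/ {
            pct = $1; sub(/%$/, "", pct)      # "92.10%" -> "92.10"
            sym = $0; sub(/^.*\[.\] /, "", sym)   # keep the text after "[.] "
            if (pct + 0 >= min) printf "%s %s\n", $1, sym
        }
    '
}

# Made-up sample input, for illustration only.
printf '%s\n' \
    '# Overhead  Command       Shared Object  Symbol' \
    '    92.10%  cp-benchmark  cp-benchmark   [.] correlate' \
    '     3.40%  cp-benchmark  libc.so.6      [.] __memmove_avx_unaligned' \
    | hot_symbols 5    # prints: 92.10% correlate
```

With real data, this could be used as `perf report --stdio | hot_symbols 10` in the directory that contains the recorded data file.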