After my post on HPCToolkit, I felt that I prefered QCacheGrind as a GUI to explore profiling results. So here is a gist with a Python script to convert XML HPCToolkit experiments to callgrind format: https://gist.github.com/mbrucher/6cad31e38beca770523b

For instance, this is a display of an Audio Toolkit test of L2 cache misses:

ATK L2 cache misses profile
ATK L2 cache misses profile

Enjoy!

After presenting Valgrind as an emulation profiler, I will present Microsoft solution, Visual Studio Performance Tool. It is available in the Team Suite editions, and offers a sampling- and an instrumentation-based profiler. Of course, it is embedded in Visual Studio IDE and accessible from a solution.

Read More

Profiling with Valgrind/Callgrind

1 Star2 Stars3 Stars4 Stars5 Stars (12 votes, average: 3.92 out of 5)
Loading...

Profiling comes in three different flaviors. The first is emulation, where a processor behavior is emulated, the second is sampling, where at regular intervals, the profiler samples the status of a program, and fianlly instrulentation, where the profiler gets information when a subroutine is called and when it returns. As with the Heisenberg uncertainty, profiling changes the exact behavior of your program. This is something you have to remember when analyzing a profile.

Valgrind is an Open Source emulation profiler. It is freely available on standard Linux platforms. As it is an emulation, it is far slower than the actual program. This means that the I/O are underestimated. The advantage is that you can have every detail on the memory behavior (cache misses for instance). Valgrind does not emulate all processors, but you can tweak it to approach your own one.

Read More