Runtime profiling

From HPC Wiki

Introduction

The first task in any performance analysis is to find out in which parts of the code the runtime is spent. Optimisation effort should focus on those regions to achieve an overall speedup of the code. The tool that gives an overview of where the time is spent is called a runtime profiler.

There are roughly two flavours: instrumentation-based and sampling-based profilers. Instrumentation-based profilers insert function calls that measure the time at specific points in the program. These calls may also perform additional tasks, e.g. determining the function call stack. While instrumentation calls can be inserted at the binary level, the common way is to let the compiler add them. The standard tool in this area is gprof, and almost every compiler can instrument code for gprof. Statistical sampling-based profilers, on the other hand, probe the program's call stack at regular intervals, triggered by operating system interrupts. A widespread tool for sampling-based profiling is perf, which builds on the profiling infrastructure built into recent Linux kernels.

Both approaches have advantages and disadvantages: instrumentation produces more accurate results but introduces more overhead, whereas sampling has less overhead but produces less accurate results.
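The difference between the two flavours can be illustrated with a small toy sketch in Python. It is not how gprof or perf are implemented. The routines `cheap`, `expensive` and `workload` are hypothetical stand-ins for real application code, the instrumentation is a hand-written wrapper instead of compiler-inserted calls, and the sampler is a helper thread that reads the CPython-specific `sys._current_frames()` instead of reacting to operating system interrupts.

```python
import collections
import sys
import threading
import time


def cheap():
    """Hypothetical inexpensive routine."""
    return sum(range(1_000))


def expensive():
    """Hypothetical hot spot: an explicit loop that dominates the runtime."""
    total = 0
    for i in range(200_000):
        total += i * i
    return total


def workload(cheap_fn=cheap, expensive_fn=expensive, repeats=20):
    """Call both routines the same number of times."""
    for _ in range(repeats):
        cheap_fn()
        expensive_fn()


def profile_by_instrumentation():
    """Wrap each routine with timing calls; return (call counts, seconds)."""
    counts = collections.Counter()
    seconds = collections.Counter()

    def instrument(func):
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Executed on every single call: exact counts, but extra work.
                counts[func.__name__] += 1
                seconds[func.__name__] += time.perf_counter() - start
        return wrapper

    workload(instrument(cheap), instrument(expensive))
    return counts, seconds


def profile_by_sampling(interval=0.001):
    """Probe the running stack at regular intervals; return hits per function."""
    hits = collections.Counter()
    stop = threading.Event()
    target = threading.get_ident()  # thread that runs the workload

    def sampler():
        while not stop.is_set():
            # CPython-specific: maps each thread id to its topmost frame.
            frame = sys._current_frames().get(target)
            if frame is not None:
                hits[frame.f_code.co_name] += 1
            time.sleep(interval)

    thread = threading.Thread(target=sampler, daemon=True)
    thread.start()
    try:
        workload()  # the profiled code itself is left unmodified
    finally:
        stop.set()
        thread.join()
    return hits


if __name__ == "__main__":
    counts, seconds = profile_by_instrumentation()
    for name in counts:
        print(f"instrumented {name}: {counts[name]} calls, {seconds[name]:.4f} s")
    hits = profile_by_sampling()
    total = sum(hits.values())
    for name, n in hits.most_common():
        print(f"sampled {name}: {n}/{total} samples")
```

In this sketch the instrumented run reports exact call counts for both routines, while the sampled run only estimates where the time goes. A routine as short as `cheap` may receive few or no samples, which mirrors the accuracy versus overhead trade-off described above.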

How to use gprof

How to use perf