Difference between revisions of "Gprof Tutorial"
m |
|||
(18 intermediate revisions by 4 users not shown) | |||
Line 1: | Line 1: | ||
− | [[Category:Tutorials]] [[Category: | + | [[Category:Tutorials]] [[Category:HPC-Developer]]<nowiki /> |
− | [[Category:Tutorials | | + | [[Category:Tutorials | Gprof Tutorial]]<nowiki /> |
− | + | ||
− | Gprof | + | This tutorial introduces application performance analysis with the GNU profiler Gprof. Profiling an application gives valuable insight into its structure and exposes performance bottlenecks, pointing to the sections of code where optimization is most effective. |
+ | |||
+ | The tutorial covers all necessary basics to get started with Gprof: it shows how to instrument applications, how to generate performance information for an application run and how to evaluate the results. In addition, it explains how to visualize the application structure using call graphs and how to annotate the application's source code with runtime information. Three real-world examples from the areas of biology, computer science and mechanical engineering demonstrate that this works with different programming languages (C/C++, Fortran), different compilers (GNU, Intel) and even parallel applications (threads, MPI). | ||
__TOC__ | __TOC__ | ||
− | <youtube width="600" height="400" right> | + | <youtube width="600" height="400" right>F8evu-ybDfY</youtube> |
− | + | ([[Media:HPC.NRW_gprof_Tutorial.pdf | Slides as pdf]]) | |
'''Quiz''' | '''Quiz''' | ||
{{hidden begin | {{hidden begin | ||
− | |title = 1. What | + | |title = 1. What languages can Gprof profile? |
}} | }} | ||
<quiz display=simple> | <quiz display=simple> | ||
Line 22: | Line 24: | ||
+ C/C++, Fortran, Pascal | + C/C++, Fortran, Pascal | ||
|| True | || True | ||
− | - Haskell, Cobol, | + | - Haskell, Cobol, Whitespace |
|| | || | ||
</quiz> | </quiz> | ||
Line 28: | Line 30: | ||
{{hidden begin | {{hidden begin | ||
− | |title = 2. How does | + | |title = 2. How does Gprof generate a performance profile of an application? |
}} | }} | ||
<quiz display=simple> | <quiz display=simple> | ||
{ | { | ||
|type="()"} | |type="()"} | ||
− | + | + | + by instrumenting the application during compilation |
|| True | || True | ||
− | - | + | - through static analysis of the source code |
+ | || | ||
+ | - by means of hardware performance counters | ||
|| | || | ||
</quiz> | </quiz> | ||
Line 41: | Line 45: | ||
{{hidden begin | {{hidden begin | ||
− | |title = 3. What compiler flag is used to instrument the application? | + | |title = 3. What compiler/linker flag is used to instrument the application? |
}} | }} | ||
<quiz display=simple> | <quiz display=simple> | ||
{ | { | ||
|type="()"} | |type="()"} | ||
− | + | + | + <code>-pg</code> |
|| True | || True | ||
− | - | + | - <code>-pig</code> |
|| | || | ||
− | - | + | - <code>--profile</code> |
|| | || | ||
</quiz> | </quiz> | ||
Line 56: | Line 60: | ||
{{hidden begin | {{hidden begin | ||
− | |title = | + | |title = 4. Which compilers support Gprof? |
}} | }} | ||
<quiz display=simple> | <quiz display=simple> | ||
{ | { | ||
|type="()"} | |type="()"} | ||
− | - | + | - only the commercial Intel compilers |
|| | || | ||
− | + | + | - only open-source compilers (e.g., from GNU) |
+ | || | ||
+ | + many different compilers (e.g., from GNU and Intel) | ||
|| True | || True | ||
− | - | + | </quiz> |
+ | {{hidden end}} | ||
+ | |||
+ | {{hidden begin | ||
+ | |title = 5. How should the input parameters be chosen when running the instrumented application? | ||
+ | }} | ||
+ | <quiz display=simple> | ||
+ | { | ||
+ | |type="()"} | ||
+ | - simple and understandable | ||
+ | || | ||
+ | + representative of a usual workload | ||
+ | || True | ||
+ | - covering edge cases | ||
+ | || | ||
+ | </quiz> | ||
+ | {{hidden end}} | ||
+ | |||
+ | {{hidden begin | ||
+ | |title = 6. Which applications can be analyzed with Gprof? | ||
+ | }} | ||
+ | <quiz display=simple> | ||
+ | { | ||
+ | |type="()"} | ||
+ | - only small examples | ||
+ | || | ||
+ | - up to medium-sized applications with a running time below ~1 hour | ||
|| | || | ||
+ | + all, even large real-world examples with very long running times | ||
+ | || True | ||
</quiz> | </quiz> | ||
{{hidden end}} | {{hidden end}} | ||
+ | {{hidden begin | ||
+ | |title = 7. What is a call graph? | ||
+ | }} | ||
+ | <quiz display=simple> | ||
+ | { | ||
+ | |type="()"} | ||
+ | - an Android app to show incoming callers | ||
+ | || | ||
+ | + a hierarchy diagram of function calls in a given profile | ||
+ | || True | ||
+ | - instructions on how to call for help in an emergency | ||
+ | || | ||
+ | </quiz> | ||
+ | {{hidden end}} | ||
+ | |||
+ | {{hidden begin | ||
+ | |title = 8. How do you generate a call graph of a Gprof profile? | ||
+ | }} | ||
+ | <quiz display=simple> | ||
+ | { | ||
+ | |type="()"} | ||
+ | - <code>gprof --call-graph</code> | ||
+ | || | ||
+ | - <code>gprof-call-graph</code> | ||
+ | || | ||
+ | + <code>gprof --graph</code> | ||
+ | || True | ||
+ | </quiz> | ||
+ | {{hidden end}} | ||
− | === | + | {{hidden begin |
+ | |title = 9. What is gprof2dot? | ||
+ | }} | ||
+ | <quiz display=simple> | ||
+ | { | ||
+ | |type="()"} | ||
+ | + a third-party script for call graph visualization via the Graphviz "dot" tool | ||
+ | || True | ||
+ | - Gprof feature to export profiles as a pdf | ||
+ | || | ||
+ | - a fork of the beta version of Gprof2 | ||
+ | || | ||
+ | </quiz> | ||
+ | {{hidden end}} | ||
− | + | {{hidden begin | |
+ | |title = 10. Does Gprof work with parallel applications? | ||
+ | }} | ||
+ | <quiz display=simple> | ||
+ | { | ||
+ | |type="()"} | ||
+ | - no, Gprof only works with sequential applications | ||
+ | || | ||
+ | + yes, but Gprof cannot differentiate between individual threads or processes | ||
+ | || True | ||
+ | - yes, parallel profiling is the main use-case of Gprof | ||
+ | || | ||
+ | </quiz> | ||
+ | {{hidden end}} | ||
− | < | + | {{hidden begin |
+ | |title = 11. How much runtime overhead does Gprof produce? | ||
+ | }} | ||
+ | <quiz display=simple> | ||
+ | { | ||
+ | |type="()"} | ||
+ | - none | ||
+ | || | ||
+ | + little | ||
+ | || True | ||
+ | - much | ||
+ | || | ||
+ | </quiz> | ||
+ | {{hidden end}} |
Latest revision as of 17:48, 3 December 2020
This tutorial introduces application performance analysis with the GNU profiler Gprof. Profiling an application gives valuable insight into its structure and exposes performance bottlenecks, pointing to the sections of code where optimization is most effective.
The tutorial covers all necessary basics to get started with Gprof: it shows how to instrument applications, how to generate performance information for an application run and how to evaluate the results. In addition, it explains how to visualize the application structure using call graphs and how to annotate the application's source code with runtime information. Three real-world examples from the areas of biology, computer science and mechanical engineering demonstrate that this works with different programming languages (C/C++, Fortran), different compilers (GNU, Intel) and even parallel applications (threads, MPI).
Quiz