Difference between revisions of "Intel VTune Tutorial/Analysis Types"

From HPC Wiki
 
(3 intermediate revisions by 2 users not shown)
Line 10: Line 10:
 
=== Video === <!--T:5-->
 
=== Video === <!--T:5-->
  
<youtube width="600" height="340" right>5hQcd3MQpBw</youtube>
+
<youtube width="600" height="340" right>ghFn5IBzjrc</youtube>
  
 
([[Media:VTune_analysis_types_compressed.pdf |Slides as pdf]])
 
([[Media:VTune_analysis_types_compressed.pdf |Slides as pdf]])
Line 20: Line 20:
 
{
 
{
 
|type="()"}
 
|type="()"}
- A burnt spot on the CPU, caused by a badly fitted cooler.
+
- A burnt spot on the CPU, caused by a badly fitted cooler
 
|| Not correct. Please make sure to not melt your CPU.
 
|| Not correct. Please make sure to not melt your CPU.
- A code segment that is very inefficient.
+
- A code segment that is very inefficient
 
|| Not correct. A hotspot is a good candidate for optimization in programs that never were optimized, but it is not necessarily inefficient!
 
|| Not correct. A hotspot is a good candidate for optimization in programs that never were optimized, but it is not necessarily inefficient!
+ A code segment where the program spends most of its time.
+
+ A code segment where the program spends most of its time
 
|| Correct. This could be a single instruction, a function, or loop where the program spends a significant amount of time.
 
|| Correct. This could be a single instruction, a function, or loop where the program spends a significant amount of time.
 
</quiz>
 
</quiz>
Line 33: Line 33:
 
{
 
{
 
|type="()"}
 
|type="()"}
- Low level performance results that are close to the hardware ("bottom").
+
- Low level performance results that are close to the hardware ("bottom")
 
|| Not correct. The closest thing to this would be the "Event Counts" tab in a microarchitecture exploration analysis.
 
|| Not correct. The closest thing to this would be the "Event Counts" tab in a microarchitecture exploration analysis.
- Time spent in each code section, with the quickest sections at the top.
+
- Time spent in each code section, with the quickest sections at the top
 
|| Not correct. It does show the time spent in each code section, but the sorting is not fixed and can be changed.
 
|| Not correct. It does show the time spent in each code section, but the sorting is not fixed and can be changed.
+ List of code sections (functions, loops) with their attributed measurements.
+
+ List of code sections (functions, loops) with their attributed measurements
 
|| Correct. This is a great place to start looking for functions and loops to consider during optimization!
 
|| Correct. This is a great place to start looking for functions and loops to consider during optimization!
 
</quiz>
 
</quiz>
Line 46: Line 46:
 
{
 
{
 
|type="()"}
 
|type="()"}
+ A graphical presentation of the call stack on a timeline.
+
+ A graphical presentation of the call stack on a timeline
 
|| Correct. This presentation can quickly show long running code sections and the vertical structure of the call stack.
 
|| Correct. This presentation can quickly show long running code sections and the vertical structure of the call stack.
- A statistic of dead CPUs from ill fitted coolers.
+
- A statistic of dead CPUs from ill fitted coolers
 
|| Not correct. Please make sure to not melt your CPU.
 
|| Not correct. Please make sure to not melt your CPU.
- A directed graph of a hotspots call stack. Functions are nodes and weighted edges encode the execution time.
+
- A directed graph of a hotspots call stack. Functions are nodes and weighted edges encode the execution time
|| Not correct. This sounds like something you would want to use gprof2dot for, though!
+
|| Not correct. This sounds like something you would want to use [[Gprof Tutorial | gprof2dot]] for, though!
 
</quiz>
 
</quiz>
 
{{hidden end}}
 
{{hidden end}}
Line 59: Line 59:
 
{
 
{
 
|type="()"}
 
|type="()"}
+ Yes, by clicking on the function name in any of the tabs.
+
+ Yes, by clicking on the function name in any of the tabs
 
|| Correct.
 
|| Correct.
- Yes, by calling <source enclose="none">vtune /path/to/source.cxx</source> from the commandline.
+
- Yes, by calling <source enclose="none">vtune /path/to/source.cxx</source> from the commandline
|| Not correct. The
+
|| Not correct.
- No, VTune only analyses binaries.
+
- No, VTune only analyses binaries
 
|| Not correct. The association between binaries and the source code is possible and very useful.
 
|| Not correct. The association between binaries and the source code is possible and very useful.
 
</quiz>
 
</quiz>
Line 72: Line 72:
 
{
 
{
 
|type="()"}
 
|type="()"}
- All analysis types are exactly the same.
+
- All analysis types are exactly the same
 
|| Of course not!
 
|| Of course not!
- There is exactly one analysis type for each high level metric (Memory, Back-end, Front-end, etc.).
+
- There is exactly one analysis type for each high level metric (Memory, Back-end, Front-end, etc.)
 
|| Not correct. Analysis types often share the collected metrics, but the presentation is different.
 
|| Not correct. Analysis types often share the collected metrics, but the presentation is different.
+ Analysis types may collect similar data, but the presentation focusses on a certain topic.
+
+ Analysis types may collect similar data, but the presentation focuses on a certain topic
 
|| Correct. Expect some overlap between the types.
 
|| Correct. Expect some overlap between the types.
 
</quiz>
 
</quiz>
 
{{hidden end}}
 
{{hidden end}}

Latest revision as of 12:09, 15 July 2022

Tutorial
Title: Intel VTune Tutorial
Provider: HPC.NRW

Contact: tutorials@hpc.nrw
Type: Multi-part video
Topic Area: Performance analysis
License: CC-BY-SA
Syllabus

1. Introduction
2. CPU Architecture
3. Analysis Types
4. Useful Tips

The third Intel VTune tutorial covers a couple of important analysis types and shows their results. The hotspots analysis is discussed in detail; it tells you where your application spends most of its time. You can go into more detail with the threading analysis, microarchitecture exploration, or HPC performance characterization, each focusing on a specific topic.
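
The four analysis types named above can also be started from the command line. The following is a minimal sketch, not part of the tutorial itself: it only builds and prints the `vtune` command lines and does not run them. The analysis-type identifiers (`hotspots`, `threading`, `uarch-exploration`, `hpc-performance`) and the `-collect`, `-result-dir` and `-report` options are the ones used by recent VTune Profiler releases as far as we know; check `vtune -help collect` on your installation. The application `./my_app`, its arguments, and the result directory names are hypothetical placeholders.

```python
import shlex

# Mapping from the analysis types named in the tutorial text to the
# identifiers expected by `vtune -collect`. The identifiers are an
# assumption based on recent VTune releases; verify them locally.
ANALYSIS_TYPES = {
    "hotspots": "hotspots",
    "threading": "threading",
    "microarchitecture exploration": "uarch-exploration",
    "HPC performance characterization": "hpc-performance",
}


def collect_command(analysis, app, result_dir):
    """Return the argv list for one VTune collection run.

    analysis   -- key of ANALYSIS_TYPES
    app        -- list with the application and its arguments
    result_dir -- directory in which VTune stores the result
    """
    return [
        "vtune",
        "-collect", ANALYSIS_TYPES[analysis],
        "-result-dir", result_dir,
        "--", *app,
    ]


def report_command(result_dir, report="hotspots"):
    """Return the argv list that prints a text report for a stored result."""
    return ["vtune", "-report", report, "-result-dir", result_dir]


if __name__ == "__main__":
    app = ["./my_app", "--size", "1024"]  # hypothetical application
    for name, identifier in ANALYSIS_TYPES.items():
        # One result directory per analysis type (names are placeholders).
        print(shlex.join(collect_command(name, app, "vtune_" + identifier)))
    print(shlex.join(report_command("vtune_hotspots")))
```

A typical workflow under these assumptions is to start with the hotspots collection, inspect the result, and only then run one of the more specialised analyses on the code sections that stand out. The printed commands can be pasted into a shell or a batch job script.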

Video

(Slides as pdf)

Quiz

1. What is a "hotspot"?

A burnt spot on the CPU, caused by a badly fitted cooler
A code segment that is very inefficient
A code segment where the program spends most of its time

2. What does the Bottom-Up tab show?

Low-level performance results that are close to the hardware ("bottom")
Time spent in each code section, with the quickest sections at the top
List of code sections (functions, loops) with their attributed measurements

3. What is a Flame Graph?

A graphical presentation of the call stack on a timeline
A statistic of dead CPUs from ill-fitted coolers
A directed graph of a hotspot's call stack. Functions are nodes and weighted edges encode the execution time

4. Can VTune present source code with the performance of each line?

Yes, by clicking on the function name in any of the tabs
Yes, by calling vtune /path/to/source.cxx from the command line
No, VTune only analyses binaries

5. How do analysis types differ?

All analysis types are exactly the same
There is exactly one analysis type for each high-level metric (Memory, Back-end, Front-end, etc.)
Analysis types may collect similar data, but the presentation focuses on a certain topic