Difference between revisions of "Intel VTune Tutorial/CPU Architecture"

From HPC Wiki
Intel VTune Tutorial/CPU Architecture
(Created page with "Intel VTune Tutorials: CPU Architectures<nowiki /> {{DISPLAYTITLE:Intel VTune Tutorial: CPU Architectures}}<nowiki /> {{Syllabus Intel VTune Tutorial}}...")
 
 
Line 4: Line 4:
 
__TOC__
 
__TOC__
  
The second tutorial of the Intel VTune seriers serves as a small interlude, providing some context about CPU architecture concepts.
+
The second tutorial of the Intel VTune series serves as a small interlude, providing some context about CPU architecture concepts.
 
It covers a brief description of the front-end and back-end of a CPU core.
 
It covers a brief description of the front-end and back-end of a CPU core.
 
These concepts are a necessary basis to understand the language of VTune profiles.
 
These concepts are a necessary basis to understand the language of VTune profiles.
Line 20: Line 20:
 
{
 
{
 
|type="()"}
 
|type="()"}
+ VTunes analysis results are often expressed in these terms.
+
+ VTunes analysis results are often expressed in these terms
 
|| Correct. It is good to know what "Front End", "Back End", "Bad Speculation" or "Retired Instructions" refer to!
 
|| Correct. It is good to know what "Front End", "Back End", "Bad Speculation" or "Retired Instructions" refer to!
- VTune measurements are always presented in a CPU diagram.
+
- VTune measurements are always presented in a CPU diagram
 
|| Not correct. The data is presented in different ways. This might be a useful gimmick though!
 
|| Not correct. The data is presented in different ways. This might be a useful gimmick though!
 
- It is impossible to use VTune without a thorough understanding of CPUs!
 
- It is impossible to use VTune without a thorough understanding of CPUs!
Line 59: Line 59:
 
{
 
{
 
|type="()"}
 
|type="()"}
- All metrics describe how efficiently data is moved from memory to the CPU.
+
- All metrics describe how efficiently data is moved from memory to the CPU
 
|| Not correct. This would not be very useful for programs performing complex computations on little data.
 
|| Not correct. This would not be very useful for programs performing complex computations on little data.
- Bad results always highlight inefficiencies in computation.
+
- Bad results always highlight inefficiencies in computation
 
|| Not correct. This would not cover performance in moving data, or instruction decoding and speculation in the front-end.
 
|| Not correct. This would not cover performance in moving data, or instruction decoding and speculation in the front-end.
+ Each metric focusses on a different aspect of the architecture.
+
+ Each metric focuses on a different aspect of the architecture
 
|| Correct. Each metric is affected by its own class of problems. The interpretation and approaches to improvement can be very different between each metric.
 
|| Correct. Each metric is affected by its own class of problems. The interpretation and approaches to improvement can be very different between each metric.
 
</quiz>
 
</quiz>
Line 72: Line 72:
 
{
 
{
 
|type="()"}
 
|type="()"}
- A source of data from network connections.
+
- A source of data from network connections
 
|| Not correct. A network connection is a concept that mostly lives outside of the CPU architecture.
 
|| Not correct. A network connection is a concept that mostly lives outside of the CPU architecture.
+ A channel to read/write data from memory or use co-processors.
+
+ A channel to read/write data from memory or use co-processors
|| Correct. The exectution engine moving data from the L1 cache through ports, for example.
+
|| Correct. The execution engine moving data from the L1 cache through ports, for example.
- CPUs have only a single port, connecting the front-end to the back-end.
+
- CPUs have only a single port, connecting the front-end to the back-end
 
|| Not correct. There are multiple ports.
 
|| Not correct. There are multiple ports.
 
</quiz>
 
</quiz>
 
{{hidden end}}
 
{{hidden end}}

Latest revision as of 09:40, 14 June 2022

Tutorial
Title: Intel VTune Tutorial
Provider: HPC.NRW

Contact: tutorials@hpc.nrw
Type: Multi-part video
Topic Area: Performance analysis
License: CC-BY-SA
Syllabus

1. Introduction
2. CPU Architecture
3. Analysis Types
4. Useful Tips

The second tutorial of the Intel VTune series serves as a short interlude that provides context on CPU architecture concepts. It briefly describes the front-end and back-end of a CPU core. These concepts are necessary for understanding the language of VTune profiles.
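
To make the link between profile terms and the core a little more concrete, the following Python sketch splits a run's pipeline slots into the four categories "Front-End Bound", "Bad Speculation", "Back-End Bound" and "Retiring". It is a simplified illustration, not VTune's actual computation: the class `SlotCounts`, its field names, and the sample numbers are hypothetical, and real tools derive these values from hardware event counters with additional correction terms.

```python
from dataclasses import dataclass


@dataclass
class SlotCounts:
    """Hypothetical pipeline-slot counts for one run (illustrative names)."""
    total_slots: int    # slots available, e.g. pipeline width * cycles
    not_delivered: int  # slots the front-end failed to fill
    issued: int         # slots filled with micro-operations
    retired: int        # issued micro-operations that actually retired


def top_down(counts: SlotCounts) -> dict[str, float]:
    """Split all slots into four fractions that sum to 1 (simplified)."""
    total = counts.total_slots
    front_end = counts.not_delivered / total
    retiring = counts.retired / total
    # Work that was issued but thrown away, e.g. after a mispredicted branch.
    bad_speculation = (counts.issued - counts.retired) / total
    # Whatever remains is attributed to stalls in the back-end.
    back_end = 1.0 - front_end - retiring - bad_speculation
    return {
        "Front-End Bound": front_end,
        "Bad Speculation": bad_speculation,
        "Back-End Bound": back_end,
        "Retiring": retiring,
    }


if __name__ == "__main__":
    # Made-up sample numbers, chosen only to show the breakdown.
    sample = SlotCounts(total_slots=1000, not_delivered=200, issued=500, retired=400)
    for name, share in top_down(sample).items():
        print(f"{name}: {share:.0%}")
```

Each fraction points at a different part of the core, so a high value in one category suggests a different kind of optimisation than a high value in another.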

Video

(Slides as pdf)

Quiz

1. How does knowing CPU architecture concepts help with using VTune?

VTune's analysis results are often expressed in these terms
VTune measurements are always presented in a CPU diagram
It is impossible to use VTune without a thorough understanding of CPUs!

2. Which part of the CPU executes instructions?

The Front-end
The Back-end
The Cache

3. Which part of the CPU prepares instructions for execution?

The Back-end
The Cache
The Front-end

4. How are profile metrics related to the CPU architecture?

All metrics describe how efficiently data is moved from memory to the CPU
Bad results always highlight inefficiencies in computation
Each metric focuses on a different aspect of the architecture

5. What is a "port" in CPU architectures?

A source of data from network connections
A channel to read/write data from memory or use co-processors
CPUs have only a single port, connecting the front-end to the back-end