InstructionOverhead

From HPC Wiki
[[Category:Performance Pattern]]
 

Latest revision as of 08:21, 4 September 2019

Description

The pattern "Instruction Overhead" describes the situation in which the compiler emits many more instructions for a piece of high-level code than are needed to do the work. A common example is code that was not vectorized.
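To give a feeling for the size of the effect, the following minimal sketch counts the add instructions needed for a loop over n elements when each instruction processes a given number of elements. The function name `add_instruction_count` and the parameter `lanes` are hypothetical and only serve this illustration. Treating the loop remainder as one extra instruction is a simplification, since real compilers often handle it with a scalar cleanup loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of add instructions needed to process n
   elements when one instruction handles `lanes` elements.
   lanes = 1 models scalar code, lanes = 4 models e.g. four doubles in a
   256-bit vector register. Leftover elements are counted as one extra
   instruction (a simplification). */
size_t add_instruction_count(size_t n, size_t lanes)
{
    assert(lanes > 0);
    return (n + lanes - 1) / lanes; /* ceiling division */
}
```

Under these assumptions, 1000 elements need 1000 scalar adds but only 250 adds with four lanes, so the non-vectorized version executes four times as many arithmetic instructions for the same result.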


Symptoms

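Instruction Overhead causes low application performance combined with good scaling behavior across cores, because the limiting factor is instruction throughput inside each core rather than a shared resource such as memory bandwidth. The performance is insensitive to the problem size.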


Detection

  • Low CPI (cycles per instruction) value, near the theoretical limit of the core
  • Large non-FP instruction count that stays constant as the number of cores changes
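Both detection criteria can be derived from raw counter values. The sketch below assumes that the cycle count, the total instruction count and the FP instruction count have already been obtained from some hardware performance counter tool. The function names are hypothetical and not part of any specific tool.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: cycles per instruction from two counter values.
   A core that can retire four instructions per cycle would have a
   theoretical lower limit of 0.25. */
double cycles_per_instruction(uint64_t cycles, uint64_t instructions)
{
    assert(instructions > 0);
    return (double)cycles / (double)instructions;
}

/* Hypothetical helper: share of all retired instructions that are not
   floating-point instructions. */
double non_fp_fraction(uint64_t instructions, uint64_t fp_instructions)
{
    assert(instructions > 0 && fp_instructions <= instructions);
    return (double)(instructions - fp_instructions) / (double)instructions;
}
```

A CPI close to the core's limit together with a high non-FP fraction, measured for several core counts, would match the pattern described above. Which thresholds count as "close" or "high" depends on the architecture and is not specified here.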


Possible optimizations and/or fixes

The fix depends on the kind of instructions that dominate. If the code uses scalar FP instructions, enable vectorization to reduce the number of instructions.
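As an illustration of the vectorization case, here is a minimal sketch of a loop written so that a compiler can vectorize it more easily. The kernel `scaled_add` is a hypothetical example, not taken from a specific application. With GCC, for instance, the auto-vectorizer is enabled at `-O3`, `-march=native` selects the instruction set of the build host, and `-fopt-info-vec` reports which loops were vectorized. Other compilers use different options.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: y[i] += a * x[i].
   `restrict` (C99) promises the compiler that x and y do not overlap,
   which removes one common obstacle to auto-vectorization. */
void scaled_add(size_t n, double a, const double *restrict x, double *restrict y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Whether the loop is actually vectorized still depends on the compiler, its options and the target architecture, so the compiler report or the generated assembly should be checked.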


Applicable applications or algorithms or kernels