CodeCompositionIneffective

From HPC Wiki

Description

The pattern "Code composition - inefficient instructions" describes code that uses a less efficient kind of instruction although a more efficient kind is available, e.g. scalar instead of vectorized (SIMD) FP instructions.

Symptoms

The symptoms resemble those of the Instruction Overhead pattern.

For a piece of high-level code, the compiler emits many more instructions than necessary. A common example is a loop compiled to scalar instructions where vectorized instructions could have been used.


Detection

For code with FP arithmetic, check whether scalar instructions dominate in data-parallel loops.

LIKWID groups: FLOPS_DP, FLOPS_SP (see vectorization ratio)
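
To make the vectorization ratio concrete, here is a minimal sketch in C. It assumes the ratio is the share of packed (SIMD) FP instructions among all counted FP instructions, expressed in percent; the exact hardware events that feed it differ between architectures. The function name and its parameters are hypothetical and not part of LIKWID.

```c
/* Hypothetical helper, not a LIKWID API.
 * scalar: number of retired scalar FP instructions
 * packed: number of retired packed (SIMD) FP instructions
 * Returns the share of packed instructions in percent,
 * or 0.0 if no FP instructions were counted at all. */
double vectorization_ratio(unsigned long long scalar,
                           unsigned long long packed)
{
    unsigned long long total = scalar + packed;
    if (total == 0) {
        return 0.0; /* avoid division by zero */
    }
    return 100.0 * (double)packed / (double)total;
}
```

A value close to 0 in a data-parallel loop suggests that the loop runs mostly on scalar instructions and matches this pattern.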


Possible optimizations and/or fixes

  • Give the compiler hints so that it can use the more efficient instructions (e.g. #pragma simd).
  • Reorganize the data access pattern.
  • Avoid loop-carried dependencies.
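
The hints and the dependency issue listed above can be sketched in C as follows. All function names are hypothetical. The sketch uses the OpenMP directive `#pragma omp simd` in place of the Intel-specific `#pragma simd`; compilers ignore it unless OpenMP SIMD support is enabled (e.g. `-fopenmp-simd` with GCC or Clang). Whether a loop is actually vectorized still depends on the compiler and its flags, so the generated code or the compiler's vectorization report should be checked.

```c
#include <stddef.h>

/* Compiler hints: 'restrict' promises that x and y do not overlap,
 * and the simd directive asks the compiler to vectorize the loop. */
void axpy(size_t n, double a,
          const double *restrict x, double *restrict y)
{
    #pragma omp simd
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Loop-carried dependency: iteration i reads the value written by
 * iteration i-1, so the iterations cannot simply run in SIMD lanes. */
void ramp_dependent(size_t n, double start, double step, double *out)
{
    if (n == 0) {
        return;
    }
    out[0] = start;
    for (size_t i = 1; i < n; ++i) {
        out[i] = out[i - 1] + step;
    }
}

/* Same ramp without the dependency: each element depends only on i.
 * Results may differ from the version above in the last bits,
 * because the rounding happens in a different order. */
void ramp_independent(size_t n, double start, double step, double *out)
{
    #pragma omp simd
    for (size_t i = 0; i < n; ++i) {
        out[i] = start + (double)i * step;
    }
}
```

Removing the dependency changes the arithmetic (one multiplication per element instead of a chain of additions), which is the price for making the iterations independent.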


Applicable applications or algorithms or kernels