CodeCompositionIneffective

From HPC Wiki
Revision as of 16:17, 3 September 2019 by Daniel-schurhoff-de23@rwth-aachen.de (talk | contribs)

Description

The pattern "Code composition - inefficient instructions" describes code that uses one kind of instruction although a more efficient kind exists, e.g. scalar instead of vectorized FP instructions.

Symptoms

Similar to the Instruction Overhead pattern.

For a piece of high-level code, the compiler emits many instructions although the same work could be done with fewer. A common example is a data-parallel loop compiled to scalar instead of vectorized instructions.


Detection

For code with FP arithmetic, check whether scalar instructions dominate in data-parallel loops.

LIKWID groups: FLOPS_DP, FLOPS_SP (see vectorization ratio)
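
To make the "vectorization ratio" concrete, here is a minimal sketch of how such a ratio can be computed from instruction counts. It assumes the ratio is defined as the percentage of packed (SIMD) FP instructions among all FP instructions; the exact events and formula depend on the architecture and the LIKWID version. The function name and its parameters are hypothetical and not part of LIKWID.

```c
#include <stdint.h>

/* Assumed definition: share of packed (SIMD) FP instructions among all
 * FP instructions, in percent. The two counts are hypothetical inputs,
 * e.g. values read from hardware performance counters. */
double vectorization_ratio(uint64_t scalar_instr, uint64_t packed_instr)
{
    uint64_t total = scalar_instr + packed_instr;
    if (total == 0) {
        return 0.0; /* no FP instructions counted, avoid division by zero */
    }
    return 100.0 * (double)packed_instr / (double)total;
}
```

A value close to 0 for a data-parallel loop indicates that scalar instructions dominate, which is the symptom this pattern describes.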


Possible optimizations and/or fixes

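  • Give the compiler hints so that it can use the more efficient instructions (e.g. #pragma simd, or the standardized #pragma omp simd available since OpenMP 4.0)
  • Reorganize the data access pattern, e.g. towards contiguous accesses that can be served by vector loads and stores
  • Avoid loop-carried dependencies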
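
The hints listed above could look as follows in C. This is only a sketch under stated assumptions: the kernels axpy, sum and prefix_sum are hypothetical examples, and whether the compiler actually emits vector instructions depends on the compiler, its flags (e.g. an OpenMP or OpenMP-SIMD option) and the target architecture. If OpenMP is not enabled, the pragmas are ignored and the code still computes the same results.

```c
#include <stddef.h>

/* Hypothetical kernel: y[i] = a * x[i] + y[i].
 * restrict promises that x and y do not overlap; the OpenMP simd
 * directive asks the compiler to vectorize the loop. */
void axpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    #pragma omp simd
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* A sum is a loop-carried dependency on s. Declaring it as a reduction
 * allows the compiler to reorder the additions and use vector lanes. */
double sum(size_t n, const double *x)
{
    double s = 0.0;
    #pragma omp simd reduction(+:s)
    for (size_t i = 0; i < n; ++i) {
        s += x[i];
    }
    return s;
}

/* Counter-example: every iteration needs the result of the previous one,
 * so this loop cannot be vectorized in this form and no hint is given. */
void prefix_sum(size_t n, double *v)
{
    for (size_t i = 1; i < n; ++i) {
        v[i] += v[i - 1];
    }
}
```

Whether the hints had an effect can be checked in the compiler's vectorization report or in the generated assembly.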


Applicable applications or algorithms or kernels