CodeCompositionIneffective
Revision as of 15:17, 3 September 2019 by Daniel-schurhoff-de23@rwth-aachen.de (talk | contribs)
Description
The pattern "Code composition - inefficient instructions" describes the use of one kind of instruction where a more efficient kind exists, e.g. scalar instead of vectorized (SIMD) FP instructions.
Symptoms
Similar to the Instruction Overhead pattern.
For a piece of high-level code, the compiler emits many instructions although the same work could be done with fewer. A common example is a data-parallel loop compiled to scalar instead of vectorized instructions.
Detection
For code with FP arithmetic, check whether scalar instructions dominate in data-parallel loops.
LIKWID groups: FLOPS_DP, FLOPS_SP (see vectorization ratio)
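The vectorization ratio can be read as the share of packed (SIMD) FP instructions among all counted FP arithmetic instructions. The sketch below assumes exactly that definition; the function name and its parameters are hypothetical, and the hardware events that feed the two counts depend on the architecture and are not shown here.

```c
#include <stdint.h>

/* Hypothetical helper (not part of LIKWID): percentage of packed (SIMD)
 * FP instructions among all counted FP arithmetic instructions.
 * Assumption: the ratio is packed / (scalar + packed) * 100. */
double vectorization_ratio(uint64_t scalar_instr, uint64_t packed_instr)
{
    uint64_t total = scalar_instr + packed_instr;
    if (total == 0) {
        return 0.0; /* no FP instructions counted, so nothing to report */
    }
    return 100.0 * (double)packed_instr / (double)total;
}
```

A value near 0 in a data-parallel loop suggests that scalar instructions dominate and the pattern may apply; a value near 100 suggests the loop is already vectorized.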
Possible optimizations and/or fixes
- Try to give hints to the compiler so that it can use the more efficient instructions (e.g. #pragma simd).
- Reorganize the access pattern
- Avoid loop-carried dependencies
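The first and last points above can be sketched in C as follows. This is an illustration under assumptions, not a definitive recipe: it uses the OpenMP directive `#pragma omp simd` as a portable counterpart to the Intel-specific `#pragma simd`, the `restrict` qualifier as an additional aliasing hint, and hypothetical function names. Whether the compiler actually vectorizes depends on the compiler, its flags (e.g. OpenMP SIMD support must be enabled for the directive to take effect) and the target architecture.

```c
#include <stddef.h>

/* Hint example: restrict promises that x and y do not overlap, and the
 * OpenMP simd directive asks the compiler to vectorize the loop. */
void scale_add(size_t n, double a, const double *restrict x, double *restrict y)
{
    #pragma omp simd
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* Loop-carried dependency: each iteration needs the value of t that the
 * previous iteration produced, which can hinder vectorization. */
void fill_ramp_carried(size_t n, double t0, double dt, double *y)
{
    double t = t0;
    for (size_t i = 0; i < n; i++) {
        y[i] = t;
        t += dt;
    }
}

/* Same ramp without the dependency: every element is computed from the
 * loop index alone, so iterations are independent of each other.
 * Note: results may differ from the carried version in the last bits,
 * because the rounding errors accumulate differently. */
void fill_ramp_independent(size_t n, double t0, double dt, double *y)
{
    #pragma omp simd
    for (size_t i = 0; i < n; i++) {
        y[i] = t0 + (double)i * dt;
    }
}
```

After such a change, the compiler's vectorization report or the generated assembly shows whether packed instructions are now used.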