BadDataPlacement

Description

The pattern "bad data placement" describes a performance limitation caused by data residing in remote locations with higher access times. At node level, the problem typically arises when the data is initialized by the wrong processing units: the first-touch memory allocation policy places a memory page in the NUMA domain of the processing unit that first writes to it, which might not be the processing unit that later works on the data. In cluster systems, data that resides on a different compute node has to be transferred through the network each time the requesting compute node needs it to make progress.
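A minimal sketch of NUMA-aware initialization is shown below (assumptions: C with OpenMP, static loop scheduling, and pinned threads, e.g. via OMP_PROC_BIND=true; the array size N is arbitrary). Because the initialization loop uses the same schedule as the compute loop, every thread first touches, and thereby places, exactly the pages it later works on.

  #include <stdio.h>
  #include <stdlib.h>

  #define N 40000000   /* arbitrary size for illustration */

  int main(void)
  {
      double *a = malloc(N * sizeof(double));
      double *b = malloc(N * sizeof(double));

      /* Parallel first touch: each thread initializes the chunk it will
       * later compute on, so its pages end up in the local NUMA domain. */
      #pragma omp parallel for schedule(static)
      for (long i = 0; i < N; i++) {
          a[i] = 0.0;
          b[i] = (double)i;
      }

      /* Compute loop with the same static schedule: every thread accesses
       * the data it touched first, i.e. memory in its own NUMA domain. */
      #pragma omp parallel for schedule(static)
      for (long i = 0; i < N; i++)
          a[i] = 2.0 * b[i];

      printf("a[N-1] = %f\n", a[N - 1]);
      free(a);
      free(b);
      return 0;
  }

If the initialization were done serially by the master thread, all pages would end up in a single NUMA domain and the remaining threads would pay remote access latencies in the compute loop.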

Symptoms

At system level:

  • Poor or no scaling across NUMA domains

At cluster level:

  • Additional communication to request remote data

Detection

At system level, use a hardware-counter tool like:

  • LIKWID with the performance groups MEM and NUMA (a region-level measurement sketch follows after this list)
  • perf or PAPI, which can provide the same counter information
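The sketch below instruments a single loop with the LIKWID marker API so that the MEM or NUMA group can be measured for just that region (assumptions: LIKWID version 5 or newer providing likwid-marker.h, a build with OpenMP and -DLIKWID_PERFMON, linked with -llikwid; the region name "copy" is arbitrary).

  #include <stdlib.h>
  #include <likwid-marker.h>   /* LIKWID >= 5; older releases ship the macros in likwid.h */

  #define N 40000000

  int main(void)
  {
      double *a = malloc(N * sizeof(double));
      double *b = malloc(N * sizeof(double));

      /* Parallel first touch so the measurement reflects local accesses. */
      #pragma omp parallel for schedule(static)
      for (long i = 0; i < N; i++) {
          a[i] = 0.0;
          b[i] = (double)i;
      }

      LIKWID_MARKER_INIT;                 /* once, in the serial part */

      #pragma omp parallel
      {
          LIKWID_MARKER_START("copy");    /* every thread opens the region */
          #pragma omp for schedule(static)
          for (long i = 0; i < N; i++)
              a[i] = 2.0 * b[i];
          LIKWID_MARKER_STOP("copy");
      }

      LIKWID_MARKER_CLOSE;
      free(a);
      free(b);
      return 0;
  }

Run, for example, with likwid-perfctr -C 0-7 -g NUMA -m ./a.out; the -m flag activates the marker API, and the reported local versus remote memory traffic for the "copy" region indicates whether the data is badly placed.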

At cluster level:

  • Use a communication profiler like Vampir, TAU or HPCToolkit

Possible optimizations and/or fixes

  • Reorganize data layout and allocation schemes
  • On node level, try an interleaved memory policy (numactl -i <domains> <application>), e.g. when a proper first-touch initialization is not feasible
  • On cluster level, try to distribute the data early to reduce access times, and balance the cost of remote accesses against recalculating the data on the local node (a sketch follows after this list)
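A minimal sketch of early data distribution with MPI is given below (assumptions: the array initially resides on rank 0, the global size N is divisible by the number of ranks, and all names are illustrative only). Scattering each block once up front replaces repeated remote requests to the owning node with purely local accesses in the compute phase.

  #include <mpi.h>
  #include <stdlib.h>

  #define N 1000000   /* global problem size, assumed divisible by the rank count */

  int main(int argc, char **argv)
  {
      MPI_Init(&argc, &argv);

      int rank, size;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      int nlocal = N / size;
      double *global = NULL;
      double *local = malloc(nlocal * sizeof(double));

      if (rank == 0) {                 /* data initially placed on one node only */
          global = malloc(N * sizeof(double));
          for (long i = 0; i < N; i++)
              global[i] = (double)i;
      }

      /* Distribute the data once, early, so that every rank works on a
       * local copy instead of requesting remote data in every iteration. */
      MPI_Scatter(global, nlocal, MPI_DOUBLE,
                  local,  nlocal, MPI_DOUBLE, 0, MPI_COMM_WORLD);

      for (long i = 0; i < nlocal; i++)   /* compute on local data only */
          local[i] *= 2.0;

      free(local);
      if (rank == 0) free(global);
      MPI_Finalize();
      return 0;
  }

Whether such a distribution pays off depends on how often the data is reused; for data that is touched only once, recalculating it on the local node can be cheaper than transferring it over the network.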