From HPC Wiki


The pattern "Bad data placement" describes the performance limitation caused by data residing in remote locations with higher access times. At node level, the problem commonly arises when data is initialized by the wrong processing unit: the first-touch memory allocation policy places each memory page next to the processing unit that first writes to it, which might not be the processing unit that later works with the data. At cluster level, data located on a different compute node has to be transferred over the network every time the compute node that needs it wants to make progress.


Typical symptoms at system level:

  • Poor or no scaling across NUMA domains

At cluster level:

  • Additional communication to request remote data


To detect the pattern at system level, use a hardware-counter tool such as:

  • LIKWID with the performance groups MEM and NUMA
  • perf or PAPI, which can provide the same information

At cluster level:

  • Use a communication profiler such as Vampir, TAU or HPCToolkit

Possible optimizations and/or fixes

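  • Reorganize the data layout and allocation schemes so that data is placed close to the processing units that use it
  • At node level, try an interleaved memory policy (numactl -i <domains> <application>), which spreads memory pages round-robin across the given NUMA domains
  • At cluster level, try to distribute data early to reduce access times, and weigh the cost of remote accesses against recalculating the data on the local node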