<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://hpc-wiki.info/hpc/index.php?action=history&amp;feed=atom&amp;title=NUMA</id>
	<title>NUMA - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://hpc-wiki.info/hpc/index.php?action=history&amp;feed=atom&amp;title=NUMA"/>
	<link rel="alternate" type="text/html" href="https://hpc-wiki.info/hpc/index.php?title=NUMA&amp;action=history"/>
	<updated>2026-04-16T05:00:03Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.35.9</generator>
	<entry>
		<id>https://hpc-wiki.info/hpc/index.php?title=NUMA&amp;diff=2052&amp;oldid=prev</id>
		<title>Daniel-schurhoff-de23@rwth-aachen.de at 13:19, 3 September 2019</title>
		<link rel="alternate" type="text/html" href="https://hpc-wiki.info/hpc/index.php?title=NUMA&amp;diff=2052&amp;oldid=prev"/>
		<updated>2019-09-03T13:19:53Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left diff-editfont-monospace&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 13:19, 3 September 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot; &gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[Category:HPC-User]]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;NUMA (short for non-uniform memory access) is a memory architecture which is popular in HPC. A typical [[HPC-Dictionary#Cluster|cluster]] consists of hundreds of nodes where each individual node is a NUMA-system.&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;NUMA (short for non-uniform memory access) is a memory architecture which is popular in HPC. A typical [[HPC-Dictionary#Cluster|cluster]] consists of hundreds of nodes where each individual node is a NUMA-system.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key hpc_wiki:diff::1.12:old-1114:rev-2052 --&gt;
&lt;/table&gt;</summary>
		<author><name>Daniel-schurhoff-de23@rwth-aachen.de</name></author>
	</entry>
	<entry>
		<id>https://hpc-wiki.info/hpc/index.php?title=NUMA&amp;diff=1114&amp;oldid=prev</id>
		<title>Stefan-erlbeck-05df@rwth-aachen.de at 13:44, 28 November 2018</title>
		<link rel="alternate" type="text/html" href="https://hpc-wiki.info/hpc/index.php?title=NUMA&amp;diff=1114&amp;oldid=prev"/>
		<updated>2018-11-28T13:44:03Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left diff-editfont-monospace&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 13:44, 28 November 2018&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l4&quot; &gt;Line 4:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 4:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== General ==&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== General ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[File:NUMA Architecture.png|thumb|300px|A sketch of the NUMA architecture showing the different sockets, their local memory and the interconnect]]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Desktop computers normally consist of one motherboard, one CPU die (with several cores) and one main memory. On the other hand, a NUMA system consists of one motherboard with several sockets and each socket holds one CPU die. Furthermore, each socket is local to a certain part of the main memory. A NUMA system is still a shared memory system, which means that every core on every socket can access each part of the main memory. However, accessing a non-local part of the main memory takes longer because a special interconnect has to be used (hence the name NUMA).&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Desktop computers normally consist of one motherboard, one CPU die (with several cores) and one main memory. On the other hand, a NUMA system consists of one motherboard with several sockets and each socket holds one CPU die. Furthermore, each socket is local to a certain part of the main memory. A NUMA system is still a shared memory system, which means that every core on every socket can access each part of the main memory. However, accessing a non-local part of the main memory takes longer because a special interconnect has to be used (hence the name NUMA).&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key hpc_wiki:diff::1.12:old-1110:rev-1114 --&gt;
&lt;/table&gt;</summary>
		<author><name>Stefan-erlbeck-05df@rwth-aachen.de</name></author>
	</entry>
	<entry>
		<id>https://hpc-wiki.info/hpc/index.php?title=NUMA&amp;diff=1110&amp;oldid=prev</id>
		<title>Stefan-erlbeck-05df@rwth-aachen.de at 11:07, 27 November 2018</title>
		<link rel="alternate" type="text/html" href="https://hpc-wiki.info/hpc/index.php?title=NUMA&amp;diff=1110&amp;oldid=prev"/>
		<updated>2018-11-27T11:07:28Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left diff-editfont-monospace&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 11:07, 27 November 2018&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot; &gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;−&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;NUMA (short for non-uniform memory access) is a memory architecture which is popular in HPC. A typical cluster consists of hundreds of nodes where each individual node is a NUMA-system.&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;NUMA (short for non-uniform memory access) is a memory architecture which is popular in HPC. A typical &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[[HPC-Dictionary#Cluster|&lt;/ins&gt;cluster&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]] &lt;/ins&gt;consists of hundreds of nodes where each individual node is a NUMA-system.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;__TOC__&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;__TOC__&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key hpc_wiki:diff::1.12:old-1109:rev-1110 --&gt;
&lt;/table&gt;</summary>
		<author><name>Stefan-erlbeck-05df@rwth-aachen.de</name></author>
	</entry>
	<entry>
		<id>https://hpc-wiki.info/hpc/index.php?title=NUMA&amp;diff=1109&amp;oldid=prev</id>
		<title>Stefan-erlbeck-05df@rwth-aachen.de: Created page with &quot;NUMA (short for non-uniform memory access) is a memory architecture which is popular in HPC. A typical cluster consists of hundreds of nodes where each individual node is a NU...&quot;</title>
		<link rel="alternate" type="text/html" href="https://hpc-wiki.info/hpc/index.php?title=NUMA&amp;diff=1109&amp;oldid=prev"/>
		<updated>2018-11-27T11:03:06Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;NUMA (short for non-uniform memory access) is a memory architecture which is popular in HPC. A typical cluster consists of hundreds of nodes where each individual node is a NU...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;NUMA (short for non-uniform memory access) is a memory architecture which is popular in HPC. A typical cluster consists of hundreds of nodes where each individual node is a NUMA-system.&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
== General ==&lt;br /&gt;
Desktop computers normally consist of one motherboard, one CPU die (with several cores) and one main memory. On the other hand, a NUMA system consists of one motherboard with several sockets and each socket holds one CPU die. Furthermore, each socket is local to a certain part of the main memory. A NUMA system is still a shared memory system, which means that every core on every socket can access each part of the main memory. However, accessing a non-local part of the main memory takes longer because a special interconnect has to be used (hence the name NUMA).&lt;br /&gt;
&lt;br /&gt;
== Advantages ==&lt;br /&gt;
NUMA has replaced the older SMP design (symmetric multiprocessing, sometimes also called UMA) in most HPC clusters for several reasons:&lt;br /&gt;
*NUMA systems can contain more CPU cores.&lt;br /&gt;
*NUMA systems can have a larger main memory.&lt;br /&gt;
*NUMA systems offer a higher aggregate memory bandwidth.&lt;br /&gt;
Without NUMA, there is only limited physical space close to the main memory, which restricts how many cores and how much memory a system can hold. In NUMA systems, the main memory is split into several parts, one per socket, which allows more cores and more main memory overall. The bandwidth advantage is an inherent property of the design: as long as all cores access only their local main memory, every socket can use its own memory at full bandwidth at the same time, so the total bandwidth grows with the number of sockets.&lt;br /&gt;
&lt;br /&gt;
== Pitfalls ==&lt;br /&gt;
Due to the more complex design of NUMA, there are two pitfalls for a programmer who is unaware of the NUMA architecture. Both can result in code that does not scale well, and both are caused by excessive use of the interconnect, which can become a severe bottleneck.&lt;br /&gt;
&lt;br /&gt;
The first problem is thread migration. In general, the operating system is allowed to move threads (or processes) between cores if it detects that the cores have different workloads. This can improve load balancing, but the migration itself has a cost, and on NUMA systems it is usually unwanted. If a thread that works on data in its core&amp;#039;s local main memory is moved to a different socket, it is separated from its data, so every subsequent access to that data has to go through the slower interconnect. The solution to this problem is [[Binding/Pinning|pinning]].&lt;br /&gt;
&lt;br /&gt;
The second problem is data placement. If the programmer allocates and initializes memory without regard to the NUMA architecture, for example from a single thread, all threads located on other sockets have to use the interconnect to reach it. The solution is the so-called &amp;#039;&amp;#039;first-touch policy&amp;#039;&amp;#039;: since the operating system is aware of the NUMA architecture, it places each memory page in the main memory local to the thread which first &amp;quot;touches&amp;quot; (writes to) it.&lt;br /&gt;
&lt;br /&gt;
The following example shows a simple NUMA-aware array addition:&lt;br /&gt;
 &amp;lt;syntaxhighlight lang=&amp;quot;c++&amp;quot;&amp;gt;&lt;br /&gt;
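 // Initialize in parallel: each thread first-touches its own chunk of a and b.&lt;br /&gt;
 #pragma omp parallel for schedule(static)&lt;br /&gt;
 for(int i = 0; i &amp;lt; N; i++)&lt;br /&gt;
 {&lt;br /&gt;
     a[i] = 0.0;&lt;br /&gt;
     b[i] = i;&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
 // Same static schedule, so each thread adds the chunk it initialized.&lt;br /&gt;
 #pragma omp parallel for schedule(static)&lt;br /&gt;
 for(int i = 0; i &amp;lt; N; i++)&lt;br /&gt;
 {&lt;br /&gt;
     a[i] = a[i] + b[i];&lt;br /&gt;
 }&lt;br /&gt;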
 &amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Assuming thread affinity (pinning) is set up correctly, the first loop distributes the arrays &amp;#039;&amp;#039;a&amp;#039;&amp;#039; and &amp;#039;&amp;#039;b&amp;#039;&amp;#039; across the different parts of the main memory such that each participating thread is local to its own chunk of data. As a result, the addition in the second loop only needs to access local data.&lt;br /&gt;
&lt;br /&gt;
== Further Reading ==&lt;br /&gt;
https://software.intel.com/en-us/articles/optimizing-applications-for-numa&lt;/div&gt;</summary>
		<author><name>Stefan-erlbeck-05df@rwth-aachen.de</name></author>
	</entry>
</feed>