Non-Uniform Memory Access (NUMA) Effects in OpenMP
Revision as of 15:53, 4 December 2020 by Marc-andre-hermanns-bc32@rwth-aachen.de (talk | contribs) (Add DISPLAYTITLE)
Tutorial | |
---|---|
Title: | OpenMP in Small Bites |
Provider: | HPC.NRW |
Contact: | tutorials@hpc.nrw |
Type: | Multi-part video |
Topic Area: | Programming Paradigms |
License: | CC-BY-SA |
Syllabus
| |
1. Overview | |
2. Worksharing | |
3. Data Scoping | |
4. False Sharing | |
5. Tasking | |
6. Tasking and Data Scoping | |
7. Tasking and Synchronization | |
8. Loops and Tasks | |
9. Tasking Example: Sudoku Solver | |
10. Task Scheduling | |
11. Non-Uniform Memory Access |
This video shows how a non-uniform memory access (NUMA) architecture influences the performance of OpenMP programs. It explains how to distribute data and threads across NUMA domains and how to avoid uncontrolled data or thread migration.
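As a rough illustration of distributing data across NUMA domains, the following minimal sketch initializes an array in parallel. It assumes the operating system uses a first-touch page placement policy (the usual default on Linux), so that each memory page ends up in the NUMA domain of the thread that first writes to it. The function names `numa_alloc_init` and `numa_sum` are hypothetical and chosen only for this example; they do not come from the video.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n doubles and initialize them in parallel.
 * Assumption: with a first-touch policy, each page is placed in the NUMA
 * domain of the thread that writes to it first. malloc() alone typically
 * does not place the pages; the first write does. */
double *numa_alloc_init(size_t n, double value)
{
    double *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;

    /* schedule(static) gives each thread one fixed, contiguous chunk. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; ++i)
        a[i] = value;

    return a;
}

/* Hypothetical helper: sum the array with the same static schedule, so that
 * (with the same thread count) each thread reads the chunk it initialized. */
double numa_sum(const double *a, size_t n)
{
    double sum = 0.0;

    #pragma omp parallel for schedule(static) reduction(+:sum)
    for (long i = 0; i < (long)n; ++i)
        sum += a[i];

    return sum;
}
```

The data placement only pays off if threads stay where they were when they touched the data. One way to request this is through the standard OpenMP environment variables `OMP_PLACES` (for example `cores`) and `OMP_PROC_BIND` (for example `close` or `spread`), set before starting the program. Compiling without OpenMP support (for example without `-fopenmp` in GCC) still yields a correct serial program, because the pragmas are then ignored.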
Video
Quiz
1. Why is it important to initialize your data in parallel when executing on a NUMA architecture?
2. Why is it important to bind the threads?
3. Given a NUMA architecture with two sockets of six cores each: how can you place the threads of an OpenMP program running with 4 threads across both sockets and bind each thread to a core?