GPU Tutorial/Open MP

From HPC Wiki
Revision as of 11:18, 3 January 2022

Tutorial
Title: Introduction to GPU Computing
Provider: HPC.NRW

Contact: tutorials@hpc.nrw
Type: Multi-part video
Topic Area: GPU computing
License: CC-BY-SA
Syllabus

1. Introduction
2. Several Ways to SAXPY: CUDA C/C++
3. Several Ways to SAXPY: OpenMP
4. Several Ways to SAXPY: Julia
5. Several Ways to SAXPY: NUMBA

This video discusses SAXPY via OpenMP GPU offloading. OpenMP 4.0 and later enables developers to program GPUs in C/C++ and Fortran by means of OpenMP directives. In this tutorial we present the basic OpenMP syntax for GPU offloading and give a step-by-step guide for implementing SAXPY with it.

Video

Quiz

1. Which one of the following OpenMP directives can create a target region on GPU?

`#pragma omp target gpu`
`#pragma omp target acc`
`#pragma omp target`


2. The OpenMP `map(to:...)` clause maps variables:

from host to device data environment before execution
from host to device data environment after execution
from device to host data environment before execution
from device to host data environment after execution


3. Which one of the following OpenMP directives can initialize a league of teams for execution on GPU?

`#pragma omp init teams`
`#pragma omp teams`
`#pragma omp gpu teams`


4. Which one of the following OpenMP directives can distribute the iterations of a for loop across the GPU threads in the teams?

`#pragma omp distribute for`
`#pragma omp parallel for`
`#pragma omp distribute parallel for`