GPU Tutorial/SAXPY CUDA C

From HPC Wiki

Revision as of 17:15, 10 November 2021

Tutorial
Title: Introduction to GPU Computing
Provider: HPC.NRW

Contact: tutorials@hpc.nrw
Type: Multi-part video
Topic Area: GPU computing
License: CC-BY-SA
Syllabus

1. Introduction
2. Several Ways to SAXPY: CUDA C/C++
3. Several Ways to SAXPY: OpenMP
4. Several Ways to SAXPY: Julia
5. Several Ways to SAXPY: NUMBA

This video discusses SAXPY implemented with NVIDIA CUDA C/C++.
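For reference, a minimal CUDA C SAXPY along the lines the video covers might look like the sketch below. The kernel name, launch parameters, and problem size are illustrative choices, not the exact code from the slides; the explicit `cudaMemcpy` calls are the host-to-device and device-to-host transfers that quiz question 3 refers to as overhead.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// SAXPY: y = a*x + y (single precision). One thread per element.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                      // guard threads past the array end
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    const int n = 1 << 20;          // illustrative problem size
    size_t bytes = n * sizeof(float);

    float *x = (float *)malloc(bytes);
    float *y = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    // Allocate device memory and copy the inputs over
    // (the transfer overhead mentioned in the quiz).
    float *d_x, *d_y;
    cudaMalloc(&d_x, bytes);
    cudaMalloc(&d_y, bytes);
    cudaMemcpy(d_x, x, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    saxpy<<<blocks, threads>>>(n, 2.0f, d_x, d_y);

    // Copy the result back to the host (more transfer overhead).
    cudaMemcpy(y, d_y, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", y[0]);    // 2*1 + 2 = 4

    cudaFree(d_x); cudaFree(d_y);
    free(x); free(y);
    return 0;
}
```

Note that for a single small kernel like this, the two `cudaMemcpy` calls can easily dominate the total runtime, which is why a GPU SAXPY is not necessarily faster than the CPU version at small sizes.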

Video

(Slides as pdf)

Quiz

1. For which kind of program can we expect improvements with GPUs?

serial programs
parallel programs


2. What does GPU stand for?

graphics processing unit
grand powerful unit


3. Why do we expect an overhead in the GPU timings?

The data must first be copied to a separate device and later transferred back
A GPU core is "weaker" than a CPU core
For "small" problems like the SAXPY, the whole power of a GPU is rarely used
All of the above