GPU Tutorial/SAXPY CUDA C
{{hidden begin
|title = 1. Which features does CUDA add to C/C++?}}
<quiz display=simple>
{
|type="()"}
- new functions
|| CUDA does not add only new functions; it adds all of these features.
- new syntax
|| CUDA does not add only new syntax; it adds all of these features.
- GPU support
|| CUDA does not add only GPU support; it adds all of these features.
+ All of these features
|| Correct
</quiz>
{{hidden end}}
{{hidden begin
|title = 3. How do you flag a function to be a kernel?}}
<quiz display=simple>
{
|type="()"}
- __host__
|| Wrong. This specifies a function that runs on the CPU.
- __device__
|| Wrong. This does specify a function that runs on the GPU, but such a function must also be called from GPU code, while a kernel is launched by the CPU.
+ __global__
|| Correct
- __GPU__
|| Wrong. This qualifier doesn't exist.
</quiz>
{{hidden end}}
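As a short sketch of how the three qualifiers from this question combine in practice (the function names here are illustrative, not from the video):

```cuda
#include <cstdio>

// __device__: runs on the GPU and is callable only from GPU code.
__device__ float axpy(float a, float x, float y) {
    return a * x + y;
}

// __global__: a kernel -- runs on the GPU, but is launched from the CPU.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = axpy(a, x[i], y[i]);
}

// __host__ (the default for unqualified functions): runs on the CPU.
__host__ void report(int n) {
    printf("launched saxpy over %d elements\n", n);
}
```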
{{hidden begin
|title = 4. Let's say you coded your kernel function called "MyKernel". How do you run it?}}
<quiz display=simple>
{
|type="()"}
- MyKernel();
|| Wrong. This would just call it like an ordinary function.
- CUDA.run(NoBlocks, NoThreads, MyKernel());
|| Wrong. There is no CUDA.run().
+ MyKernel<<<NoBlocks, NoThreads>>>();
|| Correct
- __global(NoBlocks, NoThreads)__ MyKernel();
|| Wrong. __global__ and the other qualifiers take no arguments and belong to the function definition, not to the launch.
</quiz>
{{hidden end}}
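A minimal launch sketch, reusing the quiz's `MyKernel`, `NoBlocks`, and `NoThreads` names (the concrete values are illustrative): the execution configuration in triple angle brackets goes between the kernel name and its argument list.

```cuda
__global__ void MyKernel() { /* kernel body */ }

int main() {
    int NoBlocks = 2, NoThreads = 128;     // execution configuration
    MyKernel<<<NoBlocks, NoThreads>>>();   // launch 2 blocks x 128 threads
    cudaDeviceSynchronize();               // kernel launches are asynchronous
    return 0;
}
```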
{{hidden begin
|title = 5. Inside your kernel function, how do you distribute your data over the GPU threads?}}
<quiz display=simple>
{
|type="()"}
+ Each thread has an index attached to it, which is addressed via threadIdx.x
|| Correct
- If you use array-element-wise operations, e.g. y .= a .* x .+ b; these are managed by the NVIDIA preprocessor.
|| Wrong. There are no element-wise operators in C/C++.
- You flag a line to be parallelized via keywords, e.g. __device__ y=a*x+b
|| Wrong. These qualifiers are used in function definitions, not on individual statements.
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
</quiz>
{{hidden end}}
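The `threadIdx.x` pattern from question 5 typically looks like this in a kernel: each thread computes its unique global index and handles exactly one array element.

```cuda
__global__ void saxpy(int n, float a, const float *x, float *y) {
    // Unique global index: block offset plus thread offset within the block.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                  // guard: the grid may hold more threads than n
        y[i] = a * x[i] + y[i];
}
```

A matching launch would round the block count up, e.g. `saxpy<<<(n + 255) / 256, 256>>>(n, a, x, y);`, so that every element is covered.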
Latest revision as of 11:17, 3 January 2022
| Tutorial | |
|---|---|
| Title: | Introduction to GPU Computing |
| Provider: | HPC.NRW |
| Contact: | tutorials@hpc.nrw |
| Type: | Multi-part video |
| Topic Area: | GPU computing |
| License: | CC-BY-SA |
Syllabus

1. Introduction
2. Several Ways to SAXPY: CUDA C/C++
3. Several Ways to SAXPY: OpenMP
4. Several Ways to SAXPY: Julia
5. Several Ways to SAXPY: NUMBA
This video discusses the SAXPY via NVIDIA CUDA C/C++. CUDA is an application programming interface (API) for NVIDIA GPUs. In general, CUDA works with many programming languages, but this tutorial focuses on C/C++. CUDA gives access to a GPU's instruction set, which means we have to go through everything step by step, since many things do not happen automatically.
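Putting the pieces together, a minimal CUDA C SAXPY along the lines discussed in the video might look as follows. This is a sketch, assuming an NVIDIA GPU and the CUDA toolkit (`nvcc`); it uses unified memory to stay short, though explicit `cudaMalloc`/`cudaMemcpy` would work just as well.

```cuda
#include <cstdio>

// SAXPY kernel: y = a*x + y, one element per GPU thread.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    const float a = 2.0f;
    float *x, *y;

    // Unified memory is accessible from both CPU and GPU.
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    // Launch with enough blocks of 256 threads to cover all n elements.
    saxpy<<<(n + 255) / 256, 256>>>(n, a, x, y);
    cudaDeviceSynchronize();

    printf("y[0] = %f\n", y[0]);  // expect 4.0 = 2.0 * 1.0 + 2.0
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```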
Video
Quiz
1. Which features does CUDA add to C/C++?
2. What is a kernel?
3. How do you flag a function to be a kernel?
4. Let's say you coded your kernel function called "MyKernel". How do you run it?
5. Inside your kernel function, how do you distribute your data over the GPU threads?