course: Fundamentals of GPU Programming

teaching methods:
video streaming, Moodle
responsible person:
Prof. Dr. Ralf Peter Brinkmann
Dr. Denis Eremin (ETIT)
offered in:
winter term

dates in winter term

  • start: Thursday, 14.10.2021
  • lecture: Thursdays from 14:15 to 15:45 in ID 04/653


Form of exam: project
Registration for exam: FlexNow
continual assessment


The students know how to program graphics processing units (GPUs).


Since around 2003, computational performance has increased not by raising the processor clock frequency, but by increasing the number of computational cores on the processor chip. Graphics processing units (GPUs) are the champions of this hardware evolution, with up to tens of thousands of individual cores. At the same time, the GPU memory system is less constrained by compatibility requirements with older generations than CPU memory systems are. As a result, GPUs arguably deliver far better raw performance of the arithmetic units and the memory system than their older "siblings", central processing units (CPUs). Although originally designed for video processing tasks, modern GPUs are now commonly used to assist CPUs, or to take the leading role, in solving a large variety of computational problems with (massively) parallelizable sections, bringing teraflops-scale high-performance computing to laptop and desktop computers.

The present course shows how CUDA C (an extension of the C language designed for GPU programming) and the corresponding, very flexible CUDA runtime API can be used to accelerate the execution of some typical programming patterns by a factor of 10 or more compared with execution on a CPU. Starting from the CUDA Programming Model, the course proceeds to the CUDA Execution Model and then to the fundamental conceptual, software, and hardware issues that help to understand how GPUs work. Case studies of several problems involving massively parallel algorithms implemented on GPUs will also be discussed.

The theoretical knowledge given in the lectures will be reinforced by a large number of practical examples that students can work on at home.



recommended knowledge

C (programming language)


The course will be taught in class