Grid of Thread Blocks
In a previous article, I gave
an introduction to programming with CUDA. Now I’d like to go into a little bit more depth about the CUDA thread execution model and the architecture of a CUDA enabled GPU. I assume that the reader has basic knowledge about CUDA and already knows how to setup a project that uses the CUDA runtime API. If you don’t know how to setup a project with CUDA, you can refer to my previous article:
Introduction to CUDA.
Continue reading →