Programming GPUs differs from programming CPUs, largely because of their massively parallel architecture. There are two general paradigms for GPU programming:
- Graphics programming, which is self-explanatory.
- General-purpose computing on GPUs (GPGPU), which covers everything else.
There is a general workflow that holds regardless of GPU architecture or task (note how similar these steps are to what happens when training a deep learning model with CUDA):
- Allocate memory on the GPU.
- Copy data from the CPU to the GPU.
- Run the kernel, compute.
- Copy data back to the CPU, and free GPU memory.
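The steps above can be sketched in CUDA as a minimal vector-addition program. The kernel name `vector_add` and the buffer names are illustrative choices, not part of any particular API; the runtime calls (`cudaMalloc`, `cudaMemcpy`, `cudaFree`) are the standard CUDA runtime functions for each step:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// The kernel: each GPU thread computes one element of c = a + b.
__global__ void vector_add(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Host (CPU) buffers.
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // 1. Allocate memory on the GPU.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);

    // 2. Copy data from the CPU to the GPU.
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // 3. Run the kernel: enough 256-thread blocks to cover n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vector_add<<<blocks, threads>>>(d_a, d_b, d_c, n);

    // 4. Copy the result back to the CPU, then free GPU memory.
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);

    printf("c[0] = %f\n", h_c[0]);
    free(h_a);
    free(h_b);
    free(h_c);
    return 0;
}
```

Note that the host and device have separate address spaces, which is why the explicit copies in steps 2 and 4 are needed at all.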
We generally use a vendor-supplied compiler to compile our programs into GPU-specific machine code. For Nvidia GPUs, that compiler is NVCC.
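As a sketch of how compilation looks in practice (the source filename `vector_add.cu` is an assumption for illustration; `.cu` is the conventional extension NVCC treats as CUDA source):

```
nvcc -o vector_add vector_add.cu
./vector_add
```

NVCC splits the source into host code, which it hands to a regular C++ compiler, and device code, which it compiles for the GPU.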