CUDA for Wave Simulation - Part 3: Efficient CUDA
Non-trivial CUDA
The current code launches the CUDA kernel with cuda_step<<<1, 1>>>. This uses only one CUDA thread and is probably extremely inefficient. To verify this, let's add some way to time the program. I could use NVIDIA's nsys, or just the...
Aug 31, 2025 · 8 min read
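To make the cost of a <<<1, 1>>> launch concrete, here is a minimal sketch that times the same kernel under a single-thread launch and a many-thread launch. It is not the post's code: `step_kernel`, `blocks_for`, and `time_launch` are hypothetical names, the kernel body is a placeholder update rather than a wave step, and timing with CUDA events is one possible approach chosen for this sketch.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for cuda_step (the real signature is not shown in the post).
// The grid-stride loop produces the same result for any launch configuration,
// so <<<1, 1>>> and <<<blocks, threads>>> can be compared directly.
__global__ void step_kernel(float *u, int n)
{
    int stride = blockDim.x * gridDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
        u[i] = 0.5f * u[i] + 1.0f;  // placeholder update, not a wave equation
    }
}

// Number of blocks needed to cover n elements (ceiling division).
int blocks_for(int n, int threads)
{
    return (n + threads - 1) / threads;
}

// Time a single kernel launch in milliseconds using CUDA events.
float time_launch(float *d_u, int n, int blocks, int threads)
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    step_kernel<<<blocks, threads>>>(d_u, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // wait until the kernel has finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main()
{
    const int n = 1 << 20;  // assumed problem size for illustration
    const int threads = 256;
    float *d_u = nullptr;

    if (cudaMalloc(&d_u, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_u, 0, n * sizeof(float));

    // Warm-up launch so one-time startup cost is not counted in either timing.
    time_launch(d_u, n, blocks_for(n, threads), threads);

    float ms_single = time_launch(d_u, n, 1, 1);
    float ms_many = time_launch(d_u, n, blocks_for(n, threads), threads);

    printf("<<<1, 1>>>: %.3f ms\n", ms_single);
    printf("<<<%d, %d>>>: %.3f ms\n", blocks_for(n, threads), threads, ms_many);

    cudaFree(d_u);
    return 0;
}
```

Compiled with `nvcc`, this prints the two timings side by side. The actual numbers depend on the GPU, so none are quoted here.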