cuda-streams

Concurrent, unique kernels on the same multiprocessor?

Posted by 我的梦境 on 2019-12-02 07:49:03
Question: Is it possible, using streams, to have multiple unique kernels on the same streaming multiprocessor in Kepler 3.5 GPUs? I.e., can 30 kernels of size <<<1,1024>>> run at the same time on a Kepler GPU with 15 SMs?

Answer 1: On a compute capability 3.5 device, it might be possible. Those devices support up to 32 concurrent kernels per GPU and 2048 threads per multiprocessor. With 64k registers per multiprocessor, two blocks of 1024 threads could run concurrently if their register footprint was less
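
The resource arithmetic behind that answer can be sketched on the host side. This is a rough upper bound only, using just the two per-multiprocessor limits quoted above (2048 threads and 64k registers); the names `SmLimits`, `kKepler35`, `resident_blocks_per_sm`, and `resident_blocks_on_gpu` are hypothetical helpers introduced for illustration, and the sketch assumes no shared-memory pressure and ignores register allocation granularity. It is plain C++ and should also compile as CUDA host code.

```cpp
#include <algorithm>

// Per-multiprocessor limits, taken from the figures quoted in the answer.
struct SmLimits {
    int max_threads;    // resident threads per multiprocessor
    int max_registers;  // 32-bit registers per multiprocessor
};

// Compute capability 3.5 values as quoted: 2048 threads, 64k registers.
const SmLimits kKepler35 = {2048, 64 * 1024};

// Upper bound on how many blocks can be resident on one multiprocessor,
// considering only the thread limit and the register limit.
// Assumption: shared memory and allocation granularity are ignored.
inline int resident_blocks_per_sm(const SmLimits& sm,
                                  int threads_per_block,
                                  int regs_per_thread) {
    int by_threads = sm.max_threads / threads_per_block;
    int by_registers = sm.max_registers / (threads_per_block * regs_per_thread);
    return std::min(by_threads, by_registers);
}

// Same bound scaled to the whole GPU.
inline int resident_blocks_on_gpu(const SmLimits& sm, int num_sms,
                                  int threads_per_block,
                                  int regs_per_thread) {
    return num_sms * resident_blocks_per_sm(sm, threads_per_block, regs_per_thread);
}
```

Under these assumptions, `resident_blocks_per_sm(kKepler35, 1024, 32)` evaluates to 2, so 15 SMs could in principle hold the 30 single-block kernels from the question, which is also within the limit of 32 concurrent kernels. A kernel compiled to 33 registers per thread would drop the bound to one block per multiprocessor. Whether the hardware actually schedules the kernels this way is a separate matter that this arithmetic does not settle.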