Is there a limit to OpenCL local memory?

渐次进展 2021-01-02 04:58

Today I added four more __local variables to my kernel to dump intermediate results in. But just adding the four more variables to the kernel's signature and a…

3 Answers
  •  遥遥无期
    2021-01-02 05:49

    Of course there is, since local memory is physical rather than virtual.

    Working with a virtual address space on CPUs, we are used to having, in theory, as much memory as we want - potentially failing only at very large sizes when the paging file / swap partition runs out, or perhaps not even then, but only once we actually try to use more memory than can be mapped to the physical RAM and the disk.

    This is not the case for things like a computer's OS kernel (or its lower-level parts), which need to access specific areas of actual RAM.

    It is also not the case for GPU global and local memory. There is no* memory paging (remapping of perceived thread addresses to physical memory addresses) and no swapping. Specifically regarding local memory, every compute unit (= every streaming multiprocessor on a GPU) has a bank of RAM used as local memory; the green slabs here:

    [Diagram: GPU compute units, each with its own block of local memory (the green slabs)]

    The size of each such slab is what you get with

        clGetDeviceInfo( · , CL_DEVICE_LOCAL_MEM_SIZE, · , · , · ).
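
    For example, a minimal, self-contained query might look like the following (a sketch with error checking omitted; the variable names are my own):

        #define CL_TARGET_OPENCL_VERSION 120
        #include <stdio.h>
        #include <CL/cl.h>

        int main(void) {
            cl_platform_id platform;
            cl_device_id device;
            cl_ulong local_mem_size = 0;

            /* Grab the first platform and the first GPU device on it. */
            clGetPlatformIDs(1, &platform, NULL);
            clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

            /* CL_DEVICE_LOCAL_MEM_SIZE reports the local memory available
               to a single work-group / compute unit, in bytes. */
            clGetDeviceInfo(device, CL_DEVICE_LOCAL_MEM_SIZE,
                            sizeof(local_mem_size), &local_mem_size, NULL);

            printf("Local memory per compute unit: %lu bytes\n",
                   (unsigned long)local_mem_size);
            return 0;
        }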

    To illustrate, on nVIDIA Kepler GPUs, the local memory size is either 16 KBytes or 48 KBytes (and the complement to 64 KBytes is used for caching accesses to Global Memory). So, as of today, GPU local memory is very small relative to the global device memory.
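
    To see how much of that budget a particular kernel actually consumes (e.g. after adding extra __local variables, as in the question), you can ask the runtime per kernel with clGetKernelWorkGroupInfo and CL_KERNEL_LOCAL_MEM_SIZE. The following is a rough sketch under my own assumptions - the kernel name "dump" and the two 256-element scratch arrays are made up for illustration, and error checking is omitted:

        #define CL_TARGET_OPENCL_VERSION 120
        #include <stdio.h>
        #include <CL/cl.h>

        /* A toy kernel with two statically sized __local scratch buffers. */
        static const char *src =
            "__kernel void dump(__global float *out) {\n"
            "    __local float scratch_a[256];\n"
            "    __local float scratch_b[256];\n"
            "    int lid = get_local_id(0);\n"
            "    scratch_a[lid] = (float)lid;\n"
            "    scratch_b[lid] = 2.0f * lid;\n"
            "    barrier(CLK_LOCAL_MEM_FENCE);\n"
            "    out[get_global_id(0)] = scratch_a[lid] + scratch_b[lid];\n"
            "}\n";

        int main(void) {
            cl_platform_id platform;
            cl_device_id device;
            clGetPlatformIDs(1, &platform, NULL);
            clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

            cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);
            cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
            clBuildProgram(prog, 1, &device, NULL, NULL, NULL);
            cl_kernel kernel = clCreateKernel(prog, "dump", NULL);

            /* Local memory this kernel needs on this device: the static
               __local arrays above plus any __local arguments passed later
               via clSetKernelArg(kernel, i, size, NULL). */
            cl_ulong used = 0;
            clGetKernelWorkGroupInfo(kernel, device, CL_KERNEL_LOCAL_MEM_SIZE,
                                     sizeof(used), &used, NULL);

            cl_ulong available = 0;
            clGetDeviceInfo(device, CL_DEVICE_LOCAL_MEM_SIZE,
                            sizeof(available), &available, NULL);

            printf("Kernel uses %lu of %lu bytes of local memory\n",
                   (unsigned long)used, (unsigned long)available);

            clReleaseKernel(kernel);
            clReleaseProgram(prog);
            clReleaseContext(ctx);
            return 0;
        }

    If a kernel's total local-memory demand exceeds CL_DEVICE_LOCAL_MEM_SIZE, building or enqueueing it will typically fail (e.g. with CL_OUT_OF_RESOURCES) rather than spilling anywhere else - there is no swap to fall back on.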


    * On nVIDIA GPUs, beginning with the Pascal architecture, demand paging of global memory is supported; but that's not the common way of using device memory.
