What is/are the fastest memset() alternatives for OpenCL?

情到浓时终转凉″ submitted on 2019-12-01 21:39:20

You can use clEnqueueFillBuffer(), available since OpenCL 1.2. It does exactly what you need, and it is very flexible about the pattern used to fill the buffer.

Here is the doc page:

http://www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/clEnqueueFillBuffer.html

If you are on OpenCL 1.1 or below, you will have to resort to other approaches.

A great way to do this very fast (if you have extra memory available) is to keep a pre-sized, pre-initialized buffer on the device (such as one filled with all zeros) and then do an on-device copy from it any time you need to zero out the target buffer. In my experience this was much faster than any of the fill calls in OpenCL or CUDA. Obviously this is a special case, but it was much faster when I last tested it.
