How to create Large Bit array in cuda?

…衆ロ難τιáo~ 提交于 2019-12-24 08:49:23

问题


I need to keep track of around 10000 elements of an array in my algorithm .So for this I need boolean for each record.If I used char array to keep track of 10000 arrays (as 0/1),it would take up lot of memory.

So Can I create an bit array of 10000 bits in Cuda where each bit represents corresponding array record?


回答1:


As Roger said, the answer is yes, CUDA provides the same bitwise operations (i.e. >>, << and &) as normal C so you can implement your bit array essentially normally (almost, see thread synchronisation issues below).


However, for your situation it is almost certainly not a good idea.

There are problems with thread syncronisation. Imagine two of the threads on your GPU are inverting two bits of a single entry of your array. Each thread will read the same value out of memory, and apply their operation to it, but the thread that writes its value back to memory last will overwrite the result of the other thread. (Note: if your bit array is not being modified by the GPU code then this isn't a problem.)

And, unless this is explicitly required, you shouldn't be optimising for memory use, an array with 10K elements does not take much memory at all: even if you were storing each boolean in an 64 bit integer it's only 80 KB. And obviously you can store them in a smaller datatype. (You should only start worrying about compressing the array as much as possible when you are getting upwards of tens of millions, or even hundreds of millions of elements.)

Also, the way GPUs work means that you might get best performance by using a reasonably large data type for each boolean (most likely a 32 bit one) so that, for example, memory coalescing works better. (I haven't tested this assertion, you would need to run some benchmarks to check it.)



来源:https://stackoverflow.com/questions/11042816/how-to-create-large-bit-array-in-cuda

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!