Generalized Hough Transform in CUDA - How can I speed up the binning process?
Question

Like the title says, I'm doing some personal research into parallel computer vision techniques. Using CUDA, I am trying to implement a GPGPU version of the Hough transform. The only problem I've encountered is in the voting process. I'm calling atomicAdd() to prevent multiple simultaneous write operations, and I don't seem to be gaining much performance. I've searched the web, but haven't found any way to noticeably enhance the performance of the voting
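
For illustration, here is a minimal sketch of the voting step described above, next to one commonly suggested variant that accumulates votes in per-block shared memory before merging them into the global accumulator. This is not the code from the question: the kernel names, the `runVotes` helper, the `NUM_BINS` size, and the idea that each vote is already reduced to a single bin index are all assumptions made for the example. The shared-memory variant only applies when the accumulator (or a tile of it) fits in shared memory, and whether it is actually faster depends on the accumulator size and how the votes are distributed; nothing here has been benchmarked.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Assumed accumulator size: small enough to fit in shared memory (4 KB here).
constexpr int NUM_BINS = 1024;

// Baseline: every vote is an atomicAdd on the global accumulator.
__global__ void voteGlobal(const int* binIdx, int n, unsigned int* acc) {
    int stride = gridDim.x * blockDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        atomicAdd(&acc[binIdx[i]], 1u);
}

// Variant: each block votes into its own shared-memory copy, then merges
// that copy into the global accumulator once per bin.
__global__ void voteShared(const int* binIdx, int n, unsigned int* acc) {
    __shared__ unsigned int local[NUM_BINS];
    for (int b = threadIdx.x; b < NUM_BINS; b += blockDim.x) local[b] = 0;
    __syncthreads();

    int stride = gridDim.x * blockDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        atomicAdd(&local[binIdx[i]], 1u);   // contention stays inside the block
    __syncthreads();

    for (int b = threadIdx.x; b < NUM_BINS; b += blockDim.x)
        if (local[b] != 0) atomicAdd(&acc[b], local[b]);
}

// Hypothetical host helper: runs one of the kernels and returns the accumulator.
// Every entry of bins must lie in [0, NUM_BINS).
std::vector<unsigned int> runVotes(const std::vector<int>& bins, bool useShared) {
    int n = static_cast<int>(bins.size());
    int* dBins = nullptr;
    unsigned int* dAcc = nullptr;
    cudaMalloc(&dBins, n * sizeof(int));
    cudaMalloc(&dAcc, NUM_BINS * sizeof(unsigned int));
    cudaMemcpy(dBins, bins.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(dAcc, 0, NUM_BINS * sizeof(unsigned int));

    const int threads = 256, blocks = 64;   // arbitrary launch configuration
    if (useShared) voteShared<<<blocks, threads>>>(dBins, n, dAcc);
    else           voteGlobal<<<blocks, threads>>>(dBins, n, dAcc);

    std::vector<unsigned int> acc(NUM_BINS);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(acc.data(), dAcc, NUM_BINS * sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    cudaFree(dBins);
    cudaFree(dAcc);
    return acc;
}

int main() {
    // Synthetic votes, heavily skewed toward a few bins to provoke contention.
    std::vector<int> bins(1 << 20);
    for (size_t i = 0; i < bins.size(); ++i)
        bins[i] = (i % 4 == 0) ? static_cast<int>(i % NUM_BINS) : 7;

    bool same = runVotes(bins, false) == runVotes(bins, true);
    std::printf("accumulators %s\n", same ? "match" : "differ");
    return same ? 0 : 1;
}
```

Both kernels produce the same accumulator; the difference is only where the atomic traffic lands. For a generalized Hough accumulator that is too large for shared memory, the same idea would need to be applied per tile of the parameter space, which this sketch does not attempt.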