I have an OpenCL 1.1 search algorithm that works well with a small amount of data:
1.) build the inputData array and pass it to the GPU
2.) c
You have not said explicitly that you are using Windows, but I assume so because your question carries the VS2013 tag.
The NVIDIA card does not crash. On Windows, the WDDM driver has Timeout Detection & Recovery (TDR), which restarts the GPU driver if it becomes unresponsive. You can easily disable this "feature" with Nsight. Be aware, however, that this may cause problems with your desktop environment, so make sure your kernel finishes in a tolerable amount of time. Then you can run even very long kernels on Windows with NVIDIA's OpenCL implementation.
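If you would rather leave TDR enabled, one common way to keep each kernel launch short is to split the global work range into several smaller launches. The following is only a minimal sketch of the host-side arithmetic, assuming a 1-D range; `chunk_t`, `chunk_count` and `chunk_at` are hypothetical helper names, not part of OpenCL, and the right chunk size for your card is something you would have to find by measurement.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: one slice of a 1-D global work range. */
typedef struct {
    size_t offset; /* would be passed as global_work_offset */
    size_t size;   /* would be passed as global_work_size   */
} chunk_t;

/* Number of launches needed to cover `total` work-items
 * with at most `chunk_size` work-items per launch. */
size_t chunk_count(size_t total, size_t chunk_size)
{
    if (chunk_size == 0)
        return 0;
    return total / chunk_size + (total % chunk_size != 0 ? 1 : 0);
}

/* Offset and size of launch number `index` (0-based).
 * The last chunk is shortened so it does not run past `total`. */
chunk_t chunk_at(size_t total, size_t chunk_size, size_t index)
{
    chunk_t c;
    assert(index < chunk_count(total, chunk_size));
    c.offset = index * chunk_size;
    c.size = (total - c.offset < chunk_size) ? (total - c.offset) : chunk_size;
    return c;
}

/* Assumed usage on the host (queue and kernel are placeholders for
 * your own cl_command_queue and cl_kernel):
 *
 *   for (size_t i = 0; i < chunk_count(total, chunk_size); ++i) {
 *       chunk_t c = chunk_at(total, chunk_size, i);
 *       clEnqueueNDRangeKernel(queue, kernel, 1, &c.offset, &c.size,
 *                              NULL, 0, NULL, NULL);
 *       clFinish(queue); // let each slice finish before the next one
 *   }
 */
```

With a non-NULL `global_work_offset` (allowed since OpenCL 1.1), `get_global_id(0)` inside the kernel already includes the offset, so the kernel code itself usually does not need to change.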