SIMD-8, SIMD-16 or SIMD-32 in OpenCL on GPGPU
I have read a couple of questions on SO about this topic (SIMD mode), but I still need some clarification/confirmation of how things work: Why use SIMD if we have GPGPU? SIMD intrinsics - are they usable on GPUs? CPU SIMD vs GPU SIMD?

Are the following points correct if I compile the code in SIMD-8 mode?

1) It means 8 instructions from different work items are executed in parallel.
2) Does it mean all work items execute the same instruction only?
3) If each work item's code contains only a vload16 load, then float16 operations, and then vstore16 operations, SIMD-8 mode will still work. I mean