Are GPU/CUDA cores SIMD ones?
Let's take the nVidia Fermi Compute Architecture . It says: The first Fermi based GPU, implemented with 3.0 billion transistors, features up to 512 CUDA cores. A CUDA core executes a floating point or integer instruction per clock for a thread. The 512 CUDA cores are organized in 16 SMs of 32 cores each. [...] Each CUDA processor has a fully pipelined integer arithmetic logic unit (ALU) and floating point unit (FPU). [...] In Fermi, the newly designed integer ALU supports full 32-bit precision for all instructions, consistent with standard programming language requirements. The integer ALU is