I\'m trying to figure out an efficient way to load compile time constant floats into SSE(2/3) registers. I\'ve tried doing simple code like this,
const __m128 x
Normally constants such as this would be loaded prior to any loops or "hot" parts of the code, so performance should not be that important. But if you can't avoid doing this kind of thing inside a loop then I would try _mm_set_ps
first and see what that generates. Also try ICC rather than gcc, as it tends to generate better code.