For XMM/YMM FP operations on Intel Haswell, can FMA be used in place of ADD?
Question: This concerns packed, single-precision floating-point operations with XMM/YMM registers on Haswell. According to the awesome, awesome table put together by Agner Fog, I know that MUL can execute on either port p0 or p1 (with a reciprocal throughput of 0.5), while ADD executes only on port p1 (with a reciprocal throughput of 1). I can accept this limitation, BUT I also know that FMA can execute on either port p0 or p1 (with a reciprocal throughput of 0.5). So it is confusing to me why a plain ADD would be limited to only