memcpy moving 128 bit in linux
问题 I'm writing a device driver in linux for a PCIe device. This device driver performs several read and write to test the throughput. When I use the memcpy, the maximum payload for a TLP is 8 bytes ( on 64 bits architectures ). In my opinion the only way to get a payload of 16 bytes is to use the SSE instruction set. I've already seen this but the code doesn't compile ( AT&T/Intel syntax issue ). There is a way to use that code inside linux ? Does anyone know where I can found an implementation