Generating a 64-byte read PCIe TLP from an x86 CPU
Question:

When writing data to a PCIe device, it is possible to use a write-combining mapping to hint to the CPU that it should generate 64-byte TLPs towards the device. Is it possible to do something similar for reads, i.e. somehow hint to the CPU that it should read an entire cache line or a larger buffer instead of one word at a time?

Answer 1:

Intel has a white paper on copying from video RAM to main memory; this should be similar but a lot simpler (because the data fits in 2 or 4 vector registers). It says that NT loads will pull a whole cache-line of
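
As a rough illustration of the NT-load idea mentioned in the answer, here is a minimal C sketch. It assumes GCC or Clang on x86 with SSE4.1 and a 64-byte-aligned source; the function name `read_line_nt` is hypothetical and not taken from the white paper. On a write-combining mapping of device memory, `MOVNTDQA` (exposed as `_mm_stream_load_si128`) may let the CPU fetch the whole line through a streaming load buffer, but whether that turns into a single 64-byte read TLP depends on the CPU and chipset and is not guaranteed. On ordinary write-back memory the same code simply behaves like regular aligned loads.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: copy one 64-byte line from `src` to `dst` using
 * four 16-byte non-temporal loads (MOVNTDQA).
 * Assumptions: `src` is 64-byte aligned and, for the intended effect,
 * lives in a write-combining mapping; `dst` is ordinary memory and may
 * be unaligned. */
__attribute__((target("sse4.1")))
void read_line_nt(const void *src, void *dst)
{
    __m128i *s = (__m128i *)src;   /* the intrinsic takes a non-const pointer on some compilers */
    __m128i *d = (__m128i *)dst;

    /* Load the whole line into four vector registers first. The compiler
     * is still free to reorder these with the stores below. */
    __m128i v0 = _mm_stream_load_si128(s + 0);
    __m128i v1 = _mm_stream_load_si128(s + 1);
    __m128i v2 = _mm_stream_load_si128(s + 2);
    __m128i v3 = _mm_stream_load_si128(s + 3);

    /* Then write the 64 bytes out with ordinary unaligned stores. */
    _mm_storeu_si128(d + 0, v0);
    _mm_storeu_si128(d + 1, v1);
    _mm_storeu_si128(d + 2, v2);
    _mm_storeu_si128(d + 3, v3);
}
```

A caller would pass a pointer into the WC-mapped BAR as `src` and a local 64-byte buffer as `dst`. The only way to confirm what actually goes out on the link is to measure on the target platform, for example with a PCIe analyzer or device-side counters.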