CUDA - how much slower is transferring over PCI-E?

Posted by 十年热恋 on 2019-11-27 18:52:33

Question


If I transfer a single byte from a CUDA kernel over PCI-E to the host (zero-copy memory), how slow is that compared to transferring something like 200 megabytes?

Since I know that transferring over PCI-E is slow for a CUDA kernel, what I would like to know is: does it make any difference whether I transfer a single byte or a huge amount of data? Or, since memory transfers are performed in bulk, is transferring a single byte extremely expensive and wasteful compared to transferring 200 MB?


Answer 1:


I hope this figure explains everything. The data was generated by bandwidthTest from the CUDA samples. The hardware environment is PCI-E v2.0, a Tesla M2090 and 2x Xeon E5-2609. Please note that both axes are in log scale.

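From this figure, we can see that launching a transfer request incurs a constant overhead. Regression analysis on the data gives an estimated overhead of 4.9 us for H2D (host to device), 3.3 us for D2H (device to host) and 3.0 us for D2D (device to device). A single-byte transfer therefore costs roughly this fixed overhead, whereas for a large transfer such as 200 MB the overhead is negligible next to the time spent moving the data.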
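To make the constant-overhead observation concrete, here is a minimal sketch of a linear cost model: time = fixed overhead + bytes / bandwidth. The overhead values are the regression estimates quoted above. The bandwidth figure (`ASSUMED_BANDWIDTH`) and the function names are assumptions introduced for illustration only; they are not measurements from the figure. The model describes explicit copies of the kind bandwidthTest measures, so it may not match zero-copy accesses issued from inside a kernel.

```python
# Linear cost model for one transfer: time = fixed overhead + bytes / bandwidth.
# The overheads are the regression estimates quoted in the answer.
# ASSUMED_BANDWIDTH is a hypothetical placeholder, not a measured value.

OVERHEAD_S = {"H2D": 4.9e-6, "D2H": 3.3e-6, "D2D": 3.0e-6}
ASSUMED_BANDWIDTH = 6.0e9  # bytes per second; assumption for illustration only


def transfer_time(num_bytes, direction="D2H", bandwidth=ASSUMED_BANDWIDTH):
    """Estimated seconds to move num_bytes in a single transfer."""
    return OVERHEAD_S[direction] + num_bytes / bandwidth


def effective_throughput(num_bytes, direction="D2H", bandwidth=ASSUMED_BANDWIDTH):
    """Achieved bytes per second once the fixed overhead is included."""
    return num_bytes / transfer_time(num_bytes, direction, bandwidth)


if __name__ == "__main__":
    # Compare a single byte against progressively larger transfers.
    for size in (1, 1024, 1024**2, 200 * 1024**2):
        t = transfer_time(size)
        rate = effective_throughput(size)
        print(f"{size:>12d} B  {t * 1e6:12.2f} us  {rate / 1e6:10.3f} MB/s")
```

Under this model a 1-byte transfer takes almost exactly the fixed overhead, so its effective throughput is tiny, while a 200 MB transfer runs at close to the assumed bandwidth. Batching many small transfers into one large transfer pays the overhead only once.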



Source: https://stackoverflow.com/questions/17729351/cuda-how-much-slower-is-transferring-over-pci-e
