mmap() device memory into user space

怎甘沉沦 · Submitted 2019-12-11 09:02:45

Question


Say we issue an mmap() system call and map some PCIe device memory (such as GPU memory) into user space. The application can then access that memory region on the device without any OS overhead: data can be copied from the file-system buffer directly into device memory without any intermediate copy.

The statement above must be wrong somewhere... Can anyone tell me where the flaw is? Thanks!


Answer 1:


For a normal device, what you have said is correct. If the GPU memory behaves differently for reads and writes, the driver might do this. We should look at the documentation of cudaMemcpy().

From Nvidia's Basics of CUDA, page 22:

direction specifies locations (host or device) of src and dst
Blocks CPU thread: returns after the copy is complete
Doesn't start copying until previous CUDA calls complete

It seems pretty clear that cudaMemcpy() synchronizes with prior GPU register writes, which may be what causes the mmap()ed memory to be updated. Since the GPU pipeline is a pipeline, previously issued commands may not have completed by the time cudaMemcpy() is issued from the CPU.



Source: https://stackoverflow.com/questions/20298147/mmap-device-memory-into-user-space
