问题
I am a newbie in CUDA programming and in the process of re-writing a C code into a parallelized CUDA new code.
Is there a way to write output data files directly from the device without bothering copying arrays from device to host? I assume if cuPrintf exists, there must be away to write a cuFprintf?
Sorry, if the answer has already been given in a previous topic, I can't seem to find it...
Thanks!
回答1:
The short answer is, no there is not.
cuPrintf and the built-in printf support in Fermi and Kepler runtime is implemented using device to host copies. The mechanism is no different to using cudaMemcpy to transfer a buffer to the host yourself.
Just about all CUDA compatible GPUs support so-called zero-copy (AKA "pinned, mapped") memory, which allows the GPU to map a host buffer into its address space and execute DMA transfers into that mapped host memory. Note, however, that setup and initialisation of mapped memory has considerably higher overhead than conventional memory allocation (so you really need a lot of transactions to amortise that overhead throughout the life of your application), and that the CUDA driver can't use zero-copy with any other than addresses backed by physical memory. So you can't mmap a file and use zero-copy on it, ie. you will still need explicit host side file IO code to get from a zero-copy buffer to disk.
来源:https://stackoverflow.com/questions/21303713/writing-output-files-from-cuda-devices