Use memcpy for device arrays in OpenACC

ε祈祈猫儿з submitted on 2020-01-07 04:49:06

Question


Please help. 1) I need to use memcpy to move arrays allocated on the GPU. I cannot use std::memcpy because it "has no acc routine" (compiler output). My code is

const int GL=100000;
Particle particles[GL];
int cp01[2][GL];
#pragma acc declare create(particles,cp01)
...

I read that cudaMemcpy can be used with OpenACC. In function_device() (which cannot fill the array allocated on the GPU), I call the following from the host:

#pragma acc data copy(cp)
{
  cudaMemcpy(&particles[cp01[0][0]],&particles[cp01[1][0]],cp*sizeof(Particle),cudaMemcpyDeviceToDevice);
}

I use the header

#include <cuda_runtime.h>

to use CUDA, and build the project as

 cmake ../src -DCMAKE_CXX_COMPILER=pgc++ -DCMAKE_CXX_FLAGS="-acc -Minfo=all -Mcuda=llvm"

The program compiles but does not work: it hangs with no output on the console. How can I move arrays allocated on the device (using cudaMemcpy or in some other way)? Is that one include enough for using CUDA? Am I building the project correctly (is -Mcuda=llvm necessary or not)?

2) I also have another question: if one writes

#pragma acc parallel loop
for(int i=0; i<N; ++i)
{...}

must the variable N be allocated on the host only, or may it also be on the GPU?


Answer 1:


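Since "cudaMemcpy" is a host-side call to which you want to pass device pointers, you'll want to use a "host_data" directive: inside a "host_data use_device(particles)" region, "particles" refers to the device address of the array. There is no need to copy "cp", since the call uses its host value. Also make sure the host values of "cp01" are current (for example, with an "update self" directive if they were last modified on the device), because the indices are read on the host.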

Something like the following:

#pragma acc host_data use_device(particles)
{
  cudaMemcpy(&particles[cp01[0][0]], &particles[cp01[1][0]], cp*sizeof(Particle), cudaMemcpyDeviceToDevice);
}
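As for moving the data "in some other way": one alternative that stays entirely within OpenACC, and is not part of the answer above, is to do the copy in a small parallel loop on the device. The following is only a minimal sketch under stated assumptions: the Particle layout is hypothetical (the question does not show it), and copy_particles_device is a name made up for illustration. It assumes the source and destination ranges do not overlap, as memcpy does.

```cpp
#include <cassert>

// Hypothetical layout: the question does not show how Particle is defined.
struct Particle { double x, y, z; };

const int GL = 100000;
Particle particles[GL];
#pragma acc declare create(particles)

// Copy `count` elements from particles[src...] to particles[dst...] on the device.
// Like memcpy, this requires that the two ranges do not overlap.
// (copy_particles_device is an illustrative name, not an OpenACC or CUDA API.)
void copy_particles_device(int dst, int src, int count)
{
    // Host-side sanity checks before launching the kernel.
    assert(count >= 0 && dst >= 0 && src >= 0);
    assert(dst + count <= GL && src + count <= GL);
    assert(dst + count <= src || src + count <= dst);

    // dst, src and count are host scalars; scalars are firstprivate by
    // default on a parallel construct, so their host values are used.
    #pragma acc parallel loop present(particles)
    for (int i = 0; i < count; ++i)
        particles[dst + i] = particles[src + i];
}
```

With the indices from the question, the call would be copy_particles_device(cp01[0][0], cp01[1][0], cp), again using the host values of cp01 and cp. If the compiler ignores the OpenACC pragmas, the same function simply performs the copy on the host.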


Source: https://stackoverflow.com/questions/47438995/use-memcpy-for-device-arrays-in-openacc
