I have a problem compiling a simple program in which the C++ and CUDA code are compiled separately.
Here's my code:
main.cpp:
#includ
This question is pretty much a duplicate of this recent question.
Dynamic parallelism requires relocatable device code linking, in addition to compiling.
Your nvcc command line specifies a compile-only operation (-rdc=true -c).
g++ does not do any device code linking. So in a scenario like this, where the final link is done with g++, an extra device code link step is required before it.
Something like this:
nvcc -arch=sm_35 -rdc=true -c file.cu
nvcc -arch=sm_35 -dlink -o file_link.o file.o -lcudadevrt -lcudart
g++ file.o file_link.o main.cpp -L<path> -lcudart -lcudadevrt
When using CMake, setting CUDA_SEPARABLE_COMPILATION before find_package() enables both relocatable device code compiling and linking:
SET(CUDA_SEPARABLE_COMPILATION ON)
find_package(CUDA QUIET REQUIRED)
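For context, here is a minimal sketch of how those two lines might sit in a complete CMakeLists.txt. It assumes the legacy FindCUDA module (the one that provides find_package(CUDA) and cuda_add_executable), reuses the file names file.cu and main.cpp from the commands above, and takes -arch=sm_35 from the same commands. The project name and the target name app are hypothetical, and linking ${CUDA_cudadevrt_LIBRARY} is an assumption that your CMake version's FindCUDA defines that variable.

```cmake
# Sketch only: assumes the legacy FindCUDA module is available.
cmake_minimum_required(VERSION 3.7)
project(dynpar_example)                     # hypothetical project name

# Enable relocatable device code compiling and the device link step.
set(CUDA_SEPARABLE_COMPILATION ON)
find_package(CUDA QUIET REQUIRED)

# Dynamic parallelism needs compute capability 3.5 or higher.
list(APPEND CUDA_NVCC_FLAGS -arch=sm_35)

# "app" is a hypothetical target name; sources are the files from the answer.
cuda_add_executable(app main.cpp file.cu)

# Device runtime library (cudadevrt), needed for dynamic parallelism.
# Assumption: FindCUDA in your CMake version provides this variable.
target_link_libraries(app ${CUDA_cudadevrt_LIBRARY})
```

With this setup, cuda_add_executable should take care of the -dlink step that the manual nvcc commands perform explicitly, so no separate file_link.o needs to be managed by hand.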