Question
I am trying to offload code to the GPU using OpenMP 4+ directives. I am using Ubuntu 16.04 with GCC 7.2, and in general it works fine. My problem arises when I try to offload code that calls the sqrtf function declared in "math.h". The troublesome code is this:
#pragma omp target teams distribute \
    map(to:posx[:n],posy[:n],posz[:n]) \
    map(from:frcx[:n],frcy[:n],frcz[:n])
for (int i = 0; i < n; i++) {
    frcx[i] = 0.0f;
    frcy[i] = 0.0f;
    frcz[i] = 0.0f;
    for (int j = 0; j < n; j++) {
        float dx = posx[j] - posx[i];
        float dy = posy[j] - posy[i];
        float dz = posz[j] - posz[i];
        float distSqr = dx*dx + dy*dy + dz*dz + SOFTENING;
        float invDist = 1.0f / sqrtf(distSqr);
        float invDist3 = invDist * invDist * invDist;
        frcx[i] += dx * invDist3;
        frcy[i] += dy * invDist3;
        frcz[i] += dz * invDist3;
    }
}
When I try to compile it with the command below, the build fails with the errors that follow it:
$ gcc -Wall -O2 -march=native -mtune=native -fopenmp -o nbody_cpu_arrays_parallel_gpu common_funcs.c nbody_cpu_arrays_parallel_gpu.c -lm
unresolved symbol sqrtf
collect2: error: ld returned 1 exit status
mkoffload: fatal error: x86_64-linux-gnu-accel-nvptx-none-gcc-7 returned 1 exit status
compilation terminated.
lto-wrapper: fatal error: /usr/lib/gcc/x86_64-linux-gnu/7//accel/nvptx-none/mkoffload returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
How can I make use of square root operations (or other mathematical functions) when offloading OMP code to GPUs?
Answer 1:
I encountered a similar issue. https://github.com/bisqwit/cpp_parallelization_examples/blob/master/README.md very helpfully describes the solution:
When offloading, you may get linker problems from math functions if you do an optimized build. To resolve, add -foffload=-lm -fno-fast-math -fno-associative-math
For reference, the errors I got with sqrt:
libgomp: Link error log ptxas application ptx input, line 138; error : Label expected for argument 0 of instruction 'call'
ptxas application ptx input, line 138; fatal : Call target not recognized
ptxas <macro util>, line 9; error : Illegal modifier '.div' for instruction 'mov'
ptxas fatal : Ptx assembly aborted due to errors
libgomp: cuLinkAddData (ptx_code) error: a PTX JIT compilation failed
libgomp: Cannot map target functions or variables (expected 2, have 4294967295)
And with sqrtf:
unresolved symbol sqrtf
collect2: error: ld returned 1 exit status
mkoffload: fatal error: x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status
compilation terminated.
lto-wrapper: fatal error: gcc/x86_64-pc-linux-gnu/7.3.0//accel/nvptx-none/mkoffload returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
Answer 2:
Clang 9.0 can now replace standard math library functions with equivalent PTX code (for NVIDIA GPUs), which GCC 9.0 does not yet support.
How to compile and run: https://www.hahnjo.de/blog/2018/10/08/clang-7.0-openmp-offloading-nvidia.html
Relevant Clang commit: https://reviews.llvm.org/D61399
Source: https://stackoverflow.com/questions/49531448/openmp-gpu-offloading-math-library