OpenMP declare SIMD for an inline function

Submitted by 。_饼干妹妹 on 2019-12-10 22:15:36

Question


The current OpenMP standard says about the declare simd directive for C/C++:

The use of a declare simd construct on a function enables the creation of SIMD versions of the associated function that can be used to process multiple arguments from a single invocation in a SIMD loop concurrently.

More details are given in the chapter, but there seems to be no restriction there on the type of function the directive can be applied to.

So my question is, can this directive be applied safely to an inline function?

I'm asking that for two reasons:

  1. An inline function is a rather unusual function, since it is normally inlined directly in the place it was called. So it is likely never compiled as a standalone function and therefore, the declare simd aspect of it is quite redundant with the possible simd directive at the enclosing loop's level.
  2. I have code with such inline declare simd functions, and sometimes, for some nebulous reason, GCC complains about their multiple definition at link time (with names mangled with extra characters suggesting that these are vectorised versions). But if I remove the declare simd directive, it compiles and links fine.

So far I hadn't thought too much about it, but now I'm puzzled. Is that a bug of mine (i.e. using declare simd for inline functions), or is it a problem in GCC generating binary vectorised versions of inline functions and failing to sort them out at link time?


EDIT:
There are GCC compiler options that make a difference. When inlining is enabled (with -O3 for example), the code compiles and links fine. But when compiled with -O0 or with -O3 -fno-inline, inlining is disabled and the linking fails with a "multiple definition of" error for the function decorated with the omp declare simd directive.


EDIT 2:
Thanks to @Zboson's questions regarding the compiler flags, I managed to create a reproducer. Here it is:

foobar.h:

#ifndef FOOBAR_H_
#define FOOBAR_H_

#include <cmath>

#pragma omp declare simd
inline double foo( double d ) {
    return sin( cos( exp( d ) ) );
}

double bar( double *v, int len );

#endif

foobar.cc:

#include "foobar.h"

double bar( double *v, int len ) {
    double sum = 0;
    for ( int i = 0; i < len; i++ ) {
        sum += foo( v[i] );
    }
    return sum;
}

simd.cc:

#include <iostream>
#include "foobar.h"

int main() {

    const int len = 100;
    double *v = new double[len];

    for ( int i = 0; i < len; i++ ) {
        v[i] = i;
    }

    double sum = 0;
    #pragma omp simd reduction( +: sum )
    for ( int i = 0; i < len; i++ ) {
        sum += foo( v[i] );
    }

    std::cout << sum << "  " << bar( v, len ) << std::endl;

    delete[] v;

    return 0;
}

compilation:

> g++ -fopenmp -g simd.cc foobar.cc
/tmp/ccI4e7ip.o: In function `_ZGVbN2v__Z3food':
foobar.h:7: multiple definition of `_ZGVbN2v__Z3food'
/tmp/cc4U8Qyu.o:foobar.h:7: first defined here
/tmp/ccI4e7ip.o: In function `_ZGVbM2v__Z3food':
foobar.h:7: multiple definition of `_ZGVbM2v__Z3food'
/tmp/cc4U8Qyu.o:foobar.h:7: first defined here
/tmp/ccI4e7ip.o: In function `_ZGVcN4v__Z3food':
foobar.h:7: multiple definition of `_ZGVcN4v__Z3food'
/tmp/cc4U8Qyu.o:foobar.h:7: first defined here
/tmp/ccI4e7ip.o: In function `_ZGVcM4v__Z3food':
foobar.h:7: multiple definition of `_ZGVcM4v__Z3food'
foobar.h:7: first defined here
/tmp/ccI4e7ip.o: In function `_ZGVdN4v__Z3food':
foobar.h:7: multiple definition of `_ZGVdN4v__Z3food'
foobar.h:7: first defined here
/tmp/ccI4e7ip.o: In function `_ZGVdM4v__Z3food':
foobar.h:7: multiple definition of `_ZGVdM4v__Z3food'
foobar.h:7: first defined here
collect2: error: ld returned 1 exit status
> c++filt _ZGVdM4v__Z3food
_ZGVdM4v__Z3food
> c++filt _Z3food
foo(double)

GCC versions 4.9.2 and 5.1.0 both give the very same problem, while the Intel compiler version 15.0.3 compiles it just fine.


Final edit:
Hristo Iliev's comment and Z boson's question reassure me that my code is OpenMP compliant, and that this is a bug in GCC. I'll run further tests with the most up-to-date version I can find, and report it if needed.


Answer 1:


You wrote: "An inline function is a rather unusual function, since it is normally inlined directly in the place it was called. So it is likely never compiled as a standalone function."

This is incorrect. A function has external linkage whether or not it is declared inline, unless it is also declared static. The compiler has to produce a standalone version of the function (which won't be inlined) in case the function is called from another object file. If you don't want a standalone function, declare the function static. See section 8.3, under the heading "Inlined functions have a non-inlined copy", in Agner Fog's Optimizing software in C++ for more details.

Using static inline double foo does not give an error with your code.

Now let's look at the symbols. Without using static

nm foobar.o | grep foo

gives

W _Z3food
T _ZGVbM2v__Z3food
T _ZGVbN2v__Z3food
T _ZGVcM4v__Z3food
T _ZGVcN4v__Z3food
T _ZGVdM4v__Z3food
T _ZGVdN4v__Z3food

and nm simd.o | grep foo gives the same thing.

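The uppercase "W" and "T" mean the symbols are external. However, "W" marks a weak symbol, which does not cause a link error when it is defined in several object files, whereas "T" marks a strong symbol, which does. This is why the linker complains: the scalar foo is weak, but its SIMD versions are strong and are defined in both foobar.o and simd.o.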

What's the result with static inline? In this case nm foobar.o | grep foo gives

t _ZGVbM2v__ZL3food
t _ZGVbN2v__ZL3food
t _ZL3food

and nm simd.o | grep foo gives the same thing. But the lowercase "t" means the symbols have internal (local) linkage, so there is no problem with the linker.

If we compile without OpenMP, the only foo symbol produced is _ZL3food. I don't know why GCC produces weak symbols for the non-SIMD version of the function and strong symbols for the SIMD versions, so I can't completely answer your question, but I thought this information would be interesting nevertheless.



Source: https://stackoverflow.com/questions/34091341/openmp-declare-simd-for-an-inline-function
