Can I safely use OpenMP with C++11?


情深已故 2020-11-28 22:53

The OpenMP standard only considers C++ 98 (ISO/IEC 14882:1998). This means that there is no standard supporting usage of OpenMP under C++03 or even C++11. Thus, any program

5 answers

执笔经年 (original poster)

2020-11-28 23:22

OpenMP is often (I am aware of no exceptions) implemented on top of Pthreads, so you can reason about some of the interoperability questions by thinking about how C++11 concurrency interoperates with Pthread code.

I don't know whether oversubscription caused by mixing threading models is an issue for you, but it is definitely an issue for OpenMP. There is a proposal to address this in OpenMP 5. Until then, how you solve it is implementation defined. They are heavy hammers, but you can use the environment variables OMP_WAIT_POLICY (standard since OpenMP 3.0), KMP_BLOCKTIME (Intel and LLVM), and GOMP_SPINCOUNT (GCC) to control whether and how long waiting threads spin before going to sleep. I'm sure other implementations have something similar.
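Besides the environment variables, one way to limit oversubscription from inside the program is to cap each OpenMP team so that the outer C++11 threads and the inner OpenMP threads together do not exceed the hardware thread count. The sketch below assumes a Pthreads-based runtime where starting a parallel region from a `std::thread` works in practice; the OpenMP specification does not guarantee this. The helper names `inner_team_size` and `run_workers` are made up for illustration.

```cpp
#include <algorithm>
#include <thread>
#include <vector>

// Hypothetical helper: split the hardware threads evenly among `outer`
// std::thread workers so that outer * inner stays within the core count.
inline int inner_team_size(unsigned hardware_threads, unsigned outer)
{
    if (outer == 0) outer = 1;
    // std::thread::hardware_concurrency() may return 0 when unknown.
    if (hardware_threads == 0) hardware_threads = 1;
    return static_cast<int>(std::max(1u, hardware_threads / outer));
}

// Each std::thread runs its own OpenMP loop with a capped team size.
// Assumption: the OpenMP runtime tolerates regions started from std::thread.
// Without OpenMP enabled the pragma is ignored and each worker sums serially.
inline std::vector<long> run_workers(unsigned outer, int n)
{
    const int team = inner_team_size(std::thread::hardware_concurrency(), outer);
    std::vector<long> sums(outer, 0);
    std::vector<std::thread> workers;
    for (unsigned w = 0; w < outer; ++w) {
        workers.emplace_back([&sums, w, n, team] {
            (void)team;  // only used by the pragma below
            long local = 0;
            #pragma omp parallel for num_threads(team) reduction(+:local)
            for (int i = 1; i <= n; ++i) local += i;
            sums[w] = local;  // each worker writes its own slot
        });
    }
    for (auto& t : workers) t.join();
    return sums;
}
```

Calling `run_workers(2, 100)` starts two outer threads, each of which sums 1..100 with at most half of the hardware threads. This only bounds the thread count; it does not change how long idle OpenMP threads spin, which is what the environment variables above are for.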

One issue where interoperability is a real concern is w.r.t. the memory model, i.e. how atomic operations behave. This is currently undefined, but you can still reason about it. For example, if you use C++11 atomics with OpenMP parallelism, you should be fine, but you are responsible for using C++11 atomics correctly from OpenMP threads.
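As a small illustration of C++11 atomics used from OpenMP threads, here is a sketch of a shared counter updated inside a parallel loop. The function name `count_multiples` is invented for this example, and it relies on the assumption stated above: the combination is not defined by either standard, even though it behaves as expected on the common implementations.

```cpp
#include <atomic>

// Count the multiples of `k` in [0, n) with a C++11 atomic shared by the
// OpenMP team. Without OpenMP enabled the pragma is ignored and the loop
// simply runs serially with the same result.
inline long count_multiples(int n, int k)
{
    std::atomic<long> hits{0};  // declared outside the loop, so it is shared

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        if (i % k == 0) {
            // Default (sequentially consistent) ordering keeps the sketch
            // simple; no OpenMP atomic or critical construct is involved.
            hits.fetch_add(1);
        }
    }

    // Read after the loop has finished on all threads.
    return hits.load();
}
```

For example, `count_multiples(1000, 7)` counts 0, 7, ..., 994. The point is that every access to `hits` goes through the C++11 atomic interface, so correctness is argued entirely in C++11 terms.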

Mixing OpenMP atomics and C++11 atomics is a bad idea. We (the OpenMP language committee working group charged with looking at OpenMP 5 base language support) are currently trying to sort this out. Personally, I think C++11 atomics are better than OpenMP atomics in every way, so my recommendation is that you use C++11 (or C11, or __atomic) for your atomics and leave #pragma omp atomic for the Fortran programmers.
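To make the recommendation concrete, here is a sketch of a read-modify-write that one might otherwise protect with an OpenMP construct, written with C++11 atomics only. It keeps a running maximum with a compare-exchange loop. The names `atomic_store_max` and `parallel_max` are hypothetical, and the variable is never touched through `#pragma omp atomic`, so the two atomic models are not mixed.

```cpp
#include <atomic>
#include <vector>

// Raise `target` to at least `value` using a compare-exchange loop.
inline void atomic_store_max(std::atomic<int>& target, int value)
{
    int seen = target.load();
    // On failure compare_exchange_weak refreshes `seen` with the current
    // value, so the loop stops as soon as `target` is already >= value.
    while (seen < value && !target.compare_exchange_weak(seen, value)) {
    }
}

// Maximum of `data`, or `floor_value` if every element is smaller
// (or if `data` is empty).
inline int parallel_max(const std::vector<int>& data, int floor_value)
{
    std::atomic<int> best{floor_value};
    const int n = static_cast<int>(data.size());

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        atomic_store_max(best, data[i]);
    }
    return best.load();
}
```

In real code an OpenMP `reduction(max:...)` clause would usually be the better tool for this particular job; the sketch is only meant to show the atomic style.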

Below is example code that uses C++11 atomics with OpenMP threads. It works as designed everywhere I have tested it.

Full disclosure: Like Jim and Mike, I work for Intel :-)

#if defined(__cplusplus) && (__cplusplus >= 201103L)

#include <cstdlib>   // atoi
#include <iostream>

#include <chrono>

#include <atomic>

#ifdef _OPENMP
# include <omp.h>
#else
# error No OpenMP support!
#endif

#ifdef SEQUENTIAL_CONSISTENCY
auto load_model  = std::memory_order_seq_cst;
auto store_model = std::memory_order_seq_cst;
#else
auto load_model  = std::memory_order_acquire;
auto store_model = std::memory_order_release;
#endif

int main(int argc, char * argv[])
{
    int nt = omp_get_max_threads();
#if 1
    if (nt != 2) omp_set_num_threads(2);
#else
    if (nt < 2)      omp_set_num_threads(2);
    if (nt % 2 != 0) omp_set_num_threads(nt-1);
#endif

    int iterations = (argc>1) ? atoi(argv[1]) : 1000000;

    std::cout << "thread ping-pong benchmark\n";
    std::cout << "num threads  = " << omp_get_max_threads() << "\n";
    std::cout << "iterations   = " << iterations << "\n";
#ifdef SEQUENTIAL_CONSISTENCY
    std::cout << "memory model = " << "seq_cst";
#else
    std::cout << "memory model = " << "acq-rel";
#endif
    std::cout << std::endl;

    std::atomic<int> left_ready  = {-1};
    std::atomic<int> right_ready = {-1};

    int left_payload  = 0;
    int right_payload = 0;

    #pragma omp parallel
    {
        int me      = omp_get_thread_num();
        /// 0=left 1=right
        bool parity = (me % 2 == 0);

        int junk = 0;

        /// START TIME
        #pragma omp barrier
        std::chrono::high_resolution_clock::time_point t0 = std::chrono::high_resolution_clock::now();

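        /// NOTE: the loop body was lost when this page was extracted; the
        /// ping-pong exchange below is reconstructed from the declarations above.
        for (int i=0; i<iterations; ++i) {
            if (parity) {
                /// send to left
                left_payload = i;
                left_ready.store(i, store_model);

                /// recv from right
                while (i != right_ready.load(load_model));
                junk += right_payload;
            } else {
                /// recv from left
                while (i != left_ready.load(load_model));
                junk += left_payload;

                /// send to right
                right_payload = i;
                right_ready.store(i, store_model);
            }
        }

        /// STOP TIME
        #pragma omp barrier
        std::chrono::high_resolution_clock::time_point t1 = std::chrono::high_resolution_clock::now();

        /// PRINT TIME
        std::chrono::duration<double> dt = std::chrono::duration_cast<std::chrono::duration<double>>(t1-t0);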
        #pragma omp critical
        {
            std::cout << "total time elapsed = " << dt.count() << "\n";
            std::cout << "time per iteration = " << dt.count()/iterations  << "\n";
            std::cout << junk << std::endl;
        }
    }

    return 0;
}

#else  // C++11
#error You need C++11 for this test!
#endif // C++11
