Parallel Merge-Sort in OpenMP

♀尐吖头ヾ submitted on 2019-12-05 07:11:50

"I think the reason is that OpenMP cannot create parallel regions inside parallel regions"

You can have a parallel region inside another parallel region.

OpenMP parallel regions can be nested inside each other. If nested parallelism is disabled, then the new team created by a thread encountering a parallel construct inside a parallel region consists only of the encountering thread. If nested parallelism is enabled, then the new team may consist of more than one thread (source).

In order to run your code correctly, you need to call omp_set_nested(1) and omp_set_num_threads(2).

Nested parallelism can be enabled or disabled by setting the OMP_NESTED environment variable or by calling the omp_set_nested() function.
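A minimal sketch of what enabling nesting changes, assuming a compiler invoked with OpenMP support (for example -fopenmp with GCC). The function name max_inner_team_size is hypothetical. It uses omp_set_max_active_levels(2) rather than omp_set_nested(1), because omp_set_nested() is deprecated as of OpenMP 5.0; both request the same behavior here. The runtime may still give the inner team fewer threads than requested.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: returns the largest inner-team size observed.
 * The result is 1 if nesting is not granted or OpenMP is not enabled. */
int max_inner_team_size(void)
{
    int max_size = 1;
#ifdef _OPENMP
    /* Allow two levels of active parallel regions (outer + inner). */
    omp_set_max_active_levels(2);

    #pragma omp parallel num_threads(2)
    {
        /* Each outer thread encounters a nested parallel construct. */
        #pragma omp parallel num_threads(2)
        {
            int n = omp_get_num_threads();  /* size of the inner team */
            #pragma omp critical
            {
                if (n > max_size)
                    max_size = n;
            }
        }
    }
#endif
    assert(max_size >= 1);
    return max_size;
}
```

With nesting granted, the value can be 2; if the inner region is serialized, it stays at 1, which matches the "team consists only of the encountering thread" case quoted above.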

The modern answer to this question is to use tasks instead of sections. Tasks were added in OpenMP 3.0 (2008) and are both easier to use and better behaved than nested parallelism with sections, because nested parallelism can lead to oversubscription (more active threads than CPUs available), which causes significant performance degradation. With tasks, you have one team of threads matching the number of CPUs, and they will work on the tasks. So you do not need the manual handling with the threads parameter. A simple solution looks like this:

// spawn the parallel region once, outside the recursion
void mergesort_omp(...) {
    #pragma omp parallel
    // one thread starts the recursion; the rest of the team runs the tasks it creates
    #pragma omp single
    mergesort_parallel_omp(...);
}


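void mergesort_parallel_omp (int a[], int size, int temp[])
{
    if (size < 2) return;   // base case: nothing left to split

    // sort the left half as a task that any thread in the team may pick up
    #pragma omp task
    mergesort_parallel_omp(a, size/2, temp);

    // sort the right half in the current task
    mergesort_parallel_omp(a + size/2, size - size/2, temp + size/2);

    // wait for the left half before merging
    #pragma omp taskwait
    merge(a, size, temp);
}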

However, creating tasks for very small chunks of work can still be a problem, because the task overhead outweighs the work itself, so it is useful to limit the parallelism based on the work granularity, for example:

void mergesort_parallel_omp (int a[], int size, int temp[]) 
{  
    if (size < size_threshold) {
        mergesort_serial(a, size, temp);
        return;
    }
    #pragma omp task
    mergesort_parallel_omp(a, size/2, temp);

    mergesort_parallel_omp(a + size/2, size - size/2, temp + size/2);

    #pragma omp taskwait
    merge(a, size, temp); 
}
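The snippets above leave merge() and mergesort_serial() undefined. The following is a self-contained sketch that fills them in so the whole thing compiles. The bodies of merge() and mergesort_serial(), the macro SIZE_THRESHOLD (standing in for size_threshold), and its value of 1024 are assumptions for illustration, not the original poster's code. It assumes temp has at least size elements.

```c
#include <assert.h>
#include <string.h>

/* Assumed cutoff below which sorting is done serially; tune per machine. */
#define SIZE_THRESHOLD 1024

/* Merge the sorted halves a[0..size/2) and a[size/2..size) through temp. */
static void merge(int a[], int size, int temp[])
{
    int mid = size / 2;
    int i = 0, j = mid, k = 0;
    while (i < mid && j < size)
        temp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    while (i < mid)
        temp[k++] = a[i++];
    while (j < size)
        temp[k++] = a[j++];
    memcpy(a, temp, (size_t)size * sizeof(int));
}

/* Plain recursive merge sort, used for small subarrays. */
static void mergesort_serial(int a[], int size, int temp[])
{
    if (size < 2)
        return;
    mergesort_serial(a, size / 2, temp);
    mergesort_serial(a + size / 2, size - size / 2, temp + size / 2);
    merge(a, size, temp);
}

static void mergesort_parallel_omp(int a[], int size, int temp[])
{
    if (size < SIZE_THRESHOLD) {
        mergesort_serial(a, size, temp);
        return;
    }
    /* Left half becomes a task; a, size and temp are firstprivate by default. */
    #pragma omp task
    mergesort_parallel_omp(a, size / 2, temp);

    /* Right half is sorted by the current task, on a disjoint part of temp. */
    mergesort_parallel_omp(a + size / 2, size - size / 2, temp + size / 2);

    /* Both halves must be sorted before they are merged. */
    #pragma omp taskwait
    merge(a, size, temp);
}

/* Entry point: create the thread team once, let one thread start the recursion. */
void mergesort_omp(int a[], int size, int temp[])
{
    assert(size >= 0);
    #pragma omp parallel
    #pragma omp single
    mergesort_parallel_omp(a, size, temp);
}
```

Compiled without OpenMP support, the pragmas are ignored and the code degrades to a serial merge sort, which is a convenient way to check correctness before measuring speedup.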

Maybe I am totally missing the point here... but are you aware that you need to set the environment variable OMP_NUM_THREADS if you want to execute on more than 2 threads?
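To check how many threads a parallel region actually gets, something like the following sketch can help. The function name default_team_size is hypothetical; omp_get_num_threads() is the standard OpenMP call. Running the program as, for example, OMP_NUM_THREADS=8 ./a.out should change the reported value, subject to what the runtime grants.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: returns the team size of a default parallel region.
 * Returns 1 when compiled without OpenMP support. */
int default_team_size(void)
{
    int n = 1;
    #pragma omp parallel
    {
        /* Only one thread needs to record the team size. */
        #pragma omp single
        {
#ifdef _OPENMP
            n = omp_get_num_threads();
#endif
        }
    }
    assert(n >= 1);
    return n;
}
```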
