Is it possible to make thread join to 'parallel for' region after its job?

Posted by 旧街凉风 on 2019-12-10 13:34:10

Question


I have two jobs that need to run simultaneously at first:

1) for loop that can be parallelized

2) function that can be done with one thread

Now, let me describe what I want to do.

If there exist 8 available threads,

job(1) and job(2) have to run simultaneously at first with 7 threads and 1 thread, respectively.

After job(2) finishes, the thread that job(2) was using should be allocated to job(1) which is the parallel for loop.

I'm using omp_get_thread_num to see which threads are active in each region. I would expect the number of threads in job(1) to increase by 1 when job(2) finishes.

Below is an attempt that may or may not be correct:

  omp_set_nested(1);
  #pragma omp parallel
  {
    #pragma omp sections
    {
      #pragma omp section // job(2)
      { // 'printf' is not real job. It is just used for simplicity.
        printf("i'm single: %d\n", omp_get_thread_num());
      }
      #pragma omp section // job(1)
      {
        #pragma omp parallel for schedule(dynamic, 32)
        for (int i = 0 ; i < 10000000; ++i) {
          // 'printf' is not real job. It is just used for simplicity.
          printf("%d\n", omp_get_thread_num());
        }
      }
    }
  }

How can I achieve this?


Answer 1:


What about something like this?

#pragma omp parallel
{
     // note the nowait here so that other threads jump directly to the for loop
    #pragma omp single nowait
    {
       job2();
    }

    #pragma omp for schedule(dynamic, 32)
    for (int i = 0 ; i < 10000000; ++i) {
        job1();
    }
}

I did not test this, but the single block will be executed by only one thread, while all the others jump directly to the for loop thanks to the nowait. I also think it is easier to read than the version with sections.
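The snippet above leaves job1() and job2() undefined. A minimal sketch that fills them in with hypothetical stand-ins (the bodies, the function name run_single_nowait and its parameters are assumptions for illustration, not part of the answer) and checks that every iteration still runs exactly once could look like this:

```c
/* Hypothetical stand-ins for the real work; neither comes from a library. */
static long job1(int i) { return (long)(i % 7); }
static void job2(int *done) { *done = 1; }

/* Runs job2 on one thread while the other threads start the loop at once.
   Returns the sum of job1(i) over all n iterations. */
long run_single_nowait(int n, int *job2_done) {
    long total = 0;
    #pragma omp parallel
    {
        /* nowait: the other threads do not wait for job2 to finish */
        #pragma omp single nowait
        {
            job2(job2_done);
        }
        /* The thread that ran job2 picks up remaining chunks when it arrives. */
        #pragma omp for schedule(dynamic, 32) reduction(+:total)
        for (int i = 0; i < n; ++i) {
            total += job1(i);
        }
    }
    return total;
}
```

Because only pragmas are used, the function also compiles and gives the same result when OpenMP is disabled; the reduction clause is one way to collect a result without a data race.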




Answer 2:


The problem comes from synchronization. At the end of the sections construct, OpenMP waits for all threads to finish, so it cannot release the thread running job 2 until that completion has been checked.

The solution is to suppress the synchronization with nowait.
I did not manage to suppress the synchronization with sections and nested parallelism. I rarely use nested parallel regions, but I think that, while sections can take a nowait, there is a problem when spawning the new nested parallel region inside a section. There is a mandatory synchronization at the end of a parallel region that cannot be suppressed, and it probably prevents new threads from joining the pool.

What I did instead is use a single construct without synchronization. This way, OpenMP starts the single block on one thread and does not wait for its completion before starting the parallel for. When that thread finishes its single work, it joins the rest of the team to finish processing the for loop.

#include <omp.h>
#include <stdio.h>

int main() {
  int singlethreadid=-1;
  // omp_set_nested(1);
#pragma omp parallel
  {
#pragma omp single nowait  // job(2)
    { // 'printf' is not real job. It is just used for simplicity.
      printf("i'm single: %d\n", omp_get_thread_num());
      // Atomic write: other threads read this variable concurrently in the loop.
#pragma omp atomic write
      singlethreadid = omp_get_thread_num();
    }
#pragma omp for schedule(dynamic, 32)
    for (int i = 0 ; i < 100000; ++i) {
      int id;
      // Atomic read pairs with the atomic write above to avoid a data race.
#pragma omp atomic read
      id = singlethreadid;
      // 'printf' is not real job. It is just used for simplicity.
      printf("%d\n", omp_get_thread_num());
      if (omp_get_thread_num() == id)
        printf("Hello, I'm back\n");
    }
  }
}
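Scanning the printed output for "Hello, I'm back" works, but it can be easier to count iterations per thread. The following is only a sketch under stated assumptions: the helper name count_loop_threads, the MAX_THREADS bound, and the serial fallbacks are all made up for illustration. It records how many loop iterations each thread executed and how many of them went to the thread that ran job(2).

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch still builds when OpenMP is disabled. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_max_threads(void) { return 1; }
#endif

#define MAX_THREADS 256 /* assumed upper bound, chosen for this sketch */

/* Hypothetical helper: runs the single-nowait pattern over n iterations.
   Returns how many distinct threads executed loop iterations.
   *total_iters: iterations executed overall (should equal n).
   *single_iters: iterations executed by the thread that ran job(2). */
int count_loop_threads(int n, long *total_iters, long *single_iters) {
    long per_thread[MAX_THREADS] = {0};
    int single_tid = -1;
    int nthreads = omp_get_max_threads();
    if (nthreads > MAX_THREADS) nthreads = MAX_THREADS;

    #pragma omp parallel num_threads(nthreads)
    {
        #pragma omp single nowait /* job(2) */
        {
            single_tid = omp_get_thread_num(); /* read only after the region */
        }
        #pragma omp for schedule(dynamic, 32) /* job(1) */
        for (int i = 0; i < n; ++i) {
            per_thread[omp_get_thread_num()]++; /* each thread owns one slot */
        }
    }

    int used = 0;
    *total_iters = 0;
    for (int t = 0; t < nthreads; ++t) {
        if (per_thread[t] > 0) used++;
        *total_iters += per_thread[t];
    }
    *single_iters = (single_tid >= 0) ? per_thread[single_tid] : 0;
    return used;
}
```

A nonzero *single_iters after a call such as count_loop_threads(100000, &total, &single) means the job(2) thread did rejoin the loop. With a very short loop it may legitimately be zero, because the other threads can finish all chunks before job(2) completes.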



Answer 3:


Another way (and potentially the better way) to express this would be to use OpenMP tasks:

#pragma omp parallel master
{
    #pragma omp task // job(2)
    { // 'printf' is not real job. It is just used for simplicity.
        printf("i'm single: %d\n", omp_get_thread_num());
    }
    #pragma omp taskloop // job(1)
    for (int i = 0 ; i < 10000000; ++i) {
        // 'printf' is not real job. It is just used for simplicity.
        printf("%d\n", omp_get_thread_num());
    }
}

If you have a compiler that does not understand OpenMP version 5.0, then you have to split the parallel and master:

#pragma omp parallel
#pragma omp master
{
    #pragma omp task // job(2)
    { // 'printf' is not real job. It is just used for simplicity.
        printf("i'm single: %d\n", omp_get_thread_num());
    }
    #pragma omp taskloop // job(1)
    for (int i = 0 ; i < 10000000; ++i) {
        // 'printf' is not real job. It is just used for simplicity.
        printf("%d\n", omp_get_thread_num());
    }
}
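To check that the task-based version also covers every iteration, here is a small sketch. The function name run_with_tasks, its parameters, and the i % 7 workload are assumptions for illustration. It also swaps in single for master, since the master construct is deprecated as of OpenMP 5.1; either one serves to have a single thread create the tasks.

```c
/* Hypothetical helper: job(2) runs as one task, job(1) as a taskloop.
   Returns the sum of (i % 7) over all n iterations. */
long run_with_tasks(int n, int *job2_done) {
    long total = 0;
    #pragma omp parallel
    #pragma omp single
    {
        /* job(2): any idle thread of the team may pick this task up */
        #pragma omp task
        {
            *job2_done = 1;
        }
        /* job(1): the loop is split into tasks of about 32 iterations each */
        #pragma omp taskloop grainsize(32) shared(total)
        for (int i = 0; i < n; ++i) {
            /* atomic update, because several tasks add to the same variable */
            #pragma omp atomic
            total += i % 7;
        }
    }
    return total;
}
```

The taskloop construct waits for the tasks it generates, and the barrier at the end of the parallel region guarantees the job(2) task has finished before the function returns. Whichever thread ran job(2) is then free to execute loop tasks, which is the behavior the question asks for.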


Source: https://stackoverflow.com/questions/56573657/is-it-possible-to-make-thread-join-to-parallel-for-region-after-its-job
