pragma omp for inside pragma omp master or single


Question


I'm sitting here trying to make orphaning work, and to reduce overhead by cutting down the number of #pragma omp parallel calls. What I'm trying is something like:

#pragma omp parallel default(none) shared(mat,mat2,f,max_iter,tol,N,conv) private(diff,k)
{
    #pragma omp master // I'm not against using #pragma omp single or whatever will work
    {
        while(diff > tol) {
            do_work(mat, mat2, f, N);
            swap(mat, mat2);
            if( !(k%100) ) // Only test the stop criterion every 100 iterations
                diff = conv[k] = do_more_work(mat, mat2);
            k++;
        } // end while
    } // end master
} // end parallel

do_work depends on the previous iteration, so the while-loop has to run sequentially. But I would like to run do_work itself in parallel, so it would look something like:

void do_work(double *mat, double *mat2, double *f, int N)
{
    int i, j;
    double scale = 1/4.0;
    #pragma omp for schedule(runtime) // Just so I can test different settings without having to recompile
    for(i=0; i<N; i++)
        for(j=0; j<N; j++)
            mat[i*N+j] = scale*(mat2[(i+1)*N+j] + mat2[(i-1)*N+j] + ... + f[i*N+j]);
}

I hope this can be accomplished in some way, I'm just not sure how. Any help I can get is greatly appreciated (also if you're telling me this isn't possible). Btw, I'm working with OpenMP 3.0, the GCC compiler and the Sun Studio compiler.


Answer 1:


The outer parallel region in your original code contains only a serial piece (#pragma omp master), which makes no sense and effectively results in purely serial execution (no parallelism). Since do_work() depends on the previous iteration but you want to run it in parallel, you must use synchronisation. The OpenMP tool for that is an (explicit or implicit) synchronisation barrier.

For example (code similar to yours):

#pragma omp parallel
for(int j=0; diff>tol; ++j) {    // must be the same condition for each thread!
  #pragma omp for                // note: implicit synchronisation after the for loop
  for(int i=0; i<N; ++i)
    work(j,i);
}

Note that the implicit synchronisation ensures that no thread enters the next j if any thread is still working on the current j.
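That barrier is exactly what carries the dependence from one j to the next. For contrast, a nowait clause would remove it and let fast threads run ahead into the next j, which is precisely what the data dependence forbids. A minimal sketch of the broken variant (same hypothetical work(j,i) as above):

#pragma omp parallel
for(int j=0; diff>tol; ++j) {
  #pragma omp for nowait         // WRONG here: drops the implicit barrier
  for(int i=0; i<N; ++i)
    work(j,i);                   // may read values another thread has not yet written for this j
}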

The alternative

for(int j=0; diff>tol; ++j) {
  #pragma omp parallel for
  for(int i=0; i<N; ++i)
    work(j,i);
}

should be less efficient, as it creates a new team of threads at each iteration, instead of merely synchronising.
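Applied back to the code from the question, the same idea gives a single parallel region that all threads execute: the orphaned #pragma omp for inside do_work() shares the stencil loop (with its implicit barrier), and an omp single construct updates the shared iteration state. This is a minimal sketch, assuming do_work, do_more_work, swap and conv keep their meanings from the question and that diff is initialised to something greater than tol before the region; note that diff and k must be shared here, not private as in the original, or the while condition would differ between threads:

#pragma omp parallel default(none) shared(mat,mat2,f,N,conv,tol,diff,k)
{
    while(diff > tol) {               // every thread tests the same shared diff
        do_work(mat, mat2, f, N);     // orphaned omp for inside; implicit barrier at its end
        #pragma omp single            // one thread updates the shared state
        {
            swap(mat, mat2);          // swap as in the question (e.g. a pointer-swap macro)
            if( !(k % 100) )          // only test the stop criterion every 100 iterations
                diff = conv[k] = do_more_work(mat, mat2);
            k++;
        }                             // implicit barrier: all threads see the new diff before retesting
    }
}

The implicit barrier at the exit of the single construct plays the same role here as the one after the work-sharing loop in the example above.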



Source: https://stackoverflow.com/questions/14384959/pragma-omp-for-inside-pragma-omp-master-or-single
