omp parallel for loop (reduction to find max) ran slower than serial codes

这一生的挚爱 提交于 2019-12-12 04:08:44

问题


I am new in using OpenMP. I think that use max reduction clause to find the max element of an array is not such a bad idea, but in fact the parallel for loop ran much slower than serial one.

int main() {
double sta, end, elapse_t;
int bsize = 46000;
int q = bsize;
int max_val = 0;
double *buffer;
buffer = (double*)malloc(bsize*sizeof(double));
srand(time(NULL));
for(int i=0;i<q;i++)
    buffer[i] = rand()%10000;

sta = omp_get_wtime();
//int i;
#pragma omp parallel for reduction(max : max_val)
for(int i=0;i<q; i++)
{
    max_val = max_val > buffer[i] ? max_val : buffer[i];
}
end = omp_get_wtime();
printf("parallel maximum time %f\n", end-sta);

sta = omp_get_wtime();
for(int i=0;i<q; i++)
{
    max_val = max_val > buffer[i] ? max_val : buffer[i];
}
end = omp_get_wtime();
printf("serial maximum time   %f\n", end-sta);

free(buffer); 
return 0;}

Compile command

gcc-7 kp_omp.cpp -o kp_omp -fopenmp

Execution results

./kp_omp 
parallel maximum time 0.000505
serial maximum time   0.000266

As for the CPU, it is an Intel Core i7-6700 with 8 cores.


回答1:


Whenever you parallelise a loop openMP needs to perform some operations, for example creating the threads. Those operations result in some overhead and this in turns implies that, for each loop, there is a minimum number of iterations under which it is not convenient to parallelise.

If I execute your code I obtain the same results you have:

./kp_omp
parallel maximum time 0.000570
serial maximum time   0.000253

However if I modify bsize in line 8 such that

int bsize = 100000;

I obtain

./kp_omp
parallel maximum time 0.000323
serial maximum time   0.000552

So the parallelised version got faster than the sequential. Part of the challenges you encounter when you try to speedup the execution of a code is to understand when it is convenient to parallelise and when the overhead of the parallelisation would kill your expected gain in performance.



来源:https://stackoverflow.com/questions/46527671/omp-parallel-for-loop-reduction-to-find-max-ran-slower-than-serial-codes

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!