Question
I am very new to OpenMP, and I am trying to write a simple program that generates the entries of a matrix in parallel: for the N-by-M matrix A, let A(i,j) = i*j. A minimal example is included below:
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main(int argc, char **argv)
{
    int i, j, N, M;
    N = 20;
    M = 20;
    int* A;
    A = (int*) calloc(N*M, sizeof(int));

    // compute entries of A in parallel
    #pragma omp parallel for shared(A)
    for (i = 0; i < N; ++i){
        for (j = 0; j < M; ++j){
            A[i*M + j] = i*j;
        }
    }

    // print parallel results
    for (i = 0; i < N; ++i){
        for (j = 0; j < M; ++j){
            printf("%d ", A[i*M + j]);
        }
        printf("\n");
    }

    free(A);
    return 0;
}
The results are not always correct. In theory, I am only parallelizing the outer loop, and each iteration of the outer loop writes only to entries that no other iteration touches. But I am not sure how to express this in OpenMP. A similar procedure for a vector (i.e. just one for loop) seems to work without issue, e.g.
#pragma omp parallel for
for (i = 0; i < N; ++i)
{
    v[i] = i*i;
}
Can someone explain to me how to fix this?
Answer 1:
According to, for example, this tutorial:
http://supercomputingblog.com/openmp/tutorial-parallel-for-loops-with-openmp/
declaring variables outside of a parallelized region is dangerous, because such variables are shared between threads by default.
The problem can be defused by explicitly making the loop variable of the inner loop private.
For that, change this
#pragma omp parallel for shared(A)
to
#pragma omp parallel for private(j) shared(A)
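For reference, here is a minimal sketch of how the question's loop might look with this clause applied. The function names fill_matrix and verify_matrix are made up for illustration and do not appear in the original post; the serial check is just one possible way to confirm the result.

```c
#include <assert.h>

/* Fill the N-by-M row-major matrix A with A(i,j) = i*j.
   i and j are declared outside the loop, as in the question. */
void fill_matrix(int *A, int N, int M)
{
    int i, j;

    /* i is the parallel loop variable, so OpenMP makes it private on its own.
       j is an ordinary outer variable and must be listed explicitly,
       otherwise all threads would share a single j. */
    #pragma omp parallel for private(j) shared(A)
    for (i = 0; i < N; ++i) {
        for (j = 0; j < M; ++j) {
            A[i*M + j] = i*j;
        }
    }
}

/* Serial check (hypothetical helper): every entry must equal i*j. */
void verify_matrix(const int *A, int N, int M)
{
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < M; ++j) {
            assert(A[i*M + j] == i*j);
        }
    }
}
```

Calling fill_matrix(A, 20, 20) on a buffer of 400 ints and then verify_matrix(A, 20, 20) should pass on every run. If the compiler is invoked without OpenMP support (e.g. without -fopenmp for GCC), the pragma is ignored and the loop simply runs serially.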
Answer 2:
The issue in this case is that j is shared between threads, which messes with the control flow of the inner loop. By default, variables declared outside of a parallel region are shared, whereas variables declared inside of a parallel region are private.
Follow the general rule to declare variables as locally as possible. In the for loop this means:
#pragma omp parallel for
for (int i = 0; i < N; ++i) {
    for (int j = 0; j < M; ++j) {
        A[i*M + j] = i*j;
    }
}
This makes reasoning about your code much easier, and makes OpenMP code mostly correct by default. (Note that A is shared by default because it is defined outside of the parallel region.)
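The same default rule also covers temporaries, not only loop counters. As a rough illustration (the functions row_sums and check_row_sums are hypothetical and not part of the original question), the sketch below computes the sum of each row of the i*j matrix. The accumulator acc is declared inside the loop body and is therefore private without any clause, while the output array is shared and each iteration writes its own slot.

```c
#include <assert.h>

/* Store in row_sum[i] the sum over j of i*j, for an N-by-M matrix. */
void row_sums(long *row_sum, int N, int M)
{
    #pragma omp parallel for
    for (int i = 0; i < N; ++i) {
        long acc = 0;           /* declared inside the region: private */
        for (int j = 0; j < M; ++j) {
            acc += (long)i * j;
        }
        row_sum[i] = acc;       /* row_sum is shared; slot i is written once */
    }
}

/* Serial check (hypothetical helper) against the closed form
   sum_{j=0}^{M-1} i*j = i*M*(M-1)/2. */
void check_row_sums(const long *row_sum, int N, int M)
{
    for (int i = 0; i < N; ++i) {
        assert(row_sum[i] == (long)i * M * (M - 1) / 2);
    }
}
```

Had acc been declared before the pragma instead, it would be shared and the threads would race on it, which is the same kind of bug as the shared j in the question.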
Alternatively, you can manually specify private(i,j) shared(A). This is more explicit and can help beginners. However, it creates redundancy and can also be dangerous: private variables are uninitialized inside the parallel region, even if they had a valid value outside of it. Therefore I strongly recommend the implicit default approach unless explicit clauses are necessary for advanced usage.
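If explicit clauses are used and a variable's outer value is needed inside the region, OpenMP's firstprivate clause gives each thread a private copy initialized from that value, and default(none) makes the compiler reject any variable whose sharing is not stated. The following is only a sketch under those assumptions; fill_with_offset and verify_offset are illustrative names, not something from the original post.

```c
#include <assert.h>

/* Set v[i] = i + offset for i in [0, n). */
void fill_with_offset(int *v, int n, int offset)
{
    /* default(none): every variable used in the region needs an explicit
       clause, so a forgotten one becomes a compile error rather than a race.
       firstprivate(offset): each thread gets its own copy, initialized from
       the outer value. With private(offset) the copy would be uninitialized. */
    #pragma omp parallel for default(none) shared(v, n) firstprivate(offset)
    for (int i = 0; i < n; ++i) {
        v[i] = i + offset;
    }
}

/* Serial check (hypothetical helper). */
void verify_offset(const int *v, int n, int offset)
{
    for (int i = 0; i < n; ++i) {
        assert(v[i] == i + offset);
    }
}
```

Since offset is only read here, leaving it shared would also be correct; firstprivate matters when each thread modifies its own copy.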
Source: https://stackoverflow.com/questions/50663749/computing-entries-of-a-matrix-in-openmp