Question
In my code there are various functions that alter various arrays, and the order in which the functions are called is important. Since all of the functions are called a large number of times, creating and destroying the threads has become a significant overhead. EDIT on my question, as I may have oversimplified my current problem. An example:
double ans = 0;
for (int i = 0; i < 4000; i++){
    funcA(a,b,c);
    funcB(a,b,c);
    ans = funcC(a,b,c);
}
printf("%f\n", ans);
where funcA, funcB and funcC are
void funcA (int* a, point b, int* c){
    #pragma omp parallel for shared(a,b,c)
    for (int ii = 0; ii < b.y; ii++){
        for (int jj = 0; jj < b.x; jj++){
            // alter values of a and c
        }
    }
}
void funcB (int* a, point b, int* c){
    #pragma omp parallel for shared(a,b,c)
    for (int ii = 0; ii < b.y; ii++){
        for (int jj = 0; jj < b.x; jj++){
            // alter values of a and c
        }
    }
}
double funcC (int* a, point b, int* c){
    double k = 0;
    #pragma omp parallel for shared(a,b,c) reduction(+:k)
    for (int ii = 0; ii < b.y; ii++){
        for (int jj = 0; jj < b.x; jj++){
            // alter values of a and c
            k += sqrt(a[ii*jj] + c[ii*jj]);
        }
    }
    return k;
}
Is there a way to create a team of threads once, before the initial for loop, that is reused by all functions instead of being constantly created and destroyed, while still keeping the function calls in the correct order?
EDIT 2:
What I am looking for is a way to run funcA, funcB, funcC in that order. But the functions have some code inside them that uses multiple threads. I want a way to create the threads at the start and have them reused only in those parallel sections, so that the answer at the end is still correct. Is there a way to avoid forking and joining 40000 times?
Answer 1:
Assuming the rest of your code is correct, the following should just work the way you want it to:
#pragma omp parallel shared( a, b, c )
for (int i = 0; i < 4000; i++){
    funcA(a,b,c);
    funcB(a,b,c);
    funcC(a,b,c);
}
With the different functions now defined as follows:
void funcA( int* a, point b, int* c ) {
    #pragma omp for schedule( static )
    for (int ii = 0; ii < b.y; ii++) {
        for (int jj = 0; jj < b.x; jj++) {
            // alter values of a and c
        }
    }
}
void funcB( int* a, point b, int* c ) {
    #pragma omp for schedule( static )
    for (int ii = 0; ii < b.y; ii++) {
        for (int jj = 0; jj < b.x; jj++) {
            // alter values of a and c
        }
    }
}
void funcC( int* a, point b, int* c ) {
    #pragma omp for schedule( static )
    for (int ii = 0; ii < b.y; ii++) {
        for (int jj = 0; jj < b.x; jj++) {
            // alter values of a and c
        }
    }
}
The OpenMP directives inside these functions are called orphaned directives, since they appear outside of any lexically enclosing parallel region. At run time, they bind to the thread team of whichever parallel region is active when the functions are called, so the pre-existing team is reused exactly the way you want.
In addition, I've added a schedule( static ) clause to each of the loops. This isn't necessary for correctness, but it can improve performance by ensuring that each thread always handles the same index ranges across functions and calls, which helps data locality.
来源:https://stackoverflow.com/questions/40248584/avoiding-overhead-in-thread-creation-openmp