Question
Following this answer, I actually have more complicated code with three loops:
    !$omp parallel
    !$omp do
    do i=1,4 ! can be parallelized
       ...
       do k=1,1000 ! to be executed sequentially
          ...
          do j=1,4 ! can be parallelized
             call job(i,j)
The outer-loop iterations finish quickly except for i=4, so I want to start threads on the innermost loop while keeping the k-loop sequential within each i-iteration. In fact, k loops over the changing states of a random number generator, so it cannot be parallelized.
How can I collapse only the i and j loops? I suspect the ordered clause might be useful here, but I'm afraid it would affect the inner loop as well, and I'm still unsure of the syntax.
Answer 1:
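I can't imagine how that could work. In any case, the collapse clause definitely does not support it: collapse(n) merges n consecutive, directly nested loops, so it cannot skip over the k loop that sits between i and j.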
If you have a load-balancing issue, consider reordering your loops, dynamic scheduling, OpenMP tasks, or nested parallelism. There is not enough code here to tell which would be applicable.
Answer 2:
If 1,4 are the real bounds of the outer loop, then I suggest parallelizing only the innermost loop (the one that can be parallelized), since there will not be much overhead.
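A minimal sketch of this first suggestion, assuming the elided per-i and per-k work can stay on the initial thread. The counts array and the body of job are hypothetical stand-ins for the real work; only the loop bounds come from the question.

```fortran
program inner_only
  implicit none
  integer :: i, j, k
  integer :: counts(4, 4)    ! hypothetical: how often job(i,j) has run

  counts = 0
  do i = 1, 4
     do k = 1, 1000          ! stays sequential, driven by the initial thread
        ! A new thread team is started for each (i,k) pair and
        ! shares the four j iterations.
        !$omp parallel do
        do j = 1, 4
           call job(i, j)
        end do
        !$omp end parallel do
     end do
  end do
  print *, 'all entries equal 1000: ', all(counts == 1000)

contains

  subroutine job(i, j)       ! stand-in for the real work (assumption)
    integer, intent(in) :: i, j
    counts(i, j) = counts(i, j) + 1
  end subroutine job

end program inner_only
```

This should build with, for example, gfortran -fopenmp. With only four j iterations, at most four threads are busy at any time, so the benefit depends on how expensive each call to job is.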
Another suggestion is to swap the k and i loops, if possible, so that the outer loop runs over k and the two new inner loops over i and j can be parallelized together using collapse.
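A sketch of the swapped ordering, under the assumption that the work elided in the question can legally be reordered so that k becomes the outermost loop. As before, counts and the body of job are hypothetical placeholders.

```fortran
program swap_collapse
  implicit none
  integer :: i, j, k
  integer :: counts(4, 4)    ! hypothetical: how often job(i,j) has run

  counts = 0
  do k = 1, 1000             ! sequential loop, now outermost
     ! collapse(2) fuses the i and j loops into one iteration space
     ! of 16 iterations that is divided among the threads.
     !$omp parallel do collapse(2)
     do i = 1, 4
        do j = 1, 4
           call job(i, j)
        end do
     end do
     !$omp end parallel do
  end do
  print *, 'all entries equal 1000: ', all(counts == 1000)

contains

  subroutine job(i, j)       ! stand-in for the real work (assumption)
    integer, intent(in) :: i, j
    counts(i, j) = counts(i, j) + 1
  end subroutine job

end program swap_collapse
```

If the i=4 iterations are much slower than the rest, adding schedule(dynamic) to the directive may help balance the load, at the cost of some scheduling overhead.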
Answer 3:
A lightweight and uniform approach for this case is to use OpenMP tasks.
You can use them for both parallel loops or just for the inner one. In the second case, we will have a combination of the for (do in Fortran) and task constructs. This solution exploits nested parallelism but avoids the implications of nested parallel regions. The taskloop construct is an equivalent and more automated approach.
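One way the task-based variant might look in Fortran, using tasks for both parallel loops. This is a sketch, not the answerer's exact code: counts and the body of job are hypothetical, and the data-sharing clauses are written out explicitly for clarity.

```fortran
program task_nested
  implicit none
  integer :: i, j, k
  integer :: counts(4, 4)    ! hypothetical: how often job(i,j) has run

  counts = 0
  !$omp parallel
  !$omp single
  do i = 1, 4
     ! One task per i iteration; each task walks its own k loop in order.
     !$omp task firstprivate(i) private(j, k)
     do k = 1, 1000
        do j = 1, 4
           ! Child task for each j; a different j touches a different entry.
           !$omp task firstprivate(i, j)
           call job(i, j)
           !$omp end task
        end do
        ! Wait for this k step's child tasks before starting the next step.
        !$omp taskwait
     end do
     !$omp end task
  end do
  !$omp end single
  !$omp end parallel
  print *, 'all entries equal 1000: ', all(counts == 1000)

contains

  subroutine job(i, j)       ! stand-in for the real work (assumption)
    integer, intent(in) :: i, j
    counts(i, j) = counts(i, j) + 1
  end subroutine job

end program task_nested
```

The taskwait keeps the k steps of a given i in order, while tasks belonging to different i values can overlap, so idle threads can pick up j tasks from the slow i=4 iteration. With OpenMP 4.5 or later, the inner task and taskwait pair could be replaced by a taskloop construct on the j loop, which by default waits for the tasks it generates.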
Source: https://stackoverflow.com/questions/39020762/is-it-possible-to-cross-collapse-parallel-loops