Question
I have two nested loops:
!$omp parallel
!$omp do
do i=1,4
  ...
  !$omp parallel
  !$omp do
  do j=1,4
    call job(i,j)
  end do
  !$omp end do
  !$omp end parallel
end do
!$omp end do
!$omp end parallel
My computer can run four threads in parallel, and four threads are created for the outer loop. The first three finish quickly, since for i=4 the job is four times more expensive.
I expected the now-idle threads to share the work in the inner parallel region. But this doesn't happen: the CPU load stays at 1/4, as if the fourth thread were working through the inner loop serially.
How can I allocate parallel CPU time to the inner parallel loop?
Answer 1:
Did you try the following approach?
!$omp parallel do collapse(2)
do i = 1,4
  do j = 1,4
    call job(i,j)
  end do
end do
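It should behave better: collapse(2) merges the two loops into a single iteration space of 16 (i,j) pairs that is divided among the four threads, so no nested parallel region is needed. This assumes the code elided between the two do statements can be moved into the inner loop, since older OpenMP versions require collapsed loops to be perfectly nested. Also note that if the default schedule hands each thread one contiguous block of four iterations, all i=4 jobs still land on the same thread; adding schedule(dynamic) or schedule(static,1) to the directive avoids that.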
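For anyone who wants to try this, here is a minimal self-contained sketch of the collapsed loop. The job subroutine is a hypothetical stand-in for yours (I only assumed it costs four times more for i=4), and the owner array and the schedule(dynamic) clause are my additions, not something from the question. It records which thread ran each (i,j) pair so you can see whether the i=4 jobs get spread out.

```fortran
program collapse_demo
  use omp_lib
  implicit none
  integer :: i, j
  integer :: owner(4, 4)   ! thread number that executed each (i,j) job

  owner = -1

  ! collapse(2) turns both loops into one 16-iteration space.
  ! schedule(dynamic) hands out iterations one at a time, so the
  ! expensive i=4 jobs are not all bound to the same thread.
  !$omp parallel do collapse(2) schedule(dynamic)
  do i = 1, 4
    do j = 1, 4
      call job(i, j)
      owner(i, j) = omp_get_thread_num()
    end do
  end do
  !$omp end parallel do

  ! Every (i,j) pair must have been executed exactly once.
  if (any(owner < 0)) error stop 'a job was skipped'

  do i = 1, 4
    print '(a,i0,a,4(1x,i0))', 'i=', i, ' ran on threads:', owner(i, :)
  end do

contains

  ! Hypothetical stand-in for the real job: four times the work for i=4.
  subroutine job(i, j)
    integer, intent(in) :: i, j
    double precision :: x
    integer :: k, n
    n = 2000000
    if (i == 4) n = 4 * n
    x = 0.0d0
    do k = 1, n
      x = x + sin(dble(k + j))
    end do
    if (x > huge(x)) print *, x   ! never true; keeps the loop from being optimized away
  end subroutine job

end program collapse_demo
```

Build it with OpenMP enabled (for example gfortran -fopenmp) and run it with OMP_NUM_THREADS=4. Which thread picks up which pair will vary from run to run, so treat the printed table as a diagnostic only.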
Source: https://stackoverflow.com/questions/39019299/idle-threads-while-new-threads-can-be-assigned-to-a-nested-loop