Openmp nested loop
just playing around with openmp. Look at this code fragments: #pragma omp parallel { for( i =0;i<n;i++) { doing something } } and for( i =0;i<n;i++) { #pragma omp parallel { doing something } } Why is the first one a lot more slower (around the factor 5) than the second one? From theory I thought that the first one must be faster, because the parallel region is only created once and not n-times like the second? Can someone explain this to me? The code i want to parallelise has the following structure: for(i=0;i<n;i++) //wont be parallelizable { for(j=i+1;j<n;j++) //will be parallelized { doing