Dividing loop iterations among threads

广开言路 2021-01-05 03:41

I recently wrote a small number-crunching program that basically loops over an N-dimensional grid and performs some calculation at each point.

for (int i1 =          


        
8 answers
  •  感情败类
    2021-01-05 04:21

    If you have never written a multithreaded application, I'd advise you to begin with OpenMP:

    • OpenMP support ships with gcc (enable it with the -fopenmp flag)
    • it is very easy to use

    In your example, you should just have to add this pragma:

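    // "parallel for" splits the i1 iterations among the threads; a bare
    // "parallel" would make every thread run the whole loop nest.
    #pragma omp parallel for shared(histogram)
    for (int i1 = 0; i1 < N; i1++)
      for (int i2 = 0; i2 < N; i2++)
        for (int i3 = 0; i3 < N; i3++)
          for (int i4 = 0; i4 < N; i4++)
          {
            // the bins are shared, so each increment must be atomic
            #pragma omp atomic
            histogram[bin_index(i1, i2, i3, i4)] += 1;
          }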
    

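    With an omp parallel for pragma, the compiler generates the code that creates the threads, launches them, and divides the loop iterations among them. It does not automatically protect accesses to the shared histogram variable, though: concurrent updates have to be guarded explicitly, for example with an atomic directive or a reduction clause. There are a lot of options, but well-chosen pragmas do most of the work for you. Basically, how simple this is depends on the data dependencies.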
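As an illustration of handling the data dependency on the histogram, here is a minimal sketch that uses a reduction over the array, so that each thread fills a private copy and OpenMP sums the copies at the end. The values of N and NBINS and the body of bin_index are assumptions made up for the example (the question does not show them), and array-section reductions need a compiler supporting OpenMP 4.5 or later.

```c
#include <assert.h>
#include <stddef.h>

#define N 16      /* grid points per dimension (assumed for illustration) */
#define NBINS 8   /* number of histogram bins (assumed for illustration) */

/* Hypothetical stand-in for the question's bin_index(): any pure function
   of the grid coordinates that returns a value in [0, NBINS) will do. */
static int bin_index(int i1, int i2, int i3, int i4)
{
    return (i1 + i2 + i3 + i4) % NBINS;
}

/* Fill the histogram in parallel. Each thread accumulates into its own
   private copy of the bins; the copies are added together into the
   caller's array when the loop finishes, so no atomics are needed. */
void fill_histogram(long *histogram)
{
    assert(histogram != NULL);

    for (int b = 0; b < NBINS; b++)
        histogram[b] = 0;

    /* The i1 iterations are divided among the threads. */
    #pragma omp parallel for reduction(+ : histogram[:NBINS])
    for (int i1 = 0; i1 < N; i1++)
        for (int i2 = 0; i2 < N; i2++)
            for (int i3 = 0; i3 < N; i3++)
                for (int i4 = 0; i4 < N; i4++)
                    histogram[bin_index(i1, i2, i3, i4)] += 1;
}
```

Call fill_histogram on an array of NBINS longs and build with gcc -fopenmp. Without that flag the pragma is ignored and the same code runs serially, which is a convenient way to check that both versions produce the same counts.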

    Of course, the result will not be as optimal as if you had coded the threading by hand. But if you don't have a load-balancing problem, you could approach a 2x speedup. After all, this loop only writes into an array, with no spatial dependency between iterations.
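If the work per grid point does turn out to be uneven, one possible adjustment is sketched below: collapse(2) merges the two outer loops into N*N work items, and schedule(dynamic) hands them out to threads as they become free. As before, N, NBINS and the body of bin_index are assumptions made up for the example, and whether dynamic scheduling pays off depends on the real workload.

```c
#include <assert.h>
#include <stddef.h>

#define N 16      /* grid points per dimension (assumed for illustration) */
#define NBINS 8   /* number of histogram bins (assumed for illustration) */

/* Hypothetical stand-in for the question's bin_index(). */
static int bin_index(int i1, int i2, int i3, int i4)
{
    return (i1 + i2 + i3 + i4) % NBINS;
}

/* Fill the histogram with the two outer loops collapsed into one
   iteration space that is distributed dynamically among the threads. */
void fill_histogram_dynamic(long *histogram)
{
    assert(histogram != NULL);

    for (int b = 0; b < NBINS; b++)
        histogram[b] = 0;

    #pragma omp parallel for collapse(2) schedule(dynamic)
    for (int i1 = 0; i1 < N; i1++)
        for (int i2 = 0; i2 < N; i2++)
            for (int i3 = 0; i3 < N; i3++)
                for (int i4 = 0; i4 < N; i4++)
                {
                    /* The bins are shared between threads, so each
                       increment is made atomic. */
                    #pragma omp atomic
                    histogram[bin_index(i1, i2, i3, i4)] += 1;
                }
}
```

With cheap, uniform iterations like these, the default static schedule is usually the better choice, since dynamic scheduling adds some bookkeeping overhead per chunk.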
