False sharing in OpenMP loop array access

Posted by 旧时模样 on 2019-12-24 03:21:05

Question


I would like to take advantage of OpenMP to make my task parallel.

I need to subtract the same quantity from every element of an array and write the result to another array. Both arrays are dynamically allocated with malloc, and the first one is filled with values read from a file. Each element is of type uint64_t.

#pragma omp parallel for
for (uint64_t i = 0; i < size; ++i) {
    new_vec[i] = vec[i] - shift;
}

Here shift is the fixed value I want to subtract from every element of vec, and size is the length of both vec and new_vec, approximately 200k.

I compile the code with g++ -fopenmp on Arch Linux. I'm on an Intel Core i7-6700HQ, and I use 8 threads. The OpenMP version takes 5 to 6 times longer to run than the serial one, even though I can see that all the cores are busy while it runs.

I think this might be caused by false sharing, but I can't find where it occurs.


Answer 1:


You should adjust how the iterations are split among the threads. With schedule(static,chunk_size) you are able to do so.

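Use chunk_size values that are multiples of 64/sizeof(uint64_t), i.e. 8 elements per 64-byte cache line. Otherwise chunk boundaries fall in the middle of a cache line and two threads end up writing to the same line, which is the false sharing you suspect:

[ cache line n   ][ cache line n+1 ]
[ chunk 0  ][ chunk 1  ][ chunk 2  ]

Instead, aim for something like this: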

[ cache line n   ][ cache line n+1 ][ cache line n+2 ][...]
[ chunk 0                          ][ chunk 1             ]
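As a rough sketch of this advice, the loop from the question could be written with an explicit chunk size. The function name shift_array and the constants kCacheLineBytes, kElemsPerLine and kChunk are made up for illustration, and the 64-byte line size is an assumption taken from the 64/sizeof(uint64_t) figure above. The chunks only line up with cache lines if both arrays start on a cache-line boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines, as implied by 64/sizeof(uint64_t).
constexpr std::size_t kCacheLineBytes = 64;
constexpr int kElemsPerLine =
    static_cast<int>(kCacheLineBytes / sizeof(std::uint64_t));  // 8 elements

// Hypothetical helper: writes vec[i] - shift into new_vec[i] for all i.
// Compile with -fopenmp; without it the pragma is ignored and the loop
// simply runs serially.
void shift_array(const std::uint64_t* vec, std::uint64_t* new_vec,
                 std::uint64_t size, std::uint64_t shift) {
    assert(vec != nullptr && new_vec != nullptr);

    // Chunk size is a whole number of cache lines (128 lines = 1024 elements),
    // so no two threads write to the same line when new_vec is line-aligned.
    constexpr int kChunk = 128 * kElemsPerLine;

    #pragma omp parallel for schedule(static, kChunk)
    for (std::uint64_t i = 0; i < size; ++i) {
        new_vec[i] = vec[i] - shift;
    }
}
```

The value 128 is arbitrary. Any multiple of kElemsPerLine satisfies the rule, and the best value is something to measure rather than assume.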

You should also allocate your arrays so that they are aligned to cache lines. That way the first chunk, and therefore every subsequent chunk, starts on a cache-line boundary.

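#include <stdlib.h>  /* aligned_alloc (C11); size must be a multiple of alignment */
#include <unistd.h>  /* sysconf; _SC_LEVEL1_DCACHE_LINESIZE is a glibc extension */
#define CACHE_LINE_SIZE sysconf(_SC_LEVEL1_DCACHE_LINESIZE)
uint64_t *vec = (uint64_t *)aligned_alloc(CACHE_LINE_SIZE /*alignment*/, 200000 * sizeof(uint64_t) /*size*/);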
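A slightly more defensive version of the allocation might look like the sketch below. The helper names cache_line_size and alloc_cache_aligned are hypothetical, and the fallback to 64 bytes when sysconf reports nothing useful is an assumption, not something the answer specifies. It relies on glibc's _SC_LEVEL1_DCACHE_LINESIZE, so it is Linux-specific.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // aligned_alloc, free
#include <unistd.h>  // sysconf

// Returns the L1 data cache line size in bytes.
// Assumption: fall back to 64 if sysconf returns 0 or -1.
std::size_t cache_line_size() {
    long n = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
    return n > 0 ? static_cast<std::size_t>(n) : 64;
}

// Hypothetical helper: allocates `count` uint64_t values starting on a
// cache-line boundary. Returns nullptr on failure; release with free().
std::uint64_t* alloc_cache_aligned(std::size_t count) {
    const std::size_t line = cache_line_size();
    assert((line & (line - 1)) == 0);  // alignment must be a power of two

    // aligned_alloc expects the size to be a multiple of the alignment,
    // so round the byte count up to the next full cache line.
    std::size_t bytes = count * sizeof(std::uint64_t);
    bytes = (bytes + line - 1) / line * line;

    return static_cast<std::uint64_t*>(aligned_alloc(line, bytes));
}
```

Both vec and new_vec would be allocated this way, for example alloc_cache_aligned(200000), and released with free() when no longer needed.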

Your problem is very similar to what the STREAM Triad benchmark measures. Look at how that benchmark is optimized and you will be able to apply almost exactly the same optimizations to your code.
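For comparison, a Triad-style kernel has the shape sketched below. This is a simplified illustration rather than the benchmark's actual source, and the function name triad is made up. The structure matches the loop in the question: each iteration reads from input arrays and writes one element of an output array, with no dependence between iterations.

```cpp
#include <cstddef>

// Triad-style kernel: a[i] = b[i] + scalar * c[i].
// Compile with -fopenmp; without it the loop runs serially.
void triad(double* a, const double* b, const double* c,
           double scalar, std::size_t n) {
    #pragma omp parallel for schedule(static)
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = b[i] + scalar * c[i];
    }
}
```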



Source: https://stackoverflow.com/questions/45032586/false-sharing-in-openmp-loop-array-access
