Firstly, I know this [type of] question is frequently asked, so let me preface this by saying I've read as much as I can, and I still don't know what the deal is.
It's hard to know for sure what is happening without significant profiling, but the performance curve seems indicative of false sharing: the threads use different objects, but those objects happen to be close enough in memory that they fall on the same cache line, and the cache system treats them as a single lump that is effectively protected by a hardware write lock that only one core can hold at a time.
There's a great article on the topic at Dr. Dobb's:
http://www.drdobbs.com/go-parallel/article/217500206?pgno=1
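To make the effect concrete, here is a minimal sketch (not taken from the question's code) that contrasts two counters sharing a cache line with two counters padded onto separate lines. It assumes a 64-byte cache line and a C++11 compiler; the struct and function names are made up for illustration. On most multicore machines the padded version runs noticeably faster.

```cpp
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>

struct Shared {
    std::atomic<long> a{0};   // a and b sit next to each other,
    std::atomic<long> b{0};   // so they typically share one cache line
};

struct Padded {
    alignas(64) std::atomic<long> a{0};   // force each counter onto
    alignas(64) std::atomic<long> b{0};   // its own 64-byte cache line
};

// Two threads hammer on two *different* counters; only the layout differs.
template <typename T>
long long time_increments(T& s) {
    auto start = std::chrono::steady_clock::now();
    std::thread t1([&] { for (int i = 0; i < 10000000; ++i) s.a.fetch_add(1, std::memory_order_relaxed); });
    std::thread t2([&] { for (int i = 0; i < 10000000; ++i) s.b.fetch_add(1, std::memory_order_relaxed); });
    t1.join();
    t2.join();
    auto elapsed = std::chrono::steady_clock::now() - start;
    return static_cast<long long>(
        std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count());
}

int main() {
    Shared shared;
    Padded padded;
    std::printf("same cache line: %lld ms\n", time_increments(shared));
    std::printf("separate lines:  %lld ms\n", time_increments(padded));
}
```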
In particular, the fact that the routines do a lot of malloc/free could lead to this: with the default allocator, blocks handed out to different threads can end up adjacent in memory and therefore on the same cache line.
One solution is to use a pool-based memory allocator rather than the default allocator, so that each thread tends to allocate memory from a different physical address range.
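As a rough illustration of the idea (not a production allocator, and not from the question's code), here is a toy bump allocator that gives each thread its own block, so allocations made by different threads land in different address ranges and are rounded up to cache-line size. The function name and block size are invented for the example; in practice you'd more likely reach for an allocator with per-thread caches such as tcmalloc, jemalloc, or Hoard.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hands out cache-line-sized chunks from a per-thread 1 MiB block, so
// objects used by different threads never share a cache line.
// This toy version never frees; a real pool would recycle its blocks.
void* thread_local_alloc(std::size_t n) {
    constexpr std::size_t kBlockSize = 1 << 20;   // 1 MiB per thread (arbitrary)
    constexpr std::size_t kCacheLine = 64;        // assumed cache-line size

    thread_local char* block = static_cast<char*>(std::malloc(kBlockSize));
    thread_local std::size_t offset = 0;

    // round the request up to a whole number of cache lines
    n = (n + kCacheLine - 1) & ~(kCacheLine - 1);
    if (block == nullptr || offset + n > kBlockSize)
        throw std::bad_alloc();

    void* p = block + offset;
    offset += n;
    return p;
}

int main() {
    // each thread that calls thread_local_alloc() gets pointers
    // inside its own block, far from other threads' data
    int* x = static_cast<int*>(thread_local_alloc(sizeof(int)));
    *x = 42;
}
```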