Improving memory layout for parallel computing

迷失自我 2021-01-07 01:46

I'm trying to optimize an algorithm (Lattice Boltzmann) for parallel computing using C++ AMP, and I'm looking for suggestions on optimizing the memory layout. Just found out

4 answers
  •  耶瑟儿~
    2021-01-07 02:27

    In general, make sure that data used on different CPUs is not shared (easy) and does not sit on the same cache line (false sharing; see, for example, "False Sharing is No Fun"). Data used by the same CPU should be close together in memory to benefit from caches.
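
    To make both points concrete, here is a minimal sketch in plain C++17 with std::thread (not C++ AMP). It assumes 64-byte cache lines, which is common on x86-64 but not guaranteed; `kCacheLineSize`, `PaddedCounter` and `parallel_sum` are hypothetical names made up for this illustration. Each thread reads one contiguous chunk (locality) and writes only to its own cache-line-sized slot (no false sharing).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Assumption: 64-byte cache lines. Check your target hardware.
constexpr std::size_t kCacheLineSize = 64;

// alignas pads each counter to a full cache line, so counters written by
// different threads never share a line.
struct alignas(kCacheLineSize) PaddedCounter {
    std::uint64_t value = 0;
};

static_assert(sizeof(PaddedCounter) == kCacheLineSize,
              "each counter should occupy exactly one cache line");

// Hypothetical helper: every thread sums its own contiguous chunk of `data`
// into its own padded counter; the partial sums are combined after joining.
std::uint64_t parallel_sum(const std::vector<std::uint32_t>& data,
                           unsigned num_threads) {
    if (num_threads == 0) num_threads = 1;

    std::vector<PaddedCounter> partial(num_threads);
    std::vector<std::thread> workers;

    // Round up so that the chunks together cover the whole input.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;
    assert(chunk * num_threads >= data.size());

    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            // Contiguous range per thread: reads stay local to this thread.
            const std::size_t begin = std::min(data.size(), t * chunk);
            const std::size_t end = std::min(data.size(), begin + chunk);
            for (std::size_t i = begin; i < end; ++i) {
                partial[t].value += data[i];  // writes hit a private line
            }
        });
    }
    for (auto& w : workers) w.join();

    std::uint64_t total = 0;
    for (const auto& p : partial) total += p.value;
    return total;
}
```

    Calling `parallel_sum(data, 4)` splits `data` into four contiguous chunks. Without the `alignas`, the four 8-byte counters would fit in a single cache line and the threads would keep invalidating each other's copy of it.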
