Multiple threads and CPU cache

清歌不尽 2020-12-02 18:05

I am implementing an image filtering operation in C using multiple threads and making it as optimized as possible. I have one question though: If a memory is accessed by thr

4 answers
  •  独厮守ぢ
    2020-12-02 18:16

    In general it is a bad idea to give threads interleaved memory regions, for example one thread processing elements 0,2,4... while the other processes 1,3,5... Neighbouring elements usually sit in the same cache line, so every write by one thread invalidates the copy held by the other thread's core. Some architectures may handle this well, but most will not, and you usually cannot control which machines your code will run on. The OS is also free to schedule your threads on any cores it likes (a single core, two cores on the same physical processor, or two cores on separate processors). Finally, each core usually has its own first-level cache, even when the cores are on the same processor.

    In most situations the 0,2,4.../1,3,5... split will degrade performance severely, possibly to the point of being slower than running on a single CPU. Herb Sutter's article "Eliminate False Sharing" demonstrates this very well.
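As a rough illustration of the false-sharing point, the sketch below compares two layouts for per-thread accumulators. It is not taken from the article mentioned above; the names `packed_slot`, `padded_slot` and `slots_per_line` are made up for this example, and the 64-byte line size is an assumption that holds on many but not all CPUs. It needs C11 for `_Alignas`.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. Check the target CPU before relying on it. */
#define CACHE_LINE 64

/* Unpadded slot: in an array, several threads' sums land in one cache line,
 * so a write by one thread invalidates the line for its neighbours. */
typedef struct {
    uint64_t sum;
} packed_slot;

/* Padded slot: the alignment requirement rounds the struct size up to a
 * whole cache line, so each array element gets a line of its own. */
typedef struct {
    _Alignas(CACHE_LINE) uint64_t sum;
} padded_slot;

/* Hypothetical helper: how many slots of a given size share one cache line. */
static size_t slots_per_line(size_t slot_size)
{
    return slot_size >= CACHE_LINE ? 1 : CACHE_LINE / slot_size;
}
```

With these definitions, an array of `packed_slot` puts eight slots in each assumed 64-byte line, while an array of `padded_slot` puts exactly one there, at the cost of more memory per slot.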

    Splitting the data into contiguous halves, [0...n/2-1] and [n/2...n-1], will scale much better on most systems. It may even lead to superlinear speedup, because the combined cache of all the CPUs can then be used. The number of threads should always be configurable and should default to the number of processor cores found.
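The contiguous split generalises from two halves to any number of threads. Below is a minimal sketch using POSIX threads, not the asker's actual code: `filter_job`, `invert_rows`, `invert_image` and `default_thread_count` are hypothetical names, the per-pixel inversion only stands in for a real filter, and `_SC_NPROCESSORS_ONLN` is a widely available `sysconf` extension rather than something every platform guarantees.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* One contiguous block of rows, [row_begin, row_end), for one thread. */
typedef struct {
    uint8_t *pixels;   /* row-major 8-bit grayscale image (assumed layout) */
    size_t width;
    size_t row_begin;
    size_t row_end;
} filter_job;

/* Placeholder filter: invert every pixel in the job's rows. */
static void *invert_rows(void *arg)
{
    filter_job *job = arg;
    for (size_t y = job->row_begin; y < job->row_end; ++y) {
        uint8_t *row = job->pixels + y * job->width;
        for (size_t x = 0; x < job->width; ++x)
            row[x] = (uint8_t)(255 - row[x]);
    }
    return NULL;
}

/* Default thread count: number of online cores, or 1 if unknown. */
static size_t default_thread_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? (size_t)n : 1;
}

/* Filter the image with `nthreads` threads (0 means "use the default").
 * Returns 0 on success, -1 if memory allocation fails. */
static int invert_image(uint8_t *pixels, size_t width, size_t height,
                        size_t nthreads)
{
    if (nthreads == 0)
        nthreads = default_thread_count();
    if (nthreads > height)
        nthreads = height ? height : 1;   /* never more threads than rows */

    pthread_t *threads = malloc(nthreads * sizeof *threads);
    filter_job *jobs = malloc(nthreads * sizeof *jobs);
    if (!threads || !jobs) {
        free(threads);
        free(jobs);
        return -1;
    }

    size_t base = height / nthreads;      /* rows every thread gets */
    size_t extra = height % nthreads;     /* first `extra` threads get one more */
    size_t row = 0, started = 0;

    for (size_t t = 0; t < nthreads; ++t) {
        size_t count = base + (t < extra ? 1 : 0);
        jobs[t] = (filter_job){ pixels, width, row, row + count };
        row += count;
        if (pthread_create(&threads[t], NULL, invert_rows, &jobs[t]) != 0) {
            /* Could not start a thread: finish all remaining rows here. */
            jobs[t].row_end = height;
            invert_rows(&jobs[t]);
            break;
        }
        ++started;
    }

    for (size_t t = 0; t < started; ++t)
        pthread_join(threads[t], NULL);

    free(threads);
    free(jobs);
    return 0;
}
```

A call such as `invert_image(img, width, height, 0)` uses one thread per detected core. Splitting by whole rows keeps each thread's writes in its own contiguous range, so threads can only touch the same cache line at the block boundaries. A filter that reads neighbouring pixels would also need a separate output buffer, which this sketch leaves out.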
