Use of OpenMP chunk to break cache
Question: I've been trying to improve the performance of my OpenMP solution, which often has to deal with nested loops over arrays. Although I've managed to bring the runtime down from 59 seconds for the serial implementation to 37 seconds (on an ageing dual-core Intel T6600), I'm worried that cache synchronization is taking up a lot of CPU time (when the CPU should be solving my problem!). I've been struggling to set up the profiler, so I haven't verified that suspicion, but my question stands regardless. According to this lecture on load