Prefetching data to cache for x86-64

寵の児 submitted on 2019-12-18 12:16:58

Question


In my application, at one point I need to perform calculations on a large contiguous block of memory (hundreds of MB). My idea is to keep prefetching the part of the block that my program will touch next, so that the data is already in the cache when I perform calculations on that portion.

Can someone give me a simple example of how to achieve this with gcc? I read about _mm_prefetch somewhere, but I don't know how to use it properly. Also note that I have a multicore system, but each core will be working on a different region of memory in parallel.


Answer 1:


gcc exposes low-level instructions through builtin functions; for your case the relevant one is __builtin_prefetch. However, you should only expect a measurable difference when the access pattern is not easy for the hardware to predict automatically.
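As a rough illustration, here is a minimal sketch of __builtin_prefetch on an indexed (gather) access pattern, which is the kind of pattern a hardware prefetcher may struggle to predict. The function name gather_sum and the PREFETCH_DISTANCE value are assumptions made up for this example, not something from the question; a good distance depends on the machine and has to be found by measurement.

```c
#include <stddef.h>

/* Hypothetical tuning knob: how many iterations ahead to prefetch. */
#define PREFETCH_DISTANCE 16

/* Sum data[idx[i]] for i in [0, n). The indices may jump around,
 * so the addresses touched are hard to predict from the outside. */
double gather_sum(const double *data, const size_t *idx, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Second argument 0 = prefetch for reading,
             * third argument 3 = high temporal locality. */
            __builtin_prefetch(&data[idx[i + PREFETCH_DISTANCE]], 0, 3);
        }
        sum += data[idx[i]];
    }
    return sum;
}
```

The prefetch is only a hint: it does not change the result, and the compiler or CPU is free to ignore it. On x86, _mm_prefetch from <xmmintrin.h> (for example with the _MM_HINT_T0 hint) is the intrinsic counterpart and could be used in the same place.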




Answer 2:


Modern CPUs have pretty good automatic prefetch and you may well find that you do more harm than good if you try to initiate software prefetching. There is most likely a lot more "low hanging fruit" that you can focus on for optimisation if you find that you actually have a performance problem. Prefetch tends to be one of the last things that you might try, when you're desperate for a few more percent throughput.
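Since software prefetch can hurt as easily as help, one way to check is to time the same loop with and without it. The following is only a sketch under stated assumptions: sum_plain, sum_prefetch, time_call and the AHEAD distance are hypothetical names and values chosen for illustration, and clock() is a coarse timer, so a real comparison would need a large buffer and several repetitions.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical prefetch distance in elements; tune by measurement. */
#define AHEAD 64

/* Baseline: plain sequential sum, relying on hardware prefetch only. */
double sum_plain(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Same sum with an explicit read prefetch AHEAD elements in front.
 * Locality 0 marks the data as streaming (little reuse expected). */
double sum_prefetch(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + AHEAD < n)
            __builtin_prefetch(&a[i + AHEAD], 0, 0);
        s += a[i];
    }
    return s;
}

/* Run fn once, store its result in *out, return elapsed CPU seconds. */
double time_call(double (*fn)(const double *, size_t),
                 const double *a, size_t n, double *out)
{
    clock_t start = clock();
    *out = fn(a, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_call with each function on the real data and comparing the two timings shows whether the prefetch version is worth keeping. For a purely sequential scan like this one, it would not be surprising if the two are indistinguishable or the plain version wins.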



Source: https://stackoverflow.com/questions/10323420/prefetching-data-to-cache-for-x86-64
