The prefetch instruction

心在旅途 2020-12-14 02:14

It appears that the general rule for prefetch usage is this: a prefetch can be added provided the code is kept busy with other processing until the prefetch instruction completes its operation.

3 answers
  •  一个人的身影
    2020-12-14 02:55

    To even consider prefetching, code performance must already be an issue.

    1: Use a code profiler. Trying to use prefetch without a profiler is a waste of time.

    2: Whenever you find an instruction in a critical place that is anomalously slow, you have a candidate for a prefetch. Often the actual problem is the memory access on the line before the slow one, rather than the line the profiler flags as slow. Work out which memory access is causing the problem (not always easy) and prefetch it.

    3: Run your profiler again and see if it made any difference. If it didn't, take it out. On occasion I have sped up loops by more than 300% this way. It's generally most effective when you have a loop accessing memory in a non-sequential way.

    I disagree completely about it being less useful on modern CPUs; I have found quite the opposite. On older CPUs, prefetching about 100 instructions ahead was optimal; these days I'd put that number closer to 500.
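
As a rough illustration of the kind of loop described above, here is a minimal sketch of a non-sequential (indexed) read with a software prefetch issued a fixed number of iterations ahead. It assumes GCC or Clang, whose `__builtin_prefetch` builtin is used here. The names `gather_sum` and `PREFETCH_DISTANCE` are hypothetical, and the distance of 16 iterations is only a placeholder: the right value depends on the CPU and the loop body, so it should be tuned with a profiler rather than taken from this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuning knob: how many loop iterations ahead to prefetch.
 * 16 is a placeholder, not a recommendation; measure and adjust. */
#ifndef PREFETCH_DISTANCE
#define PREFETCH_DISTANCE 16
#endif

/* Sums table[idx[i]] for i in [0, n). The reads of table[] jump around
 * according to idx[], which hardware prefetchers cannot usually predict. */
uint64_t gather_sum(const uint32_t *table, const size_t *idx, size_t n)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* GCC/Clang builtin. Arguments: address, rw (0 = read),
             * temporal locality hint (0 = none .. 3 = high). It is only
             * a hint and does not change the result of the program. */
            __builtin_prefetch(&table[idx[i + PREFETCH_DISTANCE]], 0, 1);
        }
        sum += table[idx[i]];
    }
    return sum;
}
```

The idea is that the work done in the next `PREFETCH_DISTANCE` iterations keeps the CPU busy while the requested cache line is fetched. Whether this helps at all should be checked by profiling the loop with and without the prefetch, as in step 3.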
