This question is about the efficiency of a linear search vs. a binary search on a pre-sorted array in contiguous storage...
For small arrays, the problem is not caching. You are right: a small array is likely to be cached quickly.
The problem is that branch prediction is likely to fail for binary search, because its branches are taken or not taken essentially at random, in a data-dependent way. Branch mispredictions stall the CPU pipeline.
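To make the failure mode concrete, here is a minimal sketch of a textbook lower-bound binary search (my code, not from the question): the marked comparison goes left or right depending on the data, so for random keys the predictor is wrong roughly half the time.

```cpp
#include <cstddef>

// Textbook lower-bound binary search over a sorted array.
std::size_t binary_search(const int* a, std::size_t n, int key) {
    std::size_t lo = 0, hi = n;
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < key)   // data-dependent branch: for random keys it is
            lo = mid + 1;   // taken ~50% of the time, so the predictor has
        else                // nothing to learn and mispredicts often
            hi = mid;
    }
    return lo;              // index of the first element >= key
}
```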
This effect can be severe. You can easily search 3 to 8 elements linearly in the same time it takes to execute a single binary-search branch (and you need several of them). The exact break-even point has to be measured.
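For comparison, a minimal linear-search sketch (again my code, with hypothetical names): because the array is sorted, the loop condition evaluates to true, true, ..., true, then false, so the predictor misses at most once per search instead of once per step.

```cpp
#include <cstddef>

// Linear lower-bound search; attractive for small n.
std::size_t linear_search(const int* a, std::size_t n, int key) {
    std::size_t i = 0;
    while (i < n && a[i] < key)  // almost always "keep going": the branch
        ++i;                     // pattern is trivially predictable
    return i;                    // index of the first element >= key
}
```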
Stalling the CPU pipeline is extremely expensive. A Core i7 can retire up to 4 instructions per clock cycle (12 giga-instructions per second at 3 GHz!), but only if the pipeline is not stalled.
There are branch-free algorithms that do binary search using conditional-move (CMOV) CPU instructions. These algorithms basically unroll all 32 search steps (32 being the theoretical maximum, for 2^32 elements) and use a CMOV in each step. They are branch-free but not stall-free: each step depends entirely on the result of the previous one, so the CPU cannot charge ahead in the instruction stream; it has to wait at every step. They don't solve the problem, they only improve it slightly.
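Here is a sketch of the idea, assuming the compiler lowers the conditional expression to a CMOV (common with GCC and Clang, but not guaranteed): there is no data-dependent branch to mispredict, yet each iteration still depends on the previous iteration's result, so the serial dependency chain remains.

```cpp
#include <cstddef>

// Branch-free binary search in the style described above. Caveat: for a
// key greater than every element this returns the last index, so the
// caller must check a[result] if that case matters. Requires n >= 1.
std::size_t branchless_search(const int* a, std::size_t n, int key) {
    const int* base = a;
    std::size_t len = n;
    while (len > 1) {
        std::size_t half = len / 2;
        // Conditional expression instead of an if: compilers typically
        // emit a CMOV here, removing the unpredictable branch...
        base = (base[half - 1] < key) ? base + half : base;
        // ...but the next iteration still has to wait for this result,
        // so the pipeline stalls on the dependency chain instead.
        len -= half;
    }
    return static_cast<std::size_t>(base - a);
}
```

Note that the remaining loop branch depends only on n, not on the data, so it stays perfectly predictable; what limits this version is the load-to-use dependency between steps, exactly as described above.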