Binary search efficiency vs. linear search efficiency in Fortran

我在风中等你 2020-12-10 06:10

This question is about the efficiency of a linear search vs. the efficiency of a binary search for a pre-sorted array in contiguous storage...

I have an application

1 answer
  •  春和景丽
    2020-12-10 06:49

    For small arrays, the cache is not the problem. You are right: a small array is likely to be cached quickly.

    The problem is that branch prediction is likely to fail for binary search: the outcome of each comparison depends on the data, so the branches are taken or skipped in an effectively random pattern. Branch mispredictions stall the CPU pipeline.

    This effect can be severe. You can easily search 3 to 8 elements linearly in the time it takes to execute a single binary search branch (and a binary search needs several of them). The exact break-even point has to be measured.
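
    To make the comparison concrete, here is a minimal sketch of the two searches. It is written in C purely for illustration; the same loop structure carries over to Fortran. The function names `linear_lower_bound` and `binary_lower_bound` are made up for this example, and both return the index of the first element that is not less than the key (or `n` if there is none).

```c
#include <assert.h>
#include <stddef.h>

/* Linear scan over a sorted array: index of the first element >= key,
   or n if every element is smaller. The loop branch goes the same way
   on every iteration except the last, so it is easy to predict. */
size_t linear_lower_bound(const int *a, size_t n, int key)
{
    assert(a != NULL || n == 0);
    size_t i = 0;
    while (i < n && a[i] < key)
        ++i;
    return i;
}

/* Classic binary search with the same contract. Which way the `if`
   goes depends on the data, which is what makes it hard to predict.
   (A compiler may or may not turn it into a conditional move.) */
size_t binary_lower_bound(const int *a, size_t n, int key)
{
    assert(a != NULL || n == 0);
    size_t lo = 0, hi = n;              /* the answer lies in [lo, hi] */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < key)
            lo = mid + 1;               /* answer is right of mid */
        else
            hi = mid;                   /* answer is mid or left of it */
    }
    return lo;
}
```

    Timing both functions on arrays of the sizes you actually use is the way to find the break-even point on your hardware.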

    Stalling the CPU pipeline is extremely expensive. A Core i7 can retire up to 4 instructions per clock cycle (12 giga-instructions per second at 3 GHz!), but only if the pipeline is not stalling.

    There are branch-free algorithms that do binary search using conditional-move CPU instructions. These algorithms basically unroll 32 search steps and use a CMOV in each step (32 steps is the theoretical maximum for a 32-bit index). They are branch-free but not stall-free: each step depends entirely on the result of the previous one, so the CPU cannot run ahead in the instruction stream and has to wait every time. So they do not solve the problem; they only improve it slightly.
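
    A rough sketch of the branch-free idea, again in C for illustration. This version is a loop rather than the fully unrolled 32-step form described above, and the name `branchless_lower_bound` is made up for this example. Whether the ternary actually becomes a CMOV is up to the compiler and the optimization flags, so treat that as an assumption to check in the generated assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the first element >= key in a sorted array, or n if none.
   The only branch is the loop test, which depends on n alone and not
   on the data. The update of `base` is a select that a compiler can
   emit as a conditional move. */
size_t branchless_lower_bound(const int *a, size_t n, int key)
{
    assert(a != NULL || n == 0);
    const int *base = a;                /* start of the current window */
    size_t len = n;                     /* number of elements in it    */
    while (len > 1) {
        size_t half = len / 2;
        /* If the last element of the first half is < key, the answer
           cannot be in that half, so skip it. Each iteration needs the
           `base` computed by the previous one: a serial dependency. */
        base += (base[half - 1] < key) ? half : 0;
        len -= half;
    }
    /* One candidate left (or none when n == 0). */
    return (size_t)(base - a) + (len == 1 && base[0] < key);
}
```

    The dependency chain through `base` is the reason this variant still waits on every step: the address of the next load is unknown until the previous comparison has finished.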
