How fast can you make linear search?

死守一世寂寞 2020-12-23 21:46

I'm looking to optimize this linear search:

static int
linear (const int *arr, int n, int key)
{
        int i = 0;
        while (i < n) {
                      


        
20 answers
  •  再見小時候
    2020-12-23 22:25

    This answer is a little more obscure than my other one, so I'm posting it separately. It relies on the fact that C's relational operators yield an int that is exactly 0 for false and 1 for true. x86 can produce such values without branching, so this might be faster, but I haven't tested it. Micro-optimizations like these are highly dependent on your processor and compiler.
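
To make the 0/1 point concrete, here is a minimal sketch that counts the elements below a key by adding up comparison results instead of branching on them. The helper name `count_less` is made up for illustration, and whether the compiler actually emits branch-free machine code for it depends on the compiler and its flags.

```c
#include <assert.h>

/* Hypothetical helper, not part of the answer: counts how many of the
 * first n elements are less than key. In C, (a < b) has type int and
 * evaluates to exactly 0 or 1, so it can be added to a counter directly. */
static int
count_less (const int *arr, int n, int key)
{
        int count = 0;
        for (int i = 0; i < n; i++)
                count += (arr[i] < key);   /* adds 1 or 0, no if needed */
        return count;
}
```

On a sorted array this count is also the index of the first element that is not less than the key, which is the idea the unrolled search builds on.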

    As before, the caller is responsible for putting a sentinel value at the end of the array to ensure that the loop terminates.

    Determining the optimum amount of loop unrolling takes some experimentation. You want to find the point of diminishing (or negative) returns. I'm going to take a SWAG and try 8 this time.

    static int
    linear (const int *arr, int n, int key)
    {
            assert(arr[n] >= key);
            int i = 0;
            while (arr[i] < key) {
                    i += (arr[i] < key);
                    i += (arr[i] < key);
                    i += (arr[i] < key);
                    i += (arr[i] < key);
                    i += (arr[i] < key);
                    i += (arr[i] < key);
                    i += (arr[i] < key);
                    i += (arr[i] < key);
            }
            return i;
    }
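
A sketch of how a caller might set up the sentinel, assuming the data is sorted in ascending order and that copying it into a padded buffer is acceptable. The names `with_sentinels` and `linear_unrolled` are hypothetical; the search body is the one from the answer, repeated so the example compiles on its own.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies n sorted ints into a new buffer with `pad`
 * extra slots set to INT_MAX, which is >= any possible int key.
 * The caller frees the result. Returns NULL if allocation fails. */
static int *
with_sentinels (const int *src, int n, int pad)
{
        int *buf = malloc ((size_t) (n + pad) * sizeof *buf);
        if (buf == NULL)
                return NULL;
        memcpy (buf, src, (size_t) n * sizeof *buf);
        for (int k = 0; k < pad; k++)
                buf[n + k] = INT_MAX;
        return buf;
}

/* The unrolled search from the answer: returns the index of the first
 * element >= key, or n if every real element is smaller. */
static int
linear_unrolled (const int *arr, int n, int key)
{
        assert (arr[n] >= key);            /* sentinel must be in place */
        int i = 0;
        while (arr[i] < key) {
                i += (arr[i] < key);
                i += (arr[i] < key);
                i += (arr[i] < key);
                i += (arr[i] < key);
                i += (arr[i] < key);
                i += (arr[i] < key);
                i += (arr[i] < key);
                i += (arr[i] < key);
        }
        return i;
}
```

With the data {2, 4, 6, 8, 10} padded by one sentinel, a search for 7 stops at index 3: once a comparison comes out false, the remaining steps add 0 and i stays put.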
    

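    Edit: As Mark points out, this function makes each line depend on the line preceding it, which limits the ability of the processor pipeline to run operations in parallel. So let's try a small modification to remove the dependency: every comparison now indexes from a fixed j, so the loads and compares no longer wait on the updated i. Adding up the results still lands on the first element >= key as long as the array is sorted in ascending order, because the elements below key then form a prefix of each group of 8. Since each pass reads arr[j] through arr[j+7], the function now does indeed require 8 sentinel elements at the end.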

    static int 
    linear (const int *arr, int n, int key) 
    { 
            assert(arr[n] >= key);
            assert(arr[n+7] >= key);
            int i = 0; 
            while (arr[i] < key) {
                    int j = i;
                    i += (arr[j] < key); 
                    i += (arr[j+1] < key); 
                    i += (arr[j+2] < key); 
                    i += (arr[j+3] < key); 
                    i += (arr[j+4] < key); 
                    i += (arr[j+5] < key); 
                    i += (arr[j+6] < key); 
                    i += (arr[j+7] < key); 
            }
            return i;
    } 
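
Since the answer says the code is untested, a quick cross-check against a plain loop seems worthwhile. The sketch below assumes a sorted array padded with 8 `INT_MAX` sentinels; `linear_plain`, `linear_indep`, `versions_agree`, and `PAD` are names invented here, and the second function is the answer's code repeated so the example is self-contained.

```c
#include <assert.h>
#include <limits.h>

enum { PAD = 8 };   /* number of sentinel slots, as the answer requires */

/* Reference: plain loop returning the first index with arr[i] >= key,
 * or n if there is none. */
static int
linear_plain (const int *arr, int n, int key)
{
        int i = 0;
        while (i < n && arr[i] < key)
                i++;
        return i;
}

/* The answer's dependency-free version. */
static int
linear_indep (const int *arr, int n, int key)
{
        assert (arr[n] >= key);
        assert (arr[n + 7] >= key);
        int i = 0;
        while (arr[i] < key) {
                int j = i;                 /* fixed base for this pass */
                i += (arr[j] < key);
                i += (arr[j + 1] < key);
                i += (arr[j + 2] < key);
                i += (arr[j + 3] < key);
                i += (arr[j + 4] < key);
                i += (arr[j + 5] < key);
                i += (arr[j + 6] < key);
                i += (arr[j + 7] < key);
        }
        return i;
}

/* Hypothetical self-check: returns 1 if both versions agree for every
 * key from 0 to 100 on a small sorted array (with a duplicate), else 0. */
static int
versions_agree (void)
{
        int buf[10 + PAD] = {1, 3, 3, 5, 8, 13, 21, 34, 55, 89};
        for (int k = 0; k < PAD; k++)
                buf[10 + k] = INT_MAX;
        for (int key = 0; key <= 100; key++)
                if (linear_indep (buf, 10, key) != linear_plain (buf, 10, key))
                        return 0;
        return 1;
}
```

This only checks correctness. Whether the independent loads actually run faster than the chained version is something to measure on the target processor and compiler.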
    
