A fast hash function for strings in C#

后悔当初 2020-12-02 21:03

I want to hash strings of length up to 30. What is the best way to do that if speed is my main concern? The function will be called over 100 million times. Currently I am u

4 answers
  • 2020-12-02 21:18

    First of all, consider using GetHashCode().
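
    As a rough illustration of that suggestion, here is a minimal sketch. The class name `BuiltInHashDemo`, the method `CountReads`, and the sample strings are hypothetical. Note that `string.GetHashCode()` returns a 32-bit `int`, and on .NET Core and later its value is randomized per process, so it only suits in-memory lookups, not hashes that are stored or compared across runs.

    ```csharp
    using System;
    using System.Collections.Generic;

    static class BuiltInHashDemo
    {
        // Counts how often each string occurs. Dictionary<string, int> calls
        // string.GetHashCode() internally, so no hand-written hash is needed.
        public static Dictionary<string, int> CountReads(IEnumerable<string> reads)
        {
            var counts = new Dictionary<string, int>();
            foreach (string read in reads)
            {
                counts.TryGetValue(read, out int seen); // seen is 0 when the key is absent
                counts[read] = seen + 1;
            }
            return counts;
        }

        public static void Main()
        {
            // 32-bit result; the value may differ from one process to the next.
            Console.WriteLine("ACGTACGT".GetHashCode());
            Console.WriteLine(CountReads(new[] { "ACGT", "TTGA", "ACGT" })["ACGT"]);
        }
    }
    ```

    If a stable 64-bit value is required, one of the hand-written functions in this thread is the better fit.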

    A simple improvement on your existing implementation:

    static UInt64 CalculateHash(string read, bool lowTolerance)
    {
        UInt64 hashedValue = 0;
        int i = 0;
        ulong multiplier = 1;
        while (i < read.Length)
        {
            hashedValue += read[i] * multiplier;
            multiplier *= 37;
            if (lowTolerance) i += 2;
            else i++;
        }
        return hashedValue;
    }
    

    It avoids the expensive floating point calculation, and the overhead of ElementAt.

    Btw (UInt64)Math.Pow(31, i) doesn't work well for longer strings. Floating point rounding will lead to a multiplier of 0 for characters beyond 15 or so.
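
    The precision problem can be checked directly. The sketch below compares the cast of `Math.Pow` with an exact integer power; the names `PowCheck`, `IntegerPow31` and `FloatPow31` are made up for this illustration. A `double` carries a 53-bit mantissa, so 31^11 (an odd number larger than 2^53) already cannot be represented exactly, and 31^13 no longer fits in 64 bits at all.

    ```csharp
    using System;

    static class PowCheck
    {
        // Exact 31^exponent by repeated multiplication, wrapping modulo 2^64 on overflow.
        public static ulong IntegerPow31(int exponent)
        {
            ulong result = 1;
            unchecked
            {
                for (int k = 0; k < exponent; k++) result *= 31;
            }
            return result;
        }

        // The floating-point route: compute in double, then cast to ulong.
        public static ulong FloatPow31(int exponent)
        {
            return (ulong)Math.Pow(31, exponent);
        }

        public static void Main()
        {
            // Exponents above 12 are left out: 31^13 exceeds the ulong range, and the
            // result of casting an out-of-range double is not something to rely on.
            for (int i = 9; i <= 12; i++)
            {
                Console.WriteLine($"{i}: integer={IntegerPow31(i)} float={FloatPow31(i)}");
            }
        }
    }
    ```

    For exponents 11 and 12 the two columns should differ in the low digits, which means distinct characters late in the string no longer contribute what the formula intends.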

  • 2020-12-02 21:28

    I have played with Paul Hsieh's implementations, and they seem to be fast with few collisions (for my scenarios, anyway).

    • http://www.azillionmonkeys.com/qed/hash.html
  • 2020-12-02 21:29
    static UInt64 CalculateHash(string read)
    {
        UInt64 hashedValue = 3074457345618258791ul;
        for(int i=0; i<read.Length; i++)
        {
            hashedValue += read[i];
            hashedValue *= 3074457345618258799ul;
        }
        return hashedValue;
    }
    

    This is a Knuth hash. You can also use Jenkins.
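
    For the Jenkins option, a minimal sketch of the one-at-a-time variant might look like the following. The names `JenkinsSketch` and `OneAtATime` are hypothetical, and feeding UTF-16 code units instead of bytes is an assumption made here for simplicity (for ASCII input the two give the same result). Unlike the function above, it returns 32 bits rather than 64.

    ```csharp
    using System;

    static class JenkinsSketch
    {
        // Jenkins one-at-a-time hash, applied to each UTF-16 code unit of the string.
        public static uint OneAtATime(string text)
        {
            uint hash = 0;
            unchecked
            {
                foreach (char c in text)
                {
                    hash += c;          // mix in the next character
                    hash += hash << 10;
                    hash ^= hash >> 6;
                }
                // Final avalanche steps.
                hash += hash << 3;
                hash ^= hash >> 11;
                hash += hash << 15;
            }
            return hash;
        }

        public static void Main()
        {
            Console.WriteLine(OneAtATime("a").ToString("x8")); // prints "ca2e9442"
        }
    }
    ```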

  • 2020-12-02 21:31

    To speed up your implementation, replace the (UInt64)Math.Pow(31, i) call with a lookup: pre-calculate a table of the first 30 powers of 31, and use it at runtime. Since the limit on length is 30, you need only 31 elements:

    private static readonly ulong[] Pow31 = new ulong[31];
    
    // Static constructor; the enclosing class is assumed to be named HashCalc.
    static HashCalc() {
        Pow31[0] = 1;
        for (int i = 1 ; i != Pow31.Length ; i++) {
            Pow31[i] = 31*Pow31[i-1];   // wraps modulo 2^64 once 31^i exceeds ulong
        }
    }
    
    // In your hash function...
    hashedValue += read.ElementAt(i) * Pow31[i];
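
    Put together, the table approach could look like the sketch below. The method names `TableHash` and `RunningHash`, the `MaxLength` constant, and the length check are assumptions added for illustration; it also indexes the string with `read[i]` instead of `ElementAt`, since `ElementAt` was called out elsewhere in this thread as an overhead. `RunningHash` is included only to show that a running multiplier yields the same value without any table.

    ```csharp
    using System;

    static class HashCalc
    {
        private const int MaxLength = 30; // assumed upper bound on string length
        private static readonly ulong[] Pow31 = new ulong[MaxLength + 1];

        // Fills the table once, when the class is first used.
        static HashCalc()
        {
            Pow31[0] = 1;
            for (int i = 1; i < Pow31.Length; i++)
            {
                Pow31[i] = unchecked(31 * Pow31[i - 1]); // wraps modulo 2^64 from 31^13 on
            }
        }

        // Sum of read[i] * 31^i, using the precomputed table.
        public static ulong TableHash(string read)
        {
            if (read.Length > MaxLength)
                throw new ArgumentException("string is longer than the table", nameof(read));
            ulong hashedValue = 0;
            unchecked
            {
                for (int i = 0; i < read.Length; i++)
                {
                    hashedValue += read[i] * Pow31[i];
                }
            }
            return hashedValue;
        }

        // Same sum, computed with a running multiplier instead of a table.
        public static ulong RunningHash(string read)
        {
            ulong hashedValue = 0;
            ulong multiplier = 1;
            unchecked
            {
                foreach (char c in read)
                {
                    hashedValue += c * multiplier;
                    multiplier *= 31;
                }
            }
            return hashedValue;
        }

        public static void Main()
        {
            Console.WriteLine(TableHash("ab")); // 97*1 + 98*31 = 3135
        }
    }
    ```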
    