How do I generate a hashcode from a byte array in C#?

故里飘歌 2020-11-27 14:21

Say I have an object that stores a byte array and I want to be able to efficiently generate a hashcode for it. I've used the cryptographic hash functions for this in the pa

11 answers
  •  悲哀的现实
    2020-11-27 14:54

    I found interesting results:

    I have the class:

    public class MyHash : IEquatable<MyHash>
    {        
        public byte[] Val { get; private set; }
    
        public MyHash(byte[] val)
        {
            Val = val;
        }
    
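        /// <summary>
        /// Tests whether this instance holds the same bytes as another instance.
        /// </summary>
        /// <param name="other">The instance to compare against.</param>
        /// <returns>true if both arrays have the same length and contents.</returns>
        public bool Equals(MyHash other)
        {
            // Guard against null so a null argument returns false instead of throwing.
            if (other != null && other.Val.Length == this.Val.Length)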
            {
                for (var i = 0; i < this.Val.Length; i++)
                {
                    if (other.Val[i] != this.Val[i])
                    {
                        return false;
                    }
                }
    
                return true;
            }
            else
            {
                return false;
            }            
        }
    
        public override int GetHashCode()
        {            
            var str = Convert.ToBase64String(Val);
            return str.GetHashCode();          
        }
    }
    

    Then I created a dictionary keyed by MyHash to measure how fast items can be inserted and how many collisions occur. I did the following:

            // dictionary we use to check for collisions
            Dictionary<MyHash, bool> checkForDuplicatesDic = new Dictionary<MyHash, bool>();
    
            // used to generate random arrays
            Random rand = new Random();
    
    
    
            var now = DateTime.Now;
    
            for (var j = 0; j < 100; j++)
            {
                for (var i = 0; i < 5000; i++)
                {
                    // create new array and populate it with random bytes
                    byte[] randBytes = new byte[byte.MaxValue];
                    rand.NextBytes(randBytes);
    
                    MyHash h = new MyHash(randBytes);
    
                    if (checkForDuplicatesDic.ContainsKey(h))
                    {
                        Console.WriteLine("Duplicate");
                    }
                    else
                    {
                        checkForDuplicatesDic[h] = true;
                    }
                }
                Console.WriteLine(j);
                checkForDuplicatesDic.Clear(); // clear dictionary every 5000 iterations
            }
    
            var elapsed = DateTime.Now - now;
    
            Console.Read();
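
    One caveat about the loop above: ContainsKey returns true only when an existing key has the same hash code and Equals also returns true, so the "Duplicate" branch fires for identical arrays rather than for hash collisions between different arrays. A minimal sketch of counting collisions directly follows. CollisionCounter, CountCollisions, Time and the hashFunc parameter are hypothetical names that are not part of the original answer:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class CollisionCounter
{
    // Counts inputs whose hash code was already produced by an earlier input.
    // hashFunc is the candidate hash function under test (hypothetical parameter).
    // Note: two identical input arrays are counted as a collision as well.
    public static int CountCollisions(IEnumerable<byte[]> inputs, Func<byte[], int> hashFunc)
    {
        var seen = new HashSet<int>();
        int collisions = 0;

        foreach (var bytes in inputs)
        {
            // HashSet<int>.Add returns false when the value is already present.
            if (!seen.Add(hashFunc(bytes)))
            {
                collisions++;
            }
        }

        return collisions;
    }

    // Measures how long an action takes using Stopwatch,
    // which is intended for measuring elapsed time.
    public static TimeSpan Time(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

    Passing the same set of random arrays to CountCollisions once per candidate GetHashCode body would compare the candidates on identical input, and wrapping each call in Time would give a timing for each.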
    

    Every time I insert a new item, the dictionary calculates the hash of that object. So you can tell which method is most efficient by placing each of the answers found here into the method public override int GetHashCode(). The method that was by far the fastest and had the fewest collisions was:

        public override int GetHashCode()
        {            
            var str = Convert.ToBase64String(Val);
            return str.GetHashCode();          
        }
    

    that took 2 seconds to execute. The method

        public override int GetHashCode()
        {
            // 7.1 seconds
            unchecked
            {
                const int p = 16777619;
                int hash = (int)2166136261;
    
                for (int i = 0; i < Val.Length; i++)
                    hash = (hash ^ Val[i]) * p;
    
                hash += hash << 13;
                hash ^= hash >> 7;
                hash += hash << 3;
                hash ^= hash >> 17;
                hash += hash << 5;
                return hash;
            }
        }
    

    also had no collisions, but it took 7 seconds to execute!
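
    For comparison, newer runtimes offer a built-in way to hash a span of bytes without allocating an intermediate string. A minimal sketch, assuming .NET 6 or later (where HashCode.AddBytes is available); ByteArrayHashing and Compute are hypothetical names, and this variant was not part of the timings above:

```csharp
using System;

public static class ByteArrayHashing
{
    // Hypothetical helper: hashes the whole array with System.HashCode.
    // Assumes .NET 6 or later, where HashCode.AddBytes exists.
    public static int Compute(byte[] bytes)
    {
        if (bytes is null)
        {
            throw new ArgumentNullException(nameof(bytes));
        }

        var hashCode = new HashCode();
        hashCode.AddBytes(bytes); // byte[] converts implicitly to ReadOnlySpan<byte>
        return hashCode.ToHashCode();
    }
}
```

    Inside MyHash this could be used as public override int GetHashCode() => ByteArrayHashing.Compute(Val);. HashCode is seeded randomly per process, so the resulting values are only suitable for in-memory collections and should not be persisted or compared across runs.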
