Hash function that produces short hashes?

走了就别回头了 2020-12-07 23:58

Is there a way of encryption that can take a string of any length and produce a sub-10-character hash? I want to produce reasonably unique IDs, but based on message content.

10 answers
  •  悲哀的现实
    2020-12-08 00:18

    I needed something along the lines of a simple string reduction function recently. Basically, the code looked something like this (C/C++ code ahead):

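    #include <cstddef>  // size_t
    #include <cstring>  // memset

    size_t ReduceString(char *Dest, size_t DestSize, const char *Src, size_t SrcSize, bool Normalize)
    {
        size_t x, x2 = 0, y, z = 0;
    
        memset(Dest, 0, DestSize);
    
        // Fold each source byte into Dest, wrapping at DestSize - 1 so the
        // last byte of Dest is never written and stays 0.
        for (x = 0; x < SrcSize; x++)
        {
            Dest[x2] = (char)(((unsigned int)(unsigned char)Dest[x2]) * 37 + ((unsigned int)(unsigned char)Src[x]));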
            x2++;
    
            if (x2 == DestSize - 1)
            {
                x2 = 0;
                z++;
            }
        }
    
        // Normalize the alphabet if it looped.
        if (z && Normalize)
        {
            unsigned char TempChr;
            y = (z > 1 ? DestSize - 1 : x2);
            for (x = 1; x < y; x++)
            {
                TempChr = ((unsigned char)Dest[x]) & 0x3F;
    
                if (TempChr < 10)  TempChr += '0';
                else if (TempChr < 36)  TempChr = TempChr - 10 + 'A';
                else if (TempChr < 62)  TempChr = TempChr - 36 + 'a';
                else if (TempChr == 62)  TempChr = '_';
                else  TempChr = '-';
    
                Dest[x] = (char)TempChr;
            }
        }
    
        return (SrcSize < DestSize ? SrcSize : DestSize);
    }
    

    It probably has more collisions than might be desired, but it isn't intended for use as a cryptographic hash function. If you get too many collisions, you might try other multipliers (i.e. change the 37 to another prime number). One interesting feature of this snippet is that when Src is shorter than Dest, Dest ends up holding the input string as-is (0 * 37 + value = value). If you want something "readable" at the end of the process, Normalize maps the transformed bytes onto a 64-character alphabet (0-9, A-Z, a-z, '_' and '-') at the cost of increasing collisions.

    Source:

    https://github.com/cubiclesoft/cross-platform-cpp/blob/master/sync/sync_util.cpp
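
    As a rough illustration of how the function above might be called, here is a minimal sketch. ReduceString is repeated in condensed form (with y declared) so that the block compiles on its own. ShortId is a hypothetical wrapper that is not part of the original source, and it assumes DestSize is at least 2.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Condensed copy of ReduceString from the answer, with `y` declared.
size_t ReduceString(char *Dest, size_t DestSize, const char *Src, size_t SrcSize, bool Normalize)
{
    size_t x, x2 = 0, y, z = 0;

    std::memset(Dest, 0, DestSize);

    // Fold every source byte into the buffer, wrapping before the last byte.
    for (x = 0; x < SrcSize; x++)
    {
        Dest[x2] = (char)(((unsigned int)(unsigned char)Dest[x2]) * 37 + ((unsigned int)(unsigned char)Src[x]));
        x2++;
        if (x2 == DestSize - 1) { x2 = 0; z++; }
    }

    // Map folded bytes onto 0-9, A-Z, a-z, '_', '-' if the buffer wrapped.
    if (z && Normalize)
    {
        y = (z > 1 ? DestSize - 1 : x2);
        for (x = 1; x < y; x++)
        {
            unsigned char TempChr = ((unsigned char)Dest[x]) & 0x3F;

            if (TempChr < 10)  TempChr += '0';
            else if (TempChr < 36)  TempChr = TempChr - 10 + 'A';
            else if (TempChr < 62)  TempChr = TempChr - 36 + 'a';
            else if (TempChr == 62)  TempChr = '_';
            else  TempChr = '-';

            Dest[x] = (char)TempChr;
        }
    }

    return (SrcSize < DestSize ? SrcSize : DestSize);
}

// Hypothetical wrapper (not in the original source): returns the reduced
// bytes as a std::string. Assumes DestSize >= 2 (one data byte plus the
// trailing zero byte).
std::string ShortId(const std::string &Message, size_t DestSize, bool Normalize)
{
    std::vector<char> Buf(DestSize);
    ReduceString(Buf.data(), Buf.size(), Message.data(), Message.size(), Normalize);

    // Use an explicit length: unnormalized bytes may include embedded zeros.
    return std::string(Buf.data(), std::min(Message.size(), DestSize - 1));
}
```

    Under these assumptions, ShortId("abc", 10, false) returns "abc" unchanged because the input fits in the buffer, while ShortId("ab", 2, false) returns "g", since (97 * 37 + 98) mod 256 = 103. A DestSize of 10 yields a 9-byte ID for longer messages, which fits the sub-10-character requirement from the question.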
