What's the best way to create a short hash, similar to what TinyURL does?

被撕碎了的回忆 2020-12-04 09:10

I'm currently using MD5 hashes but I would like to find something that will create a shorter hash that uses just [a-z][A-Z][0-9]. It only needs to be around 5-

13 answers
  • 2020-12-04 09:44

    Is your goal to create a URL shortener or to create a hash function?

    If your goal is to create a URL shortener, then you don't need a hash function. In that case, you just want to pre-generate a sequence of cryptographically secure random numbers, and then assign each URL to be encoded a unique number from that sequence.

    You can do this using code like:

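    using System;
    using System.Collections.Generic;
    using System.Security.Cryptography;
    
    const int numberOfNumbersNeeded = 100;
    const int numberOfBytesNeeded = 8;
    
    // Pool of pre-generated random values, one per URL to be shortened.
    var randomNumbers = new List<byte[]>();
    using var randomGen = RandomNumberGenerator.Create();
    for (int i = 0; i < numberOfNumbersNeeded; ++i)
    {
        var bytes = new byte[numberOfBytesNeeded];
        randomGen.GetBytes(bytes); // fill with cryptographically secure random bytes
        randomNumbers.Add(bytes);
    }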
    

    Using the cryptographic number generator will make it very difficult for people to predict the strings you generate, which I assume is important to you.

    You can then convert the 8-byte random number into a string using the chars in your alphabet. This is basically a change of base calculation (from base 256 to base 62); an 8-byte value needs at most 11 base-62 characters.
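
    As a rough illustration of that change of base, here is a minimal sketch. The class name `Base62`, the method names, and the digit order of the alphabet are my own assumptions, not part of any library; any fixed ordering of the 62 symbols would work.

    ```csharp
    using System;
    using System.Security.Cryptography;
    using System.Text;

    static class Base62
    {
        // Hypothetical digit order: 0-9, then a-z, then A-Z.
        const string Alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

        // Repeatedly divide by 62; each remainder is the next digit,
        // produced from least significant to most significant.
        public static string Encode(ulong value)
        {
            if (value == 0) return "0";
            var sb = new StringBuilder();
            while (value > 0)
            {
                sb.Insert(0, Alphabet[(int)(value % 62)]);
                value /= 62;
            }
            return sb.ToString();
        }

        // Draws 8 cryptographically secure random bytes and encodes them.
        public static string NewToken()
        {
            var bytes = new byte[8];
            using (var rng = RandomNumberGenerator.Create())
            {
                rng.GetBytes(bytes);
            }
            return Encode(BitConverter.ToUInt64(bytes, 0));
        }
    }
    ```

    With this alphabet, `Base62.Encode(62)` gives `"10"`. If you want tokens closer to 5 or 6 characters, you would draw fewer random bytes, at the cost of a smaller space of possible values.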

  • 2020-12-04 09:49

    You can shorten the MD5 hash by re-encoding it with alphanumeric characters. An MD5 digest is usually written as hex, so each character carries one of 16 possible values (4 bits). [a-zA-Z0-9] offers 62 possible values per character (almost 6 bits), so the same 128 bits fit in 22 characters instead of 32.

    EDIT:

    Here's a function that takes a 6-bit number (0 to 63) and returns a character in [0-9a-zA-Z]. This should give you an idea of how to implement it. Note that there may be some issues with the types; I didn't test this code.

    /* Maps a 6-bit value (0..63) to a character in [a-zA-Z0-9]. */
    char num2char( unsigned int x ){
        if( x < 26 ) return (char)('a' + x);
        if( x < 52 ) return (char)('A' + (x - 26));
        if( x < 62 ) return (char)('0' + (x - 52));
        /* Only 62 symbols exist, so 62 and 63 reuse '0' and '1' (not reversible). */
        if( x == 62 ) return '0';
        if( x == 63 ) return '1';
        return '?'; /* x was out of range */
    }
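
    For the .NET side of the question, the same idea can be sketched in C#. This is only an illustration: the class name `Md5Base62` and the `length` parameter are hypothetical, and it treats the whole digest as one big number rather than mapping 6-bit groups, so every output character is a true base-62 digit.

    ```csharp
    using System;
    using System.Numerics;
    using System.Security.Cryptography;
    using System.Text;

    static class Md5Base62
    {
        // Same symbol order as num2char: a-z, A-Z, 0-9.
        const string Alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

        // Returns the 'length' least significant base-62 digits of MD5(input).
        public static string Hash(string input, int length)
        {
            byte[] digest;
            using (var md5 = MD5.Create())
            {
                digest = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
            }

            // BigInteger reads bytes as little-endian two's complement;
            // a trailing zero byte keeps the value non-negative.
            var padded = new byte[digest.Length + 1];
            Array.Copy(digest, padded, digest.Length);
            var value = new BigInteger(padded);

            var sb = new StringBuilder();
            for (int i = 0; i < length; i++)
            {
                value = BigInteger.DivRem(value, 62, out BigInteger remainder);
                sb.Append(Alphabet[(int)remainder]);
            }
            return sb.ToString();
        }
    }
    ```

    Calling `Md5Base62.Hash(url, 22)` keeps all 128 bits; a shorter `length` such as 6 discards most of the digest, so collisions become likely as the number of inputs grows and you would need to check for them.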
    
  • 2020-12-04 09:51

    I don't think URL shortening services use hashes. I think they just have a running alphanumeric string that is incremented with every new URL and stored in a database. If you really need to use a hash function, have a look at this link: some hash functions. Also, a bit off-topic, but depending on what you are working on this might be interesting: Coding Horror article
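
    To make the counter idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the class name `CounterShortener`, the in-memory dictionary standing in for the database, and the digit order of the alphabet. I don't know how any real service implements this.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Text;

    class CounterShortener
    {
        // Hypothetical digit order: 0-9, then a-z, then A-Z.
        const string Alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

        // Stand-in for a database table keyed by an auto-increment id.
        private readonly Dictionary<long, string> _urls = new Dictionary<long, string>();
        private long _nextId = 1;

        // Stores the URL under the next id and returns that id in base 62.
        public string Shorten(string url)
        {
            long id = _nextId++;
            _urls[id] = url;
            return Encode(id);
        }

        // Looks the URL up again; throws KeyNotFoundException for unknown codes.
        public string Resolve(string code) => _urls[Decode(code)];

        public static string Encode(long id)
        {
            if (id == 0) return "0";
            var sb = new StringBuilder();
            while (id > 0)
            {
                sb.Insert(0, Alphabet[(int)(id % 62)]);
                id /= 62;
            }
            return sb.ToString();
        }

        // Inverse of Encode; input characters are not validated in this sketch.
        public static long Decode(string code)
        {
            long id = 0;
            foreach (char c in code)
            {
                id = id * 62 + Alphabet.IndexOf(c);
            }
            return id;
        }
    }
    ```

    Because the ids are sequential, the codes are unique by construction but also easy to guess; five characters already cover 62^5, roughly 916 million URLs.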

  • 2020-12-04 09:52

    The .NET string object has a GetHashCode() method that returns a 32-bit integer. Format it as hexadecimal, zero-padded, to get an 8-character string.

    Like so:

    string hashCode = String.Format("{0:X8}", sourceString.GetHashCode());
    

    More on that: http://msdn.microsoft.com/en-us/library/system.string.gethashcode.aspx

    UPDATE: Added the remarks from the link above to this answer:

    The behavior of GetHashCode is dependent on its implementation, which might change from one version of the common language runtime to another. A reason why this might happen is to improve the performance of GetHashCode.

    If two string objects are equal, the GetHashCode method returns identical values. However, there is not a unique hash code value for each unique string value. Different strings can return the same hash code.

    Notes to Callers

    The value returned by GetHashCode is platform-dependent. It differs on the 32-bit and 64-bit versions of the .NET Framework.

  • 2020-12-04 09:54

    If you don't care about cryptographic strength, any of the CRC functions will do.

    Wikipedia lists a bunch of different hash functions, including length of output. Converting their output to [a-z][A-Z][0-9] is trivial.
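
    As an illustration, here is a minimal sketch of the common CRC-32 variant (reflected polynomial 0xEDB88320) followed by the conversion to alphanumerics. The class name `Crc32Hash`, its method names, and the digit order are hypothetical choices of mine, not a library API.

    ```csharp
    using System;
    using System.Text;

    static class Crc32Hash
    {
        // Hypothetical digit order: 0-9, then a-z, then A-Z.
        const string Alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
        static readonly uint[] Table = BuildTable();

        // Precomputes the CRC of every possible byte value.
        static uint[] BuildTable()
        {
            var table = new uint[256];
            for (uint i = 0; i < 256; i++)
            {
                uint c = i;
                for (int k = 0; k < 8; k++)
                {
                    c = (c & 1) != 0 ? 0xEDB88320u ^ (c >> 1) : c >> 1;
                }
                table[i] = c;
            }
            return table;
        }

        // Table-driven CRC-32 over the input bytes.
        public static uint Compute(byte[] data)
        {
            uint crc = 0xFFFFFFFFu;
            foreach (byte b in data)
            {
                crc = Table[(crc ^ b) & 0xFF] ^ (crc >> 8);
            }
            return crc ^ 0xFFFFFFFFu;
        }

        // A 32-bit value needs at most 6 base-62 characters.
        public static string ToBase62(uint value)
        {
            if (value == 0) return "0";
            var sb = new StringBuilder();
            while (value > 0)
            {
                sb.Insert(0, Alphabet[(int)(value % 62)]);
                value /= 62;
            }
            return sb.ToString();
        }
    }
    ```

    For the standard check input, `Crc32Hash.Compute(Encoding.ASCII.GetBytes("123456789"))` returns `0xCBF43926`. With only 32 bits of output, collisions are expected once you have tens of thousands of inputs, so a lookup to detect them is still needed.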

  • 2020-12-04 09:55

    You can use CRC32: its output is 32 bits (8 hex characters), much shorter than an MD5 digest. Adding a timestamp to the value being hashed helps keep the results distinct.

    So it will look like http://foo.bar/abcdefg12.
