Represent MD5 hash as an integer

浪尽此生 提交于 2019-12-03 01:49:10

Be careful. Converting the MD5s to an integer will require support for big (128-bit) integers. Chances are the API you're using will only support 32-bit integers - or worse, might be dealing with the number in floating-point. Either way, your ID will get munged. If this is the case, just assigning a second ID arbitrarily is a much better way to deal with things than trying to convert the MD5 into an integer.

However, if you are sure that the API can deal with arbitrarily large integers without trouble, you can just convert the MD5 from hexadecimal to an integer. PHP most likely does not support this built-in however, as it will try to represent it as either a 32-bit integer or a floating point; you'll probably need to use the PHP GMP library for it.

There are good reasons, stated by others, for doing it a different way.

But if what you want to do is convert an md5 hash into a string of decimal digits (which is what I think you really mean by "represent by an integer", since an md5 is already an integer in string form), and transform it back into the same md5 string:

function md5_hex_to_dec($hex_str)
{
    $arr = str_split($hex_str, 4);
    foreach ($arr as $grp) {
        $dec[] = str_pad(hexdec($grp), 5, '0', STR_PAD_LEFT);
    }
    return implode('', $dec);
}

function md5_dec_to_hex($dec_str)
{
    $arr = str_split($dec_str, 5);
    foreach ($arr as $grp) {
        $hex[] = str_pad(dechex($grp), 4, '0', STR_PAD_LEFT);
    }
    return implode('', $hex);
}

Demo:

$md5 = md5('example@example.com');
echo $md5 . '<br />';  // 23463b99b62a72f26ed677cc556c44e8
$dec = md5_hex_to_dec($md5);
echo $dec . '<br />';  // 0903015257466342942628374306682186817640
$hex = md5_dec_to_hex($dec);
echo $hex;             // 23463b99b62a72f26ed677cc556c44e8

Of course, you'd have to be careful using either string, like making sure to use them only as string type to avoid losing leading zeros, ensuring the strings are the correct lengths, etc.

For a 32-bit condensation, a simple solution could be made by selecting 4 hex pairs (8 characters) of the MD5 hash, where each pair represents one byte, and then converting that with intval().

For an unsigned 32-bit Int:

$inthash = intval(substr(md5($str), 0, 8), 16);

For the positive value only of a signed 32-bit Int:

$inthash = intval(substr(md5($str), 0, 8), 16) >> 1;

This will likely only work for values up to 64-bit (8 bytes or 16 characters) for most modern systems as noted in the docs.

On a system that can accommodate 64-bit Ints, a splitting strategy that consumes the entire 128-bit MD5 hash as 2 Ints might look like:

$hash = md5($str);
$inthash1 = intval(substr($hash, 0, 16), 16);
$inthash2 = intval(substr($hash, 16, 16), 16);

Why ord()? md5 produce normal 16-byte value, presented to you in hex for better readability. So you can't convert 16-byte value to 4 or 8 byte integer without loss. You must change some part of your algoritms to use this as id.

You could use hexdec to parse the hexadecimal string and store the number in the database.

Couldn't you just add another field that was an auto-increment int field?

what about:

$float = hexdec(md5('string'));

or

$int = (integer) (substr(hexdec(md5('string')),0,9)*100000000);

Definietly bigger chances for collision but still good enaugh to use instead of hash in DB though?

cheers,

/Marcin

Use the email address as the file name of a blank, temporary file in a shared folder, like /var/myprocess/example@example.org

Then, call ftok on the file name. ftok will return a unique, integer ID.

It won't be guaranteed to be unique though, but it will probably suffice for your API.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!