PHP best way to MD5 multi-dimensional array?

别那么骄傲 2020-12-07 11:48

What is the best way to generate an MD5 (or any other hash) of a multi-dimensional array?

I could easily write a loop which would traverse through each level of the

13 Answers
  • 2020-12-07 12:33

    Several answers suggest json_encode(),

    but json_encode() does not work well with ISO-8859-1 strings: as soon as there is a special character, the string is cropped.

    I would advise using var_export() instead:

    md5(var_export($array, true))
    

    It is not as slow as serialize() and not as buggy as json_encode().
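
    A minimal sketch of the difference, assuming PHP 7 or later with default json_encode() flags, where a string that is not valid UTF-8 makes json_encode() return false rather than a cropped string. The helper name hashWithVarExport() and the sample data are illustrative and not part of the answer above.

```php
<?php
// Hypothetical helper: hash an array through var_export(), which keeps raw bytes.
function hashWithVarExport(array $data): string
{
    return md5(var_export($data, true));
}

// "\xE9" and "\xE8" are ISO-8859-1 bytes (e-acute, e-grave), not valid UTF-8.
$first  = array('name' => "caf\xE9");
$second = array('name' => "caf\xE8");

// json_encode() fails on both inputs, so both collapse to the hash of "".
$jsonFirst  = md5((string) json_encode($first));
$jsonSecond = md5((string) json_encode($second));
var_dump($jsonFirst === $jsonSecond);

// var_export() preserves the differing bytes, so the hashes stay distinct.
var_dump(hashWithVarExport($first) === hashWithVarExport($second));
```

    If the data is known to be UTF-8, this concern does not apply and json_encode() remains an option.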

  • 2020-12-07 12:33

    Currently the most up-voted answer md5(serialize($array)); doesn't work well with objects.

    Consider code:

     $a = array(new \stdClass());
     $b = array(new \stdClass());
    

    Even though the arrays are different (they contain different objects), they have the same hash when using md5(serialize($array)). So your hash is useless!

    To avoid that problem, you can replace objects with the result of spl_object_hash() before serializing. You should also do it recursively if your array has multiple levels.

    The code below also sorts arrays by keys, as dotancohen has suggested.

    function replaceObjectsWithHashes(array $array)
    {
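        foreach ($array as &$value) {
            if (is_array($value)) {
                // Recurse into nested arrays (plain function call, no $this)
                $value = replaceObjectsWithHashes($value);
            } elseif (is_object($value)) {
                $value = spl_object_hash($value);
            }
        }
        unset($value); // break the reference left over from the loop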
        ksort($array);
        return $array;
    }
    

    Now you can use md5(serialize(replaceObjectsWithHashes($array))).

    (Note that arrays in PHP are value types, so the replaceObjectsWithHashes function does NOT change the original array.)
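
    A short sketch of the effect described above. The function name replaceObjectsWithHashesSketch() is a stand-in chosen for this example. Keep in mind that spl_object_hash() identifies an object only while it is alive in the current process, so hashes built this way are presumably unsuitable for comparison across requests.

```php
<?php
// Stand-in for the function above: swap objects for their spl_object_hash().
function replaceObjectsWithHashesSketch(array $array): array
{
    foreach ($array as $key => $value) {
        if (is_array($value)) {
            $array[$key] = replaceObjectsWithHashesSketch($value);
        } elseif (is_object($value)) {
            $array[$key] = spl_object_hash($value);
        }
    }
    ksort($array);
    return $array;
}

$a = array(new stdClass());
$b = array(new stdClass());

// Plain serialization sees two empty stdClass instances as identical.
$plainA = md5(serialize($a));
$plainB = md5(serialize($b));
var_dump($plainA === $plainB); // bool(true)

// After replacing objects with their identity hashes, the arrays differ.
$identA = md5(serialize(replaceObjectsWithHashesSketch($a)));
$identB = md5(serialize(replaceObjectsWithHashesSketch($b)));
var_dump($identA === $identB); // bool(false)
```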

  • 2020-12-07 12:37
    // Convert nested arrays to a simple array
    $array = array();
    array_walk_recursive($input, function ($a) use (&$array) {
        $array[] = $a;
    });
    
    sort($array);
    
    $hash = md5(json_encode($array));
    
    
    // These arrays have the same hash:
    $arr1 = array(0 => array(1, 2, 3), 1, 2);
    $arr2 = array(0 => array(1, 3, 2), 1, 2);
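
    One consequence worth spelling out: because the values are flattened and sorted, keys and nesting are discarded as well, so any two arrays with the same leaf values collide. A sketch, where hashFlattened() and $arr3 are names and data made up for this example:

```php
<?php
// Hypothetical wrapper around the flatten-and-sort approach above.
function hashFlattened(array $input): string
{
    $values = array();
    array_walk_recursive($input, function ($value) use (&$values) {
        $values[] = $value;
    });
    sort($values);
    return md5(json_encode($values));
}

$arr1 = array(0 => array(1, 2, 3), 1, 2);
$arr2 = array(0 => array(1, 3, 2), 1, 2);
// Different keys and nesting, but the same leaf values: 1, 1, 2, 2, 3.
$arr3 = array('x' => array(1, 1), 'y' => array(2, 2, 3));

var_dump(hashFlattened($arr1) === hashFlattened($arr2)); // bool(true)
var_dump(hashFlattened($arr1) === hashFlattened($arr3)); // bool(true)
```

    Whether that is acceptable depends on whether structure matters for the comparison you need.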
    
  • 2020-12-07 12:39

    I'm joining a very crowded party by answering, but there is an important consideration that none of the extant answers address. The outputs of json_encode() and serialize() both depend on the order of elements in the array!

    Here are the results of not sorting and sorting the arrays, on two arrays with identical values but added in a different order (code at bottom of post):

        serialize()
    1c4f1064ab79e4722f41ab5a8141b210
    1ad0f2c7e690c8e3cd5c34f7c9b8573a
    
        json_encode()
    db7178ba34f9271bfca3a05c5ffffdf502
    c9661c0852c2bd0e26ef7951b4ca9e6f
    
        Sorted serialize()
    1c4f1064ab79e4722f41ab5a8141b210
    1c4f1064ab79e4722f41ab5a8141b210
    
        Sorted json_encode()
    db7178ba34f9271bfca3a05c5ffffdf502
    db7178ba34f9271bfca3a05c5ffffdf502
    

    Therefore, the two methods that I would recommend to hash an array would be:

    // You will need to write your own deep_ksort(), or see
    // my example below
    
    md5(   serialize(deep_ksort($array)) );
    
    md5( json_encode(deep_ksort($array)) );
    

    The choice of json_encode() or serialize() should be determined by testing on the type of data that you are using. By my own testing on purely textual and numerical data, if the code is not running a tight loop thousands of times then the difference is not even worth benchmarking. I personally use json_encode() for that type of data.
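
    If you do want to measure it on your own data, a rough timing harness could look like the sketch below. The helper timeEncoder() and the sample payload are assumptions for illustration; results will vary with PHP version and data shape.

```php
<?php
// Hypothetical helper: time $iterations rounds of encode-then-md5 on $data.
function timeEncoder(callable $encoder, array $data, int $iterations = 1000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        md5($encoder($data));
    }
    return microtime(true) - $start;
}

// Sample payload (an assumption): substitute data typical of your application.
$data = array();
for ($i = 0; $i < 100; $i++) {
    $data["key$i"] = array('id' => $i, 'label' => "item $i");
}

printf("serialize:   %.4f s\n", timeEncoder('serialize', $data));
printf("json_encode: %.4f s\n", timeEncoder('json_encode', $data));
```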

    Here is the code used to generate the sorting test above:

    $a = array();
    $a['aa'] = array( 'aaa'=>'AAA', 'bbb'=>'ooo', 'qqq'=>'fff',);
    $a['bb'] = array( 'aaa'=>'BBBB', 'iii'=>'dd',);
    
    $b = array();
    $b['aa'] = array( 'aaa'=>'AAA', 'qqq'=>'fff', 'bbb'=>'ooo',);
    $b['bb'] = array( 'iii'=>'dd', 'aaa'=>'BBBB',);
    
    echo "    serialize()\n";
    echo md5(serialize($a))."\n";
    echo md5(serialize($b))."\n";
    
    echo "\n    json_encode()\n";
    echo md5(json_encode($a))."\n";
    echo md5(json_encode($b))."\n";
    
    
    
    $a = deep_ksort($a);
    $b = deep_ksort($b);
    
    echo "\n    Sorted serialize()\n";
    echo md5(serialize($a))."\n";
    echo md5(serialize($b))."\n";
    
    echo "\n    Sorted json_encode()\n";
    echo md5(json_encode($a))."\n";
    echo md5(json_encode($b))."\n";
    

    Here is my quick deep_ksort() implementation. It fits this case, but check it before using it on your own projects:

    /*
    * Sort an array by keys, and additionally sort its array values by keys
    *
    * Does not try to sort an object, but does iterate its properties to
    * sort arrays in properties
    */
    function deep_ksort($input)
    {
        if ( !is_object($input) && !is_array($input) ) {
            return $input;
        }
    
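        foreach ( $input as $k=>$v ) {
            if ( is_object($v) || is_array($v) ) {
                // Plain objects do not support [] access, so assign via property
                if ( is_array($input) ) {
                    $input[$k] = deep_ksort($v);
                } else {
                    $input->$k = deep_ksort($v);
                }
            }
        }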
    
        if ( is_array($input) ) {
            ksort($input);
        }
    
        // Do not sort objects
    
        return $input;
    }
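
    Since the question allows any hash, the same idea can be wrapped with PHP's hash() function to pick the algorithm. This is a sketch for arrays only; sortKeysRecursively() and hashArray() are hypothetical names, not part of the code above.

```php
<?php
// Hypothetical, array-only variant of a recursive key sort.
function sortKeysRecursively(array $input): array
{
    foreach ($input as $key => $value) {
        if (is_array($value)) {
            $input[$key] = sortKeysRecursively($value);
        }
    }
    ksort($input);
    return $input;
}

// Hypothetical wrapper: key-order-independent hash with a selectable algorithm.
function hashArray(array $input, string $algo = 'sha256'): string
{
    return hash($algo, json_encode(sortKeysRecursively($input)));
}

$a = array('aa' => array('aaa' => 'AAA', 'bbb' => 'ooo'), 'bb' => array('iii' => 'dd'));
$b = array('bb' => array('iii' => 'dd'), 'aa' => array('bbb' => 'ooo', 'aaa' => 'AAA'));

var_dump(hashArray($a) === hashArray($b)); // bool(true)
echo hashArray($a, 'md5'), "\n";
```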
    
  • 2020-12-07 12:41
    md5(serialize($array));
    
  • 2020-12-07 12:44

    I didn't see this solution spelled out above, so I wanted to contribute a simpler answer. For me, I was getting the same key until I used ksort (key sort):

    Sort first with ksort(), then take the sha1() of the json_encode() output:

    ksort($array);
    $hash = sha1(json_encode($array)); // be mindful of UTF-8
    

    example:

    $arr1 = array( 'dealer' => '100', 'direction' => 'ASC', 'dist' => '500', 'limit' => '1', 'zip' => '10601');
    ksort($arr1);
    
    $arr2 = array( 'direction' => 'ASC', 'limit' => '1', 'zip' => '10601', 'dealer' => '100', 'dist' => '5000');
    ksort($arr2);
    
    var_dump(sha1(json_encode($arr1)));
    var_dump(sha1(json_encode($arr2)));
    

    Output for the altered arrays (note that 'dist' differs between them, so the hashes differ):

    string(40) "502c2cbfbe62e47eb0fe96306ecb2e6c7e6d014c"
    string(40) "b3319c58edadab3513832ceeb5d68bfce2fb3983"
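
    One caveat for multi-dimensional input: ksort() only sorts the top level, so nested arrays whose keys were added in a different order still hash differently. A small sketch with made-up data:

```php
<?php
// Same content, but the nested array's keys are in a different order.
$x = array('b' => array('y' => 1, 'x' => 2), 'a' => 0);
$y = array('a' => 0, 'b' => array('x' => 2, 'y' => 1));

// ksort() reorders only the outer keys ('a', 'b'); the inner arrays are untouched.
ksort($x);
ksort($y);

$hashX = sha1(json_encode($x));
$hashY = sha1(json_encode($y));
var_dump($hashX === $hashY); // bool(false)
```

    For nested data, a recursive key sort is needed before encoding.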
    