PHP: Fastest way to handle undefined array key

梦如初夏 2020-12-08 07:13

In a very tight loop I need to access tens of thousands of values in an array containing millions of elements. The key can be undefined: in that case it shall be legal to return

8 Answers
  • 2020-12-08 08:01

    I did some benchmarking with the following code:

    set_time_limit(100);
    
    $count = 2500000;
    // Search a window that overlaps the data by half, so 50% of lookups miss.
    $search_index_end = (int) ($count * 1.5);
    $search_index_start = (int) ($count * .5);
    
    $array = array();
    for ($i = 0; $i < $count; $i++)
        $array[md5($i)] = $i;
    
    // Test 1: isset() with a ternary fallback.
    $start = microtime(true);
    for ($i = $search_index_start; $i < $search_index_end; $i++) {
        $key = md5($i);
        $test = isset($array[$key]) ? $array[$key] : null;
    }
    $end = microtime(true);
    echo ($end - $start) . " seconds<br/>";
    
    // Test 2: array_key_exists() with a ternary fallback.
    $start = microtime(true);
    for ($i = $search_index_start; $i < $search_index_end; $i++) {
        $key = md5($i);
        $test = array_key_exists($key, $array) ? $array[$key] : null;
    }
    $end = microtime(true);
    echo ($end - $start) . " seconds<br/>";
    
    
    // Test 3: suppress the undefined-key notice with the @ operator.
    $start = microtime(true);
    for ($i = $search_index_start; $i < $search_index_end; $i++) {
        $key = md5($i);
        $test = @$array[$key];
    }
    $end = microtime(true);
    echo ($end - $start) . " seconds<br/>";
    
    // Test 4: turn error reporting off for the whole loop, then restore it.
    $error_reporting = error_reporting();
    error_reporting(0);
    $start = microtime(true);
    for ($i = $search_index_start; $i < $search_index_end; $i++) {
        $key = md5($i);
        $test = $array[$key];
    }
    $end = microtime(true);
    echo ($end - $start) . " seconds<br/>";
    error_reporting($error_reporting);
    
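    // Test 5: bind a reference, then isset(). Note that taking a reference to
    // a missing key creates that key with a null value, so the array grows.
    $start = microtime(true);
    for ($i = $search_index_start; $i < $search_index_end; $i++) {
        $key = md5($i);
        $tmp = &$array[$key];
        $test = isset($tmp) ? $tmp : null;
    }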
    $end = microtime(true);
    echo ($end - $start) . " seconds<br/>";
    

    I found that the fastest test was the one using isset($array[$key]) ? $array[$key] : null, followed closely by the one that simply disables error reporting.
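
    The variants above do not all behave the same way, which matters when choosing one. The following is a small sketch, with a made-up sample array and variable names, and not part of the benchmark. It illustrates three points: the ?? operator (PHP 7.0+) gives the same result as the isset() ternary, isset() and array_key_exists() disagree when a key holds null, and the reference-based variant adds the missing key to the array.

    ```php
    <?php
    // Hypothetical sample data: one key holds a real value, one holds null.
    $array = ['a' => 1, 'b' => null];

    // The null coalescing operator behaves like the isset() ternary.
    $viaTernary  = isset($array['missing']) ? $array['missing'] : null;
    $viaCoalesce = $array['missing'] ?? null;

    // isset() is false for a key whose value is null;
    // array_key_exists() is true because the key itself is present.
    $issetOnNull  = isset($array['b']);
    $existsOnNull = array_key_exists('b', $array);

    // Binding a reference to a missing key creates it with a null value.
    $sizeBefore = count($array);
    $tmp = &$array['missing'];
    unset($tmp); // drops the reference only; the new key stays in the array
    $sizeAfter = count($array);

    var_dump($viaTernary === $viaCoalesce, $issetOnNull, $existsOnNull, $sizeAfter - $sizeBefore);
    ```

    If null is a legal stored value and you need to tell it apart from a missing key, array_key_exists() is the variant that can do so.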

  • 2020-12-08 08:01

    First, reorganize the data for performance: save a new array in which the entries are sorted by key but stored under a regular numeric index.

    This part is time-consuming, but it only has to be done once.

    // First, sort the array by its keys.
    ksort($data);

    // Second, create a new array with a numeric index.
    $tmp = array();
    foreach ($data as $key => $value)
    {
        $tmp[] = array('key' => $key, 'value' => $value);
    }

    // Now save this data and use it instead.
    // save_to_file() is a placeholder for your own persistence code.
    save_to_file($tmp);
    

    Once that is done, it should be quick to find a key using a binary search. Later you can use a function like this:

    // Recursive binary search over the sorted list built above.
    // $start and $end are inclusive indexes into $data.
    function findKey($key, $data, $start, $end)
    {
        if ($end < $start)
        {
            return null; // range is empty: key not found
        }

        $mid = (int)(($end - $start) / 2) + $start;

        if ($data[$mid]['key'] > $key)
        {
            return findKey($key, $data, $start, $mid - 1);
        }
        elseif ($data[$mid]['key'] < $key)
        {
            return findKey($key, $data, $mid + 1, $end);
        }

        return $data[$mid]['value'];
    }
    

    To search for a key, you would do this:

    // $end is inclusive, so pass the last valid index rather than count($data).
    $result = findKey($key, $data, 0, count($data) - 1);
    if ($result === null)
    {
        // key not found.
    }
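
    As a possible alternative to the recursive function, here is a minimal iterative sketch of the same binary search. It is an illustration under stated assumptions, not the method above verbatim: the name findKeyIterative and the sample data are made up, and it assumes string keys compared with strcmp(), with the list sorted using SORT_STRING so that the sort order and the comparison agree.

    ```php
    <?php
    // Iterative binary search over rows of ['key' => ..., 'value' => ...],
    // sorted by 'key' in string order. Returns null when the key is absent.
    function findKeyIterative(string $key, array $rows)
    {
        $low = 0;
        $high = count($rows) - 1; // last valid index (inclusive upper bound)

        while ($low <= $high) {
            $mid = intdiv($low + $high, 2);
            $cmp = strcmp($rows[$mid]['key'], $key);

            if ($cmp === 0) {
                return $rows[$mid]['value'];
            }
            if ($cmp < 0) {
                $low = $mid + 1;  // target sorts after the middle row
            } else {
                $high = $mid - 1; // target sorts before the middle row
            }
        }

        return null;
    }

    // Hypothetical sample data, keyed by strings.
    $data = ['pear' => 3, 'apple' => 1, 'fig' => 2];
    ksort($data, SORT_STRING); // same ordering that strcmp() uses

    $rows = [];
    foreach ($data as $key => $value) {
        $rows[] = ['key' => (string) $key, 'value' => $value];
    }

    var_dump(findKeyIterative('fig', $rows));   // int(2)
    var_dump(findKeyIterative('grape', $rows)); // NULL
    ```

    Like the recursive version, this cannot distinguish a missing key from a key whose stored value is null.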
    

    If count($data) is called on every lookup, you could cache the count in the file where you store the array data.

    I suspect this method will be a lot faster than a regular linear search repeated against $data, but I can't promise it. Only an octree would be quicker, but the time needed to build the octree might cancel out the search gains (I've experienced that before). It depends on how much searching you have to do.
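
    Since the speed-up is not guaranteed, one way to check is to time both approaches on your own data. The sketch below is only a rough harness: the data set, the key format and the helper name binaryLookup are invented for illustration, and the array is far smaller than the one in the question. It compares the binary search against a direct keyed lookup, which is a natural baseline because PHP arrays are hash tables.

    ```php
    <?php
    // Hypothetical data set with string keys 'k0' .. 'k99999'.
    $count = 100000;
    $data = [];
    for ($i = 0; $i < $count; $i++) {
        $data['k' . $i] = $i;
    }

    // Sorted row list for the binary search, in string order.
    ksort($data, SORT_STRING);
    $rows = [];
    foreach ($data as $key => $value) {
        $rows[] = ['key' => (string) $key, 'value' => $value];
    }

    // Iterative binary search; returns null when the key is absent.
    function binaryLookup(string $key, array $rows)
    {
        $low = 0;
        $high = count($rows) - 1;
        while ($low <= $high) {
            $mid = intdiv($low + $high, 2);
            $cmp = strcmp($rows[$mid]['key'], $key);
            if ($cmp === 0) {
                return $rows[$mid]['value'];
            }
            if ($cmp < 0) {
                $low = $mid + 1;
            } else {
                $high = $mid - 1;
            }
        }
        return null;
    }

    // Probe keys: the first half exist in the data, the second half do not.
    $probes = [];
    for ($i = intdiv($count, 2); $i < $count + intdiv($count, 2); $i++) {
        $probes[] = 'k' . $i;
    }

    // Baseline: direct keyed lookup with a null fallback.
    $start = microtime(true);
    $hashHits = 0;
    foreach ($probes as $key) {
        if (($data[$key] ?? null) !== null) {
            $hashHits++;
        }
    }
    $hashSeconds = microtime(true) - $start;

    // Candidate: binary search over the sorted rows.
    $start = microtime(true);
    $binaryHits = 0;
    foreach ($probes as $key) {
        if (binaryLookup($key, $rows) !== null) {
            $binaryHits++;
        }
    }
    $binarySeconds = microtime(true) - $start;

    printf("direct lookup: %.4f s (%d hits)\n", $hashSeconds, $hashHits);
    printf("binary search: %.4f s (%d hits)\n", $binarySeconds, $binaryHits);
    ```

    The timings depend on the PHP version and the machine, so treat the numbers as a guide for your own setup only.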
