SimpleXML vs DOMDocument performance

Submitted by 落花浮王杯 on 2019-12-03 07:30:22

SimpleXML and DOMDocument both use the same parser (libxml2), so the parsing difference between them is negligible.

This is easy to verify:

function time_load_dd($xml, $reps) {
    // warm-up: discard the first few runs to prime caches
    for ($i = 0; $i < 5; ++$i) {
        $dom = new DOMDocument();
        $dom->loadXML($xml);
    }
    // timed runs
    $start = microtime(true);
    for ($i = 0; $i < $reps; ++$i) {
        $dom = new DOMDocument();
        $dom->loadXML($xml);
    }
    $elapsed = microtime(true) - $start;
    return $elapsed;
}
function time_load_sxe($xml, $reps) {
    // warm-up: same as above
    for ($i = 0; $i < 5; ++$i) {
        $sxe = simplexml_load_string($xml);
    }
    // timed runs
    $start = microtime(true);
    for ($i = 0; $i < $reps; ++$i) {
        $sxe = simplexml_load_string($xml);
    }
    $elapsed = microtime(true) - $start;
    return $elapsed;
}


function main() {
    // This is a 1800-line atom feed of some complexity.
    $url = 'http://feeds.feedburner.com/reason/AllArticles';
    $xml = file_get_contents($url);
    $reps = 10000;
    $methods = array('time_load_dd','time_load_sxe');
    echo "Time to complete $reps reps:\n";
    foreach ($methods as $method) {
        echo $method,": ",$method($xml,$reps), "\n";
    }
}
main();

On my machine I get basically no difference:

Time to complete 10000 reps:
time_load_dd: 17.725028991699
time_load_sxe: 17.416455984116

The real issue here is which algorithms you use and what you do with the data. 1000 lines is not a big XML document. Your slowdown will not be in memory usage or parsing speed but in your application logic.

Well, I have encountered a HUGE performance difference between DOMDocument and SimpleXML. I have a ~15 MB XML file with approximately 50,000 elements like this:

...
<ITEM>
  <Product>some product code</Product>
  <Param>123</Param>
  <TextValue>few words</TextValue>
</ITEM>
...

I only need to "read" those values and save them in a PHP array. At first I tried DOMDocument:

$dom = new DOMDocument();
$dom->loadXML( $external_content );
$root = $dom->documentElement; 

$xml_param_values = $root->getElementsByTagName('ITEM');
foreach ($xml_param_values as $item) {
    $product_code = $item->getElementsByTagName('Product')->item(0)->textContent;
    // ... some other operation
}

That script died after 60 seconds with a "maximum execution time exceeded" error. Only 15,000 of the 50,000 items had been processed.

So I rewrote the code as a SimpleXML version:

$xml = new SimpleXMLElement($external_content);
foreach($xml->xpath('ITEM') as $item) {
    $product_code = (string) $item->Product;
    // ... some other operation
}

After 1 second all was done.

I don't know how those functions are implemented internally in PHP, but in my application (and with my XML structure) there is a really, REALLY HUGE performance difference between DOMDocument and SimpleXML.
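A possible explanation, which neither answer confirms: getElementsByTagName() returns a live DOMNodeList, and as far as I know the PHP versions of that era could rescan the tree on each foreach step, making a 50,000-item loop roughly quadratic. If that is the cause, a DOM version that walks sibling pointers directly should avoid it. Below is a minimal sketch of that idea, not a measured benchmark. The sample data and the `ROOT` element name are made up for illustration, and it assumes every ITEM is a direct child of the root (as the SimpleXML xpath('ITEM') call also does).

```php
<?php
// Hypothetical sample data standing in for the 15 MB file.
$external_content = '<ROOT>'
    . '<ITEM><Product>A1</Product><Param>123</Param><TextValue>few words</TextValue></ITEM>'
    . '<ITEM><Product>B2</Product><Param>456</Param><TextValue>more words</TextValue></ITEM>'
    . '</ROOT>';

$dom = new DOMDocument();
$dom->loadXML($external_content);

$product_codes = array();

// Walk the root's children via firstChild/nextSibling instead of
// iterating a live DOMNodeList from getElementsByTagName().
for ($node = $dom->documentElement->firstChild; $node !== null; $node = $node->nextSibling) {
    // Skip whitespace text nodes and anything that is not an ITEM element.
    if ($node->nodeType !== XML_ELEMENT_NODE || $node->nodeName !== 'ITEM') {
        continue;
    }
    // Find the first Product child of this ITEM the same way.
    for ($child = $node->firstChild; $child !== null; $child = $child->nextSibling) {
        if ($child->nodeType === XML_ELEMENT_NODE && $child->nodeName === 'Product') {
            $product_codes[] = $child->textContent;
            break;
        }
    }
}

print_r($product_codes);
```

Whether this closes the gap with SimpleXML on a real 15 MB file would need to be timed on your own PHP version and data.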
