Question
Every article about SimpleXML performance and memory usage mentions that all parsed content is stored in memory, so processing large files leads to large memory usage. But I recently found that processing a large file with SimpleXML does not cause large memory usage; in fact, it appears to use almost no additional memory. Here is my test script:
<?php
error_reporting(E_ALL);
ini_set("display_errors", 1);
print "OS: " . php_uname() . "\n";
print "PHP version: " . phpversion() . "\n";
print round(memory_get_usage() / 1024 / 1024, 2) . " Mb\n";
$large_xml = '<?xml version="1.0" encoding="UTF-8"?><catalog><products>';
for ($i = 0; $i < 500000; $i++) {
    $large_xml .= "<product><id>{$i}</id><name>Product Name {$i}</name><description>Some Description {$i}</description><price>{$i}</price></product>\n";
}
$large_xml .= "</products></catalog>";
print round(memory_get_usage() / 1024 / 1024, 2) . " Mb\n";
$products_sxml = simplexml_load_string($large_xml);
print round(memory_get_usage() / 1024 / 1024, 2) . " Mb\n";
?>
I tested this script on a Linux server running PHP 5.3.8, and the output was:
OS: Linux 2.6.32-5-amd64 #1 SMP Mon Feb 25 00:26:11 UTC 2013 x86_64
PHP version: 5.3.8
0.6 Mb
65.98 Mb
65.98 Mb
So my question is: has anyone else noticed this, and what could explain it? I could not find an explanation anywhere on the web, or even a confirmation that it happens.
Answer 1:
The memory management functionality of PHP is quite sophisticated, and accurately measuring the impact of a particular piece of high-level code is quite difficult. There was quite a good (very technical) talk on this by Julien Pauli at the PHP UK Conference, a video of which is available here.
There are a few possible reasons why memory_get_usage might be lying to you:
- Firstly, memory_get_usage takes an optional parameter, $real_usage, which distinguishes between the amount of memory allocated and the amount in use. The memory manager allocates memory a block at a time, so it will often have claimed more from the OS than is actually in use. As more is needed, the already-claimed memory is used up, meaning no more needs to be allocated. Testing in this case suggests that this is not relevant here.
- More generally, there are different ways of allocating memory in the underlying C code that runs PHP. Since most of the work of SimpleXML is done not in the Zend Engine but in a third-party library called libxml2, the memory allocation will be done there, not in the PHP-specific allocation routines which would be used when, say, appending to a PHP string.
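To make the two points above concrete, here is a minimal sketch that prints both modes of memory_get_usage around a plain PHP string allocation and around a SimpleXML parse. The helpers mb() and report() are hypothetical names introduced only for this illustration, and the sizes are arbitrary. No expected output is shown, because the figures depend on the PHP version and on how libxml2 was built.

```php
<?php
// Hypothetical helper: format a byte count as megabytes, like the question's script does.
function mb($bytes) {
    return round($bytes / 1024 / 1024, 2) . " Mb";
}

// Hypothetical helper: print "in use" and "allocated from the OS" side by side.
function report($label) {
    printf(
        "%-12s in use: %s, allocated: %s\n",
        $label,
        mb(memory_get_usage()),      // memory currently used by PHP's own allocator
        mb(memory_get_usage(true))   // memory that allocator has claimed from the OS
    );
}

report("start");

// A 10 MB PHP string goes through PHP's own allocation routines,
// so both figures should rise.
$s = str_repeat("x", 10 * 1024 * 1024);
report("php string");

// The parsed tree is built by libxml2; whether it shows up in either
// figure is exactly what this comparison is meant to reveal.
$xml = simplexml_load_string("<r>" . str_repeat("<a>1</a>", 100000) . "</r>");
report("simplexml");
```

If both figures stay flat after the last step, that points to the second explanation (allocation outside PHP's memory manager) rather than the first.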
I took the following function from Julien Pauli's slides, which looks at the Linux kernel's view of the running PHP process and finds the line which represents the "Resident Set Size" - the amount of physical memory which has actually been allocated, rather than the amount the process has asked to be reserved:
function heap() {
return shell_exec(sprintf('grep "VmRSS:" /proc/%s/status', getmypid()));
}
Adding a call to this (as well as to memory_get_usage(true)) in your sample code, I got the following output, showing a significant allocation of "heap" memory when you parse the XML:
OS: Linux pink-marmalade 3.8.0-29-generic #42~precise1-Ubuntu SMP Wed Aug 14 16:19:23 UTC 2013 x86_64
PHP version: 5.3.10-1ubuntu3.8
memory_get_usage(): 0.61 Mb
memory_get_usage(true): 0.75 Mb
Heap: VmRSS: 6956 kB
memory_get_usage(): 65.99 Mb
memory_get_usage(true): 66.25 Mb
Heap: VmRSS: 74348 kB
memory_get_usage(): 65.99 Mb
memory_get_usage(true): 66.25 Mb
Heap: VmRSS: 761836 kB
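For anyone who wants to reproduce output in this shape, here is a sketch of the question's script with the extra measurements added. It is a reconstruction, not necessarily the exact script used for the numbers above. The checkpoint() helper and the $count variable are hypothetical additions, and the fallback in heap() for systems without /proc is also an assumption added for this sketch. The product count is reduced so that it runs quickly; the question used 500000.

```php
<?php
// Based on the heap() function quoted above: ask the Linux kernel for this
// process's resident set size. The guard and fallback are additions.
function heap() {
    $status = '/proc/' . getmypid() . '/status';
    if (!is_readable($status)) {
        return "VmRSS: unavailable\n"; // not Linux, or /proc not mounted
    }
    $out = shell_exec(sprintf('grep "VmRSS:" %s', escapeshellarg($status)));
    return is_string($out) ? $out : "VmRSS: unavailable\n";
}

// Hypothetical helper: print all three measurements at one point in the script.
function checkpoint() {
    print "memory_get_usage(): " . round(memory_get_usage() / 1024 / 1024, 2) . " Mb\n";
    print "memory_get_usage(true): " . round(memory_get_usage(true) / 1024 / 1024, 2) . " Mb\n";
    print "Heap: " . heap();
}

$count = 50000; // assumption: smaller than the question's 500000 to keep the run short

print "OS: " . php_uname() . "\n";
print "PHP version: " . phpversion() . "\n";
checkpoint(); // baseline

$large_xml = '<?xml version="1.0" encoding="UTF-8"?><catalog><products>';
for ($i = 0; $i < $count; $i++) {
    $large_xml .= "<product><id>{$i}</id><name>Product Name {$i}</name><description>Some Description {$i}</description><price>{$i}</price></product>\n";
}
$large_xml .= "</products></catalog>";
checkpoint(); // after building the PHP string

$products_sxml = simplexml_load_string($large_xml);
checkpoint(); // after parsing with SimpleXML
```

The figure to compare is the change in VmRSS between the second and third checkpoints against the change in the two memory_get_usage values over the same step.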
Answer 2:
If I execute the script, I get exactly the same results.
One explanation could be that you don't use the XML object, so the XML string isn't even parsed completely.
When you modify the script so that the data is sent to the browser, with print_r($products_sxml); the memory usage is much higher after the call.
Obviously, you should reduce the number of products in the XML before trying this.
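One way to probe this hypothesis is to separate "reading every node" from "dumping the object". The sketch below, with an arbitrary small product count and hypothetical variable names, walks all products first and only then captures the print_r output as a string. If memory_get_usage only rises at the last step, the increase may reflect the PHP-side dump (the captured string lives in PHP-managed memory) rather than deferred parsing. No expected figures are given, since they vary by PHP version.

```php
<?php
// Build a small document; 1000 products is an arbitrary choice for this sketch.
$xml = '<catalog><products>';
for ($i = 0; $i < 1000; $i++) {
    $xml .= "<product><id>{$i}</id><name>Product Name {$i}</name></product>";
}
$xml .= '</products></catalog>';

$before = memory_get_usage();
$sxml = simplexml_load_string($xml);
$after_parse = memory_get_usage();

// Walk every product so that each node is actually read through SimpleXML.
$count = 0;
$last_name = '';
foreach ($sxml->products->product as $product) {
    $count++;
    $last_name = (string) $product->name;
}
$after_walk = memory_get_usage();

// Capture the dump as a PHP string instead of sending it to the browser.
$dump = print_r($sxml, true);
$after_dump = memory_get_usage();

printf("parse: +%d bytes\n", $after_parse - $before);
printf("walk:  +%d bytes\n", $after_walk - $after_parse);
printf("dump:  +%d bytes (dump length %d)\n", $after_dump - $after_walk, strlen($dump));
```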
Answer 3:
SimpleXML stores the XML tree in an external resource, which is not counted by the memory_get_usage function.
Source: https://stackoverflow.com/questions/18933365/php-simplexml-large-file-no-extra-memory-usage