Question:
I have never really thought about this until today, but after searching the web I didn't really find anything. Maybe I wasn't wording it right in the search.
Given an array (of multiple dimensions or not):
$data = array('this' => array('is' => 'the'), 'challenge' => array('for' => array('you')));
When var_dumped:
array(2) { ["this"]=> array(1) { ["is"]=> string(3) "the" } ["challenge"]=> array(1) { ["for"]=> array(1) { [0]=> string(3) "you" } } }
The challenge is this: what is the best optimized method for recompiling the dump into a usable PHP array? Something like an undump_var() function, which works whether the data is all on one line (as output in a browser) or contains line breaks (as output to a terminal).
Is it just a matter of regex? Or is there some other way? I am looking for creativity.
UPDATE: Note that I am familiar with serialize and unserialize, folks. I am not looking for alternative solutions. This is a code challenge to see if it can be done in an optimized and creative way, so serialize and var_export are not solutions here, nor are they the best answers.
Answer 1:
var_export or serialize is what you're looking for. var_export will render PHP-parsable array syntax, and serialize will render a non-human-readable but reversible "array to string" conversion...
Edit: Alright, for the challenge:
Basically, I convert the output into a serialized string (and then unserialize it). I don't claim this to be perfect, but it appears to work on some pretty complex structures that I've tried...
function unvar_dump($str) {
    if (strpos($str, "\n") === false) {
        // Single-line input: put every key and value on its own line first.
        $regex = array(
            '#(\\[.*?\\]=>)#',
            '#(string\\(|int\\(|float\\(|array\\(|NULL|object\\(|})#',
        );
        $str = preg_replace($regex, "\n\\1", $str);
        $str = trim($str);
    }
    // Map each var_dump line onto its serialize() equivalent.
    $regex = array(
        '#^\\040*NULL\\040*$#m',
        '#^\\s*array\\((.*?)\\)\\s*{\\s*$#m',
        '#^\\s*string\\((.*?)\\)\\s*(.*?)$#m',
        '#^\\s*int\\((.*?)\\)\\s*$#m',
        '#^\\s*bool\\(true\\)\\s*$#m',
        '#^\\s*bool\\(false\\)\\s*$#m',
        '#^\\s*float\\((.*?)\\)\\s*$#m',
        '#^\\s*\[(\\d+)\\]\\s*=>\\s*$#m',
        '#\\s*?\\r?\\n\\s*#m',
    );
    $replace = array(
        'N', 'a:\\1:{', 's:\\1:\\2', 'i:\\1', 'b:1', 'b:0', 'd:\\1', 'i:\\1', ';'
    );
    $serialized = preg_replace($regex, $replace, $str);
    // Turn ["key"]=> into a serialized string key.
    $func = function ($match) {
        return 's:' . strlen($match[1]) . ':"' . $match[1] . '"';
    };
    $serialized = preg_replace_callback(
        '#\\s*\\["(.*?)"\\]\\s*=>#',
        $func,
        $serialized
    );
    // Turn object(Class)#id (count) { into a serialized object header.
    $func = function ($match) {
        return 'O:' . strlen($match[1]) . ':"' . $match[1] . '":' . $match[2] . ':{';
    };
    $serialized = preg_replace_callback(
        '#object\\((.*?)\\).*?\\((\\d+)\\)\\s*{\\s*;#',
        $func,
        $serialized
    );
    // Drop the separators left next to braces.
    $serialized = preg_replace(
        array('#};#', '#{;#'),
        array('}', '{'),
        $serialized
    );
    return unserialize($serialized);
}
I tested it on a complex structure such as:
array(4) { ["foo"]=> string(8) "Foo"bar"" [0]=> int(4) [5]=> float(43.2) ["af"]=> array(3) { [0]=> string(3) "123" [1]=> object(stdClass)#2 (2) { ["bar"]=> string(4) "bart" ["foo"]=> array(1) { [0]=> string(2) "re" } } [2]=> NULL } }
Answer 2:
There's no other way than manual parsing depending on the type. I didn't add support for objects, but it's very similar to the arrays one; you just need to do some reflection magic to populate not only public properties and to not trigger the constructor.
EDIT: Added support for objects... Reflection magic...
function unserializeDump($str, &$i = 0) {
    $strtok = substr($str, $i);
    switch ($type = strtok($strtok, "(")) { // get type, before first parenthesis
        case "bool":
            return strtok(")") === "true"?(bool) $i += 10:!$i += 11;
        case "int":
            $int = (int)substr($str, $i + 4);
            $i += strlen($int) + 5;
            return $int;
        case "string":
            $i += 11 + ($len = (int)substr($str, $i + 7)) + strlen($len);
            return substr($str, $i - $len - 1, $len);
        case "float":
            return (float)($float = strtok(")")) + !$i += strlen($float) + 7;
        case "NULL":
            return NULL;
        case "array":
            $array = array();
            $len = (int)substr($str, $i + 6);
            $i = strpos($str, "\n", $i) - 1;
            for ($entries = 0; $entries \n ", $i)); } else {
                    $key = (int)substr($str, $i + 1);
                    $i += strlen($key);
                }
                $i += $indent + 5; // jump line
                $array[$key] = unserializeDump($str, $i);
            }
            $i = strpos($str, "}", $i) + 1;
            return $array;
        case "object":
            $reflection = new ReflectionClass(strtok(")"));
            $object = $reflection->newInstanceWithoutConstructor();
            $len = !strtok("(") + strtok(")");
            $i = strpos($str, "\n", $i) - 1;
            for ($entries = 0; $entries \n ", $i)?:INF, strpos($str, "\":protected]=>\n ", $i)?:INF, $priv = strpos($str, "\":\"", $i)?:INF));
                if ($priv == $i) {
                    $ref = new ReflectionClass(substr($str, $i + 3, - 3 - $i + $i = strpos($str, "\":private]=>\n ", $i)));
                    $i += $indent + 13; // jump line
                } else {
                    $i += $indent + ($str[$i+1] == ":"?15:5); // jump line
                    $ref = $reflection;
                }
                $prop = $ref->getProperty($key);
                $prop->setAccessible(true);
                $prop->setValue($object, unserializeDump($str, $i));
            }
            $i = strpos($str, "}", $i) + 1;
            return $object;
    }
    throw new Exception("Type not recognized...: $type");
}
(There are a lot of "magic" numbers when incrementing the string position counter $i; they are mostly just the string lengths of the keywords plus some parentheses etc.)
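To make those offsets less mysterious, here is a minimal sketch of the same cursor technique for int and string values only. readScalar is a hypothetical helper (not part of the answer above), and it assumes plain CLI-style var_dump output with one value per line. Each constant is written next to the token whose length it represents.

```php
<?php
// Hypothetical helper: reads one int or string from var_dump() text,
// advancing the cursor $i past the value and its trailing newline.
function readScalar(string $dump, int &$i)
{
    if (substr($dump, $i, 4) === 'int(') {
        $end = strpos($dump, ')', $i);
        $value = (int) substr($dump, $i + 4, $end - $i - 4); // 4 = strlen('int(')
        $i = $end + 2;                                       // skip ")" and "\n"
        return $value;
    }
    if (substr($dump, $i, 7) === 'string(') {
        $close = strpos($dump, ')', $i);
        $len = (int) substr($dump, $i + 7, $close - $i - 7); // 7 = strlen('string(')
        $start = $close + 3;                                 // skip ")", " " and the opening quote
        $value = substr($dump, $start, $len);                // trust the byte length, not the quotes
        $i = $start + $len + 2;                              // skip the closing quote and "\n"
        return $value;
    }
    throw new InvalidArgumentException('Unsupported type at offset ' . $i);
}

$dump = "int(42)\nstring(8) \"Foo\"bar\"\"\n";
$i = 0;
$first = readScalar($dump, $i);
$second = readScalar($dump, $i);
```

Because the string branch reads exactly as many bytes as the length prefix announces, embedded double quotes in values do not confuse it.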
Answer 3:
If you want to encode/decode an array like this, you could use var_export(), which generates output in PHP's array format. For instance:
array( 1 => 'foo', 2 => 'bar' )
could be the result of it. You would have to use eval() to get the array back, though, and that is potentially dangerous (eval() really executes PHP code, so a simple code injection could let an attacker gain control over your PHP script).
Better solutions are serialize(), which creates a serialized version of any array or object, and json_encode(), which encodes any array or object in the JSON format (preferred for data exchange between different languages).
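As a rough illustration of the three options, the sketch below round-trips the array from the question. The allowed_classes option (PHP 7.0+) is an addition here, not something the answer mentions; it stops unserialize() from instantiating objects.

```php
<?php
$data = array('this' => array('is' => 'the'), 'challenge' => array('for' => array('you')));

// serialize(): PHP-specific, keeps PHP types intact.
$viaSerialize = unserialize(serialize($data), array('allowed_classes' => false));

// json_encode(): language-neutral; pass true to json_decode() to get arrays back.
$viaJson = json_decode(json_encode($data), true);

// var_export(): returns PHP source code when the second argument is true.
// Turning it back into an array requires eval() or include, hence the warning above.
$source = var_export($data, true);
```

For this particular array both round trips give back an identical structure; JSON would not preserve PHP objects or distinguish some PHP-only types.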
Answer 4:
The trick is to match chunks of code and "strings" separately: leave the strings alone (apart from swapping the quotes) and do the replacements everywhere else:
$out = preg_replace_callback('/"[^"]*"|[^"]+/','repl',$in);
function repl($m) {
    return $m[0][0]=='"'
        ? str_replace('"',"'",$m[0])
        : str_replace("(,","(",
            preg_replace("/(int\((\d+)\)|\s*|(string|)\(\d+\))/","\\2",
                strtr($m[0],"{}[]","(), ")
            )
        );
}
outputs:
array('this'=>array('is'=>'the'),'challenge'=>array('for'=>array(0=>'you')))
(Removing ascending numeric keys starting at 0 takes a little extra accounting, which can be done in the repl function.)
P.S. This doesn't solve the problem of strings containing ", but since var_dump doesn't seem to escape string contents, there is no way to solve that reliably. (You could match \["[^"]*"\] but a string may contain "] as well.)
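The missing escaping is easy to confirm. The following sketch captures var_dump output with output buffering, assuming plain CLI formatting (an extension such as Xdebug may decorate the output differently):

```php
<?php
// Capture what var_dump() prints for a string that contains double quotes.
ob_start();
var_dump('Foo"bar"');
$dump = ob_get_clean();

// The inner quotes appear verbatim; only the byte count in string(8)
// tells a parser where the value really ends.
$unescaped = strpos($dump, 'string(8) "Foo"bar""') !== false;
```

This is why a purely quote-based regex cannot be fully reliable, while a parser that honours the length prefix can at least recover string values.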
Answer 5:
Use a regexp to change array(.) { (.*) } to array($1) and eval the code. This is not as easy as it sounds, because you have to deal with matching brackets and so on; it is just a clue on how to find a solution ;)
- This will be helpful if you can't change var_dump to var_export or serialize.
Answer 6:
I think you are looking for the serialize function:
serialize: Generates a storable representation of a value
It lets you save the contents of an array as a plain string, and later you can read the array back with the unserialize function.
Using these functions, you can store and retrieve arrays in text/flat files as well as in a database.
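A minimal sketch of the flat-file case, assuming a temporary file as the storage location (the path and the allowed_classes option are illustrative choices, not part of the answer):

```php
<?php
$data = array('this' => array('is' => 'the'));

// Hypothetical storage location: a throwaway file in the system temp directory.
$path = tempnam(sys_get_temp_dir(), 'dump');

// Store the serialized string, then read it back and rebuild the array.
file_put_contents($path, serialize($data));
$restored = unserialize(file_get_contents($path), array('allowed_classes' => false));

unlink($path); // clean up the temporary file
```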