From an external source I'm getting strings like
array(1,2,3)
but also larger arrays like
array("a", "b", "c", ar
Whilst writing a parser using the Tokenizer, which turned out to be harder than I expected, I came up with another idea: why not parse the array using eval, but first validate that it contains nothing harmful?
So, what the code does: it checks the tokens of the array against a whitelist of allowed tokens and characters, and only then executes eval. I hope I included all harmless tokens; if not, simply add them. (I intentionally didn't include HEREDOC and NOWDOC, because I think they are unlikely to be used.)
function parseArray($code) {
    $allowedTokens = array(
        T_ARRAY                    => true,
        T_CONSTANT_ENCAPSED_STRING => true,
        T_LNUMBER                  => true,
        T_DNUMBER                  => true,
        T_DOUBLE_ARROW             => true,
        T_WHITESPACE               => true,
    );
    $allowedChars = array(
        '(' => true,
        ')' => true,
        ',' => true,
    );
    $tokens = token_get_all('
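The listing breaks off at the token_get_all() call. As a rough sketch of how the rest of the check described above might look: everything after that call is my assumption, pieced together from the description and from the error message shown further down, so it is not necessarily the original code. In particular, prepending an opening tag before tokenizing and using `return` inside eval are assumed details.

```php
<?php
// Sketch only: everything from the token_get_all() call onwards is an
// assumption, not the original code.
function parseArray($code) {
    $allowedTokens = array(
        T_ARRAY                    => true,
        T_CONSTANT_ENCAPSED_STRING => true,
        T_LNUMBER                  => true,
        T_DNUMBER                  => true,
        T_DOUBLE_ARROW             => true,
        T_WHITESPACE               => true,
    );
    $allowedChars = array('(' => true, ')' => true, ',' => true);

    // Assumption: prepend an opening tag so the tokenizer treats $code as PHP,
    // then drop the resulting T_OPEN_TAG token again.
    $tokens = token_get_all('<?php ' . $code);
    array_shift($tokens);

    foreach ($tokens as $token) {
        if (is_string($token)) {
            // Single-character tokens are returned as plain strings.
            if (!isset($allowedChars[$token])) {
                throw new Exception("Disallowed token '" . $token . "' encountered.");
            }
            continue;
        }
        // Every other token is an array whose first element is the token id.
        if (!isset($allowedTokens[$token[0]])) {
            throw new Exception("Disallowed token '" . token_name($token[0]) . "' encountered.");
        }
    }

    // Only reached when every token passed the whitelist.
    return eval('return ' . $code . ';');
}

var_dump(parseArray('array(1, 2, 3)'));

try {
    parseArray('exec("haha -i -thought -i -was -smart")');
} catch (Exception $e) {
    echo $e->getMessage(), "\n"; // Disallowed token 'T_STRING' encountered.
}
```

Because this is a whitelist, anything not listed (function names, variables, operators, backticks) is rejected by default. Note that in this sketch that also covers a leading minus sign and the true, false and null keywords, so those would have to be allowed explicitly if the input uses them.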
I think this is a good compromise between security and convenience: there is no need to write a parser yourself.
For example
parseArray('exec("haha -i -thought -i -was -smart")');
would throw an exception:
Disallowed token 'T_STRING' encountered.
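To see why that call is rejected, it can help to print the token names the Tokenizer produces for a given string. A small sketch, where `tokenNames` is a hypothetical helper of mine and not part of the code above:

```php
<?php
// Hypothetical helper: list the token names the Tokenizer yields for a snippet.
function tokenNames($code) {
    $tokens = token_get_all('<?php ' . $code);
    array_shift($tokens); // drop the opening tag token

    $names = array();
    foreach ($tokens as $token) {
        // Single-character tokens are plain strings; the rest are arrays
        // whose first element is the token id.
        $names[] = is_string($token) ? $token : token_name($token[0]);
    }
    return $names;
}

echo implode(' ', tokenNames('array(1,2,3)')), "\n";
// T_ARRAY ( T_LNUMBER , T_LNUMBER , T_LNUMBER )

echo implode(' ', tokenNames('exec("haha")')), "\n";
// T_STRING ( T_CONSTANT_ENCAPSED_STRING )
```

The function name exec comes back as T_STRING, which is not in the allowed list, so the check fails before eval is ever reached.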