I need to split a string by commas and spaces, but ignore any that appear inside double quotes, single quotes, or parentheses.
$str = "Questions, \"Quote\",'single quote','co
Well, this works for the data you supplied:
$rgx = <<<'EOT'
/
[,\s]++
(?=(?:(?:[^"]*+"){2})*+[^"]*+$)
(?=(?:(?:[^']*+'){2})*+[^']*+$)
(?=(?:[^()]*+\([^()]*+\))*+[^()]*+$)
/x
EOT;
The lookaheads assert that if there are any double-quotes, single-quotes or parentheses ahead of the current match position there's an even number of them, and the parens are in balanced pairs (no nesting allowed). That's a quick-and-dirty way to ensure that the current match isn't occurring inside a pair of quotes or parens.
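Here is a minimal sketch of how that pattern could be used with preg_split. The sample string and the variable names are made up for illustration (the string in the question is cut off), so treat it as a demonstration of the idea rather than a tested solution for the real data.

```php
<?php
// Hypothetical sample input: one double-quoted, one single-quoted,
// and one parenthesized group, each containing a separator.
$input = 'Questions, "double quote", \'single quote\', (some, parens) last';

// Same pattern as above. The nowdoc keeps the backslashes literal.
$rgx = <<<'EOT'
/
[,\s]++
(?=(?:(?:[^"]*+"){2})*+[^"]*+$)
(?=(?:(?:[^']*+'){2})*+[^']*+$)
(?=(?:[^()]*+\([^()]*+\))*+[^()]*+$)
/x
EOT;

// Split on every run of commas/whitespace that passes all three lookaheads.
$parts = preg_split($rgx, $input, -1, PREG_SPLIT_NO_EMPTY);

print_r($parts);
// Elements, in order:
//   Questions
//   "double quote"
//   'single quote'
//   (some, parens)
//   last
```

Note that the quotes and parentheses stay in the resulting pieces, because the pattern only matches the separators between them.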
Of course, it assumes the input is well formed. But on the subject of well-formedness, what about escaped quotes within quotes? What if you have quotes inside parens, or vice versa? Would this input be legal?
"not a \" quote", 'not a ) quote', (not ",' quotes)
If so, you've got a much more difficult job ahead of you.
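To see why, here is a small sketch that runs the lookahead pattern against that exact line. The variable names are illustrative. Every separator in the line has an odd number of quotes or an unmatched paren somewhere ahead of it, so none of them passes all three lookaheads and nothing gets split.

```php
<?php
// The tricky input from above, in a nowdoc so no extra escaping is needed.
$tricky = <<<'TXT'
"not a \" quote", 'not a ) quote', (not ",' quotes)
TXT;

// The lookahead-based pattern, unchanged.
$lookaheadRgx = <<<'EOT'
/
[,\s]++
(?=(?:(?:[^"]*+"){2})*+[^"]*+$)
(?=(?:(?:[^']*+'){2})*+[^']*+$)
(?=(?:[^()]*+\([^()]*+\))*+[^()]*+$)
/x
EOT;

$pieces = preg_split($lookaheadRgx, $tricky, -1, PREG_SPLIT_NO_EMPTY);

// The three single quotes, the stray ")" and the quotes inside the
// parens defeat the even-count checks, so the line comes back whole.
var_dump(count($pieces));         // int(1)
var_dump($pieces[0] === $tricky); // bool(true)
```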
This will work only for non-nested parentheses:
$regex = <<<HERE
/ " ( (?:[^"\\\\]++|\\\\.)*+ ) \"
| ' ( (?:[^'\\\\]++|\\\\.)*+ ) \'
| \( ( [^)]* ) \)
| [\s,]+
/x
HERE;
$tags = preg_split($regex, $str, -1,
PREG_SPLIT_NO_EMPTY
| PREG_SPLIT_DELIM_CAPTURE);
The ++ and *+ quantifiers are possessive: they consume as much as they can and give nothing back for backtracking. This technique is described in perlre(1) as the most efficient way to do this kind of matching.
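For reference, a self-contained sketch of that approach run against a made-up string (the one in the question is truncated, so the input and variable names here are assumptions). With PREG_SPLIT_DELIM_CAPTURE the captured contents of each quoted or parenthesized group are returned as elements, while the plain separators are dropped.

```php
<?php
// Hypothetical input: it includes an escaped quote inside double quotes.
$sample = 'Questions, "double \" quote", \'single quote\', (some, parens) last';

// Same pattern as above. In a heredoc, \\\\ becomes \\ in the final
// pattern, which the regex engine reads as one literal backslash.
$splitRgx = <<<HERE
/ " ( (?:[^"\\\\]++|\\\\.)*+ ) \"
| ' ( (?:[^'\\\\]++|\\\\.)*+ ) \'
| \( ( [^)]* ) \)
| [\s,]+
/x
HERE;

$found = preg_split($splitRgx, $sample, -1,
    PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

print_r($found);
// Elements, in order:
//   Questions
//   double \" quote
//   single quote
//   some, parens
//   last
```

Unlike the lookahead version, this one strips the surrounding quotes and parentheses, and it leaves the backslash of an escaped quote in place. Because of PREG_SPLIT_NO_EMPTY, an empty quoted string such as "" would be dropped from the result.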