Split SQL statements in PHP on semicolons (but not inside quotes)

Submitted by 爷,独闯天下 on 2019-12-04 09:28:33
zx81

(*SKIP)(*FAIL) Magic

This live PHP demo shows you the output of the two options below (with or without the semi-colon).

This is what you need:

$splits = preg_split('~\([^)]*\)(*SKIP)(*F)|;~', $sql);

See the demo to confirm that we are splitting on the right semi-colons.

Output:

[0] => BEGIN
[1] => INSERT INTO TABLE_A (a, b, c) VALUES('42', '12', '\'ab\'c; DEF')
[2] => INSERT INTO TABLE_B (d, e, f) VALUES('42', '43', 'XY\'s Z ;uvw')
[3] => COMMIT
[4] =>

The empty item #4 is the empty string that follows the final ;. The other option is to keep the semi-colons (see below).
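If the trailing empty item is unwanted, one possible approach is to pass PHP's PREG_SPLIT_NO_EMPTY flag to preg_split. The sketch below assumes the sample input has the shape implied by the output above; the $sql string is reconstructed from that output, not copied from the original question.

```php
<?php
// Sample input reconstructed from the output listed above (an assumption).
// A nowdoc keeps the backslashes and quotes exactly as written.
$sql = <<<'SQL'
BEGIN;INSERT INTO TABLE_A (a, b, c) VALUES('42', '12', '\'ab\'c; DEF');INSERT INTO TABLE_B (d, e, f) VALUES('42', '43', 'XY\'s Z ;uvw');COMMIT;
SQL;

// Same pattern as above; limit -1 means "no limit",
// and PREG_SPLIT_NO_EMPTY discards empty pieces such as item #4.
$splits = preg_split('~\([^)]*\)(*SKIP)(*F)|;~', $sql, -1, PREG_SPLIT_NO_EMPTY);

print_r($splits);
```

With this flag the result should contain only the four statements, without the empty fifth element.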

Option 2: Keep the Semi-Colons

If you want to keep the semi-colons, go with this:

$splits = preg_split('~\([^)]*\)(*SKIP)(*F)|(?<=;)(?![ ]*$)~', $sql);

Output:

[0] => BEGIN;
[1] => INSERT INTO TABLE_A (a, b, c) VALUES('42', '12', '\'ab\'c; DEF');
[2] => INSERT INTO TABLE_B (d, e, f) VALUES('42', '43', 'XY\'s Z ;uvw');
[3] => COMMIT;

Explanation

This problem is a classic case of the technique explained in this question to "regex-match a pattern, excluding..."

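On the left side of the alternation |, the regex \([^)]*\) matches a complete (parenthesized) group, then (*SKIP)(*F) deliberately fails the match, after which the engine resumes just past the closing parenthesis instead of retrying inside the group. The right side matches the ; you want, and we know these are the right ones because they were not consumed by the expression on the left. It is now safe to split on them. Note that the pattern guards parentheses rather than quotes as such: it works on this input because every quoted string sits inside a (...) group that contains no ) of its own.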
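The same (*SKIP)(*F) idea can be pointed at the quoted strings themselves. The following is a minimal sketch of such a variant, not the pattern from the answer above: it assumes string literals are single-quoted and use backslash escapes, and the $sql input is a hypothetical example in which a quoted string contains a ) followed by a ;.

```php
<?php
// Hypothetical input: the first literal contains ")" and ";" inside quotes.
$sql = <<<'SQL'
INSERT INTO t (a) VALUES ('x) ; y');DELETE FROM t WHERE a = 'it\'s; done';
SQL;

// Left side: a single-quoted literal, where \\. lets a backslash escape
// any character (assumed escaping style). (*SKIP)(*F) discards it.
// Right side: any semicolon the left side did not consume.
$pattern = <<<'RE'
~'(?:[^'\\]|\\.)*'(*SKIP)(*F)|;~s
RE;

$splits = preg_split($pattern, $sql, -1, PREG_SPLIT_NO_EMPTY);

print_r($splits);
// [0] => INSERT INTO t (a) VALUES ('x) ; y')
// [1] => DELETE FROM t WHERE a = 'it\'s; done'
```

This sketch does not handle other quoting styles (double quotes, doubled '' escapes, comments); those would need additional alternatives on the left side.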

In Option 2, where we keep the semi-colons, our match on the right matches a position, but no characters. That position is asserted by the lookbehind (?<=;), which asserts that a ; immediately precedes the position, and the negative lookahead (?![ ]*$), which asserts that what follows is not optional spaces then the end of the string (so we avoid a last empty match).
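To see the effect of the negative lookahead in isolation, here is a small sketch with a hypothetical input that ends in a semi-colon followed by two spaces. The input string is an assumption made for illustration only.

```php
<?php
// Hypothetical input: two statements, with trailing spaces after the last ";".
$sql = "BEGIN;COMMIT;  ";

// Option 2 pattern: split at the zero-width position after a ";",
// unless only spaces remain before the end of the string.
$pattern = '~\([^)]*\)(*SKIP)(*F)|(?<=;)(?![ ]*$)~';

$splits = preg_split($pattern, $sql);

var_dump($splits);
// Two items: "BEGIN;" and "COMMIT;  " (trailing spaces kept, no empty item)
```

The position after the final ; is rejected by (?![ ]*$), so no empty trailing element is produced; the trailing spaces simply stay attached to the last statement.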

Sample Code

Please examine the live PHP demo.
