Exploding a string using a regular expression

Submitted by 早过忘川 on 2019-12-02 11:04:24

Question


I have a string like the one below. The letters in the example could be numbers or text, in uppercase, lowercase, or mixed case. If a value is a sentence, it is enclosed in single quotes:

$string="a,b,c,(d,e,f),g,'h, i j.',k";

How can I explode that to get the following result?

Array([0]=>"a",[1]=>"b",[2]=>"c",[3]=>"(d,e,f)",[4]=>"g",[5]=>"'h, i j.'",[6]=>"k")

I think a regular expression would be a fast and clean solution. Any ideas?

EDIT: This is what I have done so far. It is very slow for strings that have a long part between parentheses:

$separator="*"; // whatever which is not used in the string
$Pattern="'[^,]([^']+),([^']+)[^,]'";
while(ereg($Pattern,$String,$Regs)){
    $String=ereg_replace($Pattern,"'\\1$separator\\2'",$String);
}

$Pattern="\(([^(^']+),([^)^']+)\)";
while(ereg($Pattern,$String,$Regs)){
    $String=ereg_replace($Pattern,"(\\1$separator\\2)",$String);
}

return $String;

This replaces all the commas inside quotes and parentheses with $separator. Then I can explode the string on commas and replace $separator with the original comma in each piece.
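The ereg functions used above were removed in PHP 7.0. As a rough sketch, the same mask-then-explode idea can be written in a single pass with preg_replace_callback. The function name explode_masked and the NUL placeholder are assumptions made for illustration, and the sketch assumes the parentheses are not nested.

```php
<?php
// Hypothetical helper: hide commas inside '...' and (...) behind a placeholder,
// explode on the remaining commas, then restore the hidden commas.
// Assumes $separator never occurs in $string and parentheses are not nested.
function explode_masked(string $string, string $separator = "\x00"): array
{
    // One pass: every quoted or parenthesized run gets its commas masked.
    $masked = preg_replace_callback(
        "~'[^']*'|\\([^)]*\\)~",
        function (array $m) use ($separator): string {
            return str_replace(',', $separator, $m[0]);
        },
        $string
    );

    // Only top-level commas are left to split on.
    $parts = explode(',', $masked);

    // Put the original commas back in each piece.
    return array_map(
        function (string $part) use ($separator): string {
            return str_replace($separator, ',', $part);
        },
        $parts
    );
}

print_r(explode_masked("a,b,c,(d,e,f),g,'h, i j.',k"));
```

Because each protected run is rewritten once by the callback, no while loop over the whole string is needed.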


Answer 1:


You can do the job using preg_match_all:

$string="a,b,c,(d,e,f),g,'h, i j.',k";

preg_match_all('~\'[^\']++\'|\([^)]++\)|[^,]++~', $string,$result);
print_r($result[0]);

Explanation:

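The trick is to try the quoted and parenthesized alternatives before the plain alternative that stops at a comma.

~          pattern delimiter
'          a literal single quote
[^']       any character except a single quote
++         one or more times, in possessive mode (the match is never given back)
'          a literal single quote
|          or
\([^)]++\) the same with parentheses
|          or
[^,]       any character except a comma
++         one or more times, in possessive mode
~          pattern delimiter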
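As a small sketch of how this pattern behaves, the call can be wrapped in a function and tried on the sample string. The name split_fields is made up for this example and is not part of the answer.

```php
<?php
// Hypothetical wrapper around the pattern; split_fields is an illustrative name.
function split_fields(string $string): array
{
    // Alternatives are tried left to right at each position:
    // a quoted run, then a parenthesized run, then a plain run without commas.
    preg_match_all("~'[^']++'|\\([^)]++\\)|[^,]++~", $string, $result);
    return $result[0];
}

$fields = split_fields("a,b,c,(d,e,f),g,'h, i j.',k");
print_r($fields);
```

One thing to keep in mind: every alternative needs at least one character, so empty fields are skipped. With this sketch, "a,,b" gives two items, not three.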

If you have more than one kind of delimiter that, like quotes, uses the same character to open and close, you can write the pattern like this, using a capture group:

$string="a,b,c,(d,e,f),g,'h, i j.',k,°l,m°,#o,p#,@q,r@,s";

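// The u modifier makes PCRE treat the multibyte ° as one character when the script is saved as UTF-8.
preg_match_all('~([\'#@°]).*?\1|\([^)]++\)|[^,]++~u', $string, $result);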
print_r($result[0]);

Explanation:

(['#@°])   one character from the class is captured in group 1
.*?        any character, zero or more times, in lazy mode
\1         the content of group 1, i.e. the same character that opened the field
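To see what the back-reference buys you, here is a minimal sketch with only two delimiters (' and #) and a made-up test string. A # inside a single-quoted field does not close it, because \1 only accepts the character that opened the field.

```php
<?php
// Reduced version of the pattern above: two delimiters, no parentheses.
// The test string is invented for this illustration.
preg_match_all('~([\'#]).*?\1|[^,]++~', "'a,#b',c,#d,'e#", $result);

// Full matches are in $result[0]; the opening delimiters are in $result[1].
print_r($result[0]);
```

The first field stays in one piece as 'a,#b' and the last one as #d,'e#, with c in between.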

With nested parentheses:

$string="a,b,(c,(d,(e),f),t),g,'h, i j.',k,°l,m°,#o,p#,@q,r@,s";

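// (?-1) recurses into the nearest preceding capture group, here the parenthesized one.
// The u modifier makes PCRE treat the multibyte ° as one character when the script is saved as UTF-8.
preg_match_all('~([\'#@°]).*?\1|(\((?>[^()]++|(?-1)?)*\))|[^,]++~u', $string, $result);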
print_r($result[0]);
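The recursive part is hard to read on one line. Here is a sketch of the same idea in extended mode (the x modifier), reduced to single quotes and parentheses so each piece can carry a comment. It is a simplified rewrite, not the exact pattern above: the group number changes because the quote capture group is dropped, so the recursion is written as (?1).

```php
<?php
// Extended-mode (x) sketch of the nested-parentheses idea.
// Assumption: only single quotes are used as a quote delimiter here.
$pattern = <<<'RE'
~
  '[^']*+'              # a quoted field
|
  (                     # group 1: one balanced parenthesized field
    \(
    (?> [^()]++         #   a run of non-parenthesis characters
      | (?1)            #   or a nested balanced field: recurse into group 1
    )*
    \)
  )
|
  [^,]++                # a plain field
~x
RE;

preg_match_all($pattern, "a,b,(c,(d,(e),f),t),g,'h, i j.',k", $result);
print_r($result[0]);
```

If the parentheses are not balanced, the recursive alternative fails and the plain alternative takes over, so a string like "(a,b" is simply split at the comma.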


Source: https://stackoverflow.com/questions/16476744/exploding-a-string-using-a-regular-expression
