Splitting a string that has escape sequence using regular expression in Java

暗喜 2020-12-10 13:29

String to be split:

abc:def:ghi\:klm:nop

The string should be split on ":", with "\" as the escape character, so "\:" should not be treated as a delimiter.

2 answers
  •  心在旅途
    2020-12-10 14:17

    Gumbo was right to use a look-behind assertion, but if your string contains an escaped escape character (e.g. \\) directly in front of a comma, the split can break. Consider this example:

    test1\,test1,test2\\,test3\\\,test3\\\\,test4

    If you do a simple look-behind split on (?<!\\), as Gumbo suggested, the string is split into only two parts, test1\,test1 and test2\\,test3\\\,test3\\\\,test4. This is because the look-behind checks just one character back for the escape character. The correct behavior is to split on every comma preceded by an even number of escape characters (including zero).
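To make the failure concrete, here is a minimal Java sketch that applies a single-character look-behind to the example string. The class and method names are hypothetical, and the pattern is written as a Java string literal (each regex backslash doubled), assuming the simple expression is "a comma not directly preceded by one backslash".

```java
import java.util.Arrays;
import java.util.List;

final class SimpleLookBehindDemo {
    // Regex (?<!\\), : a comma that is not directly preceded by one backslash.
    static final String SIMPLE = "(?<!\\\\),";

    static List<String> split(String input) {
        return Arrays.asList(input.split(SIMPLE));
    }

    public static void main(String[] args) {
        // The example string test1\,test1,test2\\,test3\\\,test3\\\\,test4
        String input = "test1\\,test1,test2\\\\,test3\\\\\\,test3\\\\\\\\,test4";
        for (String part : split(input)) {
            System.out.println(part);
        }
    }
}
```

Only the comma after the second test1 has no backslash directly in front of it, so the sketch yields the two parts described above.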

    To achieve this a slightly more complex (double) look-behind expression is needed:

    (?

    Using this more complex regular expression in Java again requires escaping every \ as \\, so a more robust answer to your question is:

    "any comma separated string".split("(?
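As a hedged illustration of the idea, the Java sketch below splits on a comma only when it is preceded by an even number of backslashes. The pattern is one way to write a nested look-behind with a {0,10} bound and may differ from the expression the answer intended; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class EscapedCommaSplit {
    // Regex: (?<=(?<!\\)(?:\\\\){0,10}),
    // A comma preceded by 0 to 10 pairs of backslashes, where the run of
    // pairs is not itself preceded by another backslash. That means an even
    // number of backslashes (at most 20) sits in front of the comma.
    static final String EVEN_ESCAPES = "(?<=(?<!\\\\)(?:\\\\\\\\){0,10}),";

    static List<String> split(String input) {
        return Arrays.asList(input.split(EVEN_ESCAPES));
    }

    public static void main(String[] args) {
        // The example string test1\,test1,test2\\,test3\\\,test3\\\\,test4
        String input = "test1\\,test1,test2\\\\,test3\\\\\\,test3\\\\\\\\,test4";
        for (String part : split(input)) {
            System.out.println(part);
        }
    }
}
```

On the example string this yields four parts: test1\,test1, then test2\\, then test3\\\,test3\\\\, and finally test4. For the colon-separated input from the question, the trailing , in the pattern would become :.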

    Note: Java does not support unbounded repetition inside look-behinds, so the expression uses {0,10} to check at most 10 repeated pairs of escape characters. If needed, you can raise this limit by increasing the second number.
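If the fixed repetition limit is a concern, one alternative is to drop the regular expression and scan the string by hand. This is a different technique from the look-behind approach above, shown only as a sketch; the class name, method name, and parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class EscapeAwareSplitter {
    // Splits on every delimiter that is not escaped. Escape characters are
    // kept in the output, as with a regex split. Unlike String.split,
    // trailing empty fields are kept.
    static List<String> split(String input, char delimiter, char escape) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean escaped = false; // true if the previous char was an active escape

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (escaped) {
                current.append(c);   // escaped char is taken literally
                escaped = false;
            } else if (c == escape) {
                current.append(c);   // keep the escape char itself
                escaped = true;
            } else if (c == delimiter) {
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }

    public static void main(String[] args) {
        // The string from the question: abc:def:ghi\:klm:nop
        System.out.println(split("abc:def:ghi\\:klm:nop", ':', '\\'));
    }
}
```

Because the scanner tracks escape state character by character, it handles any number of consecutive escape characters. On the question's input it produces abc, def, ghi\:klm and nop.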
