Regular Expressions, understanding lookbehind in combination with the or operator

牧云@^-^@ submitted on 2019-12-01 12:07:17

You are correct,

(?<=\"[0-9]|\"[0-9]{2}|\"[0-9]{3})(,)(?=[0-9]{2}\")

should be the right regex in this case.


As for why you "don't need the \" for two and three digits": you actually do need it. Without it,

(?<=\"[0-9]|[0-9]{2}|[0-9]{3})(,)(?=[0-9]{2}\")

will also match the comma in 12,23" and 123,23", where no opening quote precedes the digits.
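The effect of dropping the \" can be checked with a small Python sketch. This is an illustration under assumptions, not the Sublime setup from the question: Python's built-in re module is used, each alternative is written as its own lookbehind because re only accepts fixed-width lookbehinds, and the names STRICT, LOOSE and samples are made up for the example.

```python
import re

# One lookbehind per alternative: Python's re only accepts fixed-width lookbehinds.
STRICT = re.compile(r'(?:(?<="[0-9])|(?<="[0-9]{2})|(?<="[0-9]{3}))(,)(?=[0-9]{2}")')
# Same pattern with the opening quote dropped from the two- and three-digit cases.
LOOSE = re.compile(r'(?:(?<="[0-9])|(?<=[0-9]{2})|(?<=[0-9]{3}))(,)(?=[0-9]{2}")')

samples = ['"12,23"', '12,23"', '123,23"']  # hypothetical test strings
strict_hits = [s for s in samples if STRICT.search(s)]
loose_hits = [s for s in samples if LOOSE.search(s)]

print(strict_hits)  # ['"12,23"']
print(loose_hits)   # ['"12,23"', '12,23"', '123,23"']
```

Only the pattern that keeps the quote in every alternative rejects the strings without an opening quote.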


EDIT: It looks like the problem is that Sublime does not allow variable-length lookbehind, even when the alternatives are listed with |. That means (?<=\"[0-9]|\"[0-9]{2}|\"[0-9]{3}) will fail, because the alternatives have different lengths: 2, 3 and 4 characters.

This is because Sublime seems to use the Boost regex library, whose documentation states:

Lookbehind

(?<=pattern) consumes zero characters, only if pattern could be matched against the characters preceding the current position (pattern must be of fixed length).

(?<!pattern) consumes zero characters, only if pattern could not be matched against the characters preceding the current position (pattern must be of fixed length).
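The same kind of restriction can be observed outside Sublime. As a rough analogy only (Python's built-in re module is a different engine from Boost, and the variable names are made up for this sketch), re also rejects a lookbehind whose alternatives differ in length:

```python
import re

# Alternatives of length 2, 3 and 4 inside a single lookbehind.
variable_lookbehind = r'(?<="[0-9]|"[0-9]{2}|"[0-9]{3})(,)(?=[0-9]{2}")'

try:
    re.compile(variable_lookbehind)
    compile_error = None
except re.error as exc:
    # Python refuses the pattern at compile time, before any matching happens.
    compile_error = str(exc)

print(compile_error)
```

The pattern is refused when it is compiled, so no input string is ever tested against it.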

A workaround is to give each alternative its own fixed-length lookbehind:

(?:(?<=\"[0-9])|(?<=\"[0-9]{2})|(?<=\"[0-9]{3}))(,)(?=[0-9]{2}\")
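A minimal sketch of the separated lookbehinds in use, assuming Python's built-in re module (which also requires fixed-width lookbehinds) and a made-up sample string. The goal assumed here is to turn the decimal comma into a dot:

```python
import re

# Each lookbehind has a fixed width (2, 3 or 4 characters), so the pattern compiles.
separated = re.compile(
    r'(?:(?<="[0-9])|(?<="[0-9]{2})|(?<="[0-9]{3}))(,)(?=[0-9]{2}")'
)

text = '"1,23" "12,34" "123,45" "1234,56" 12,34"'  # hypothetical input
fixed = separated.sub('.', text)

print(fixed)  # "1.23" "12.34" "123.45" "1234,56" 12,34"
```

Numbers with one to three digits before the comma are rewritten. The four-digit number and the unquoted one are left alone, because no listed lookbehind fits them.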


What can you do if you don't want to list all possible lengths?

Some regex engines (including Perl's, Ruby's and Sublime's) support a handy trick: \K. It roughly means "drop everything matched so far from the reported match". You can therefore match any comma inside a quoted decimal number with:

"\d+\K,(?=\d+")
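Python's built-in re module does not support \K, so the sketch below swaps in a different technique with the same effect: the prefix is captured in a group and put back in the replacement. The function name and the sample string are assumptions made for the example.

```python
import re

def replace_decimal_comma(text: str) -> str:
    """Replace the comma in quoted numbers such as "1234,56" with a dot."""
    # ("\d+) plays the role of the part before \K: it is matched, then restored via \1.
    return re.sub(r'("\d+),(?=\d+")', r'\1.', text)

result = replace_decimal_comma('"1234,56" and "7,5" but 8,9')
print(result)  # "1234.56" and "7.5" but 8,9
```

Unlike the lookbehind versions, this handles any number of digits before the comma, because a capturing group has no fixed-length restriction.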

