Regex using word boundary but word ends with a . (period)

走了就别回头了 2020-12-11 01:45

I want to match the word i.v., case-insensitively.

I have this pattern:

(?i)\\bi\\.v\\.

but I want a word boundary at the end as well.
the abov

3 Answers
  •  小蘑菇 (original poster)
    2020-12-11 02:26

    \b only matches between a word character (a letter, digit, or underscore) and a non-word character, or between a word character and the start/end of the string. Therefore, it doesn't match after a ., unless a word character immediately follows that dot.
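To see this behaviour concretely, here is a minimal sketch in Python (chosen only for illustration; the question does not say which regex engine is in use, and the variable names are made up for this example). It appends a trailing \b to the pattern from the question and tries it against two inputs:

```python
import re

# Pattern from the question with a trailing \b added (the attempt that fails).
with_trailing_b = re.compile(r"(?i)\bi\.v\.\b")

# Dot followed by a space: both neighbours of that position are non-word
# characters, so \b cannot match there and the whole pattern fails.
followed_by_space = with_trailing_b.search("give I.V. fluids")

# Dot followed by a letter: now there is a non-word/word transition,
# so \b matches, which is the opposite of what was wanted.
followed_by_letter = with_trailing_b.search("i.v.x")

print(followed_by_space)            # no match object
print(followed_by_letter.group())   # the matched text
```

The trailing \b therefore accepts exactly the case the asker wants to reject and rejects the case they want to accept.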

    If your intent is to make sure that no non-whitespace character follows after the dot, then you can specify that using a negative lookahead assertion:

    (?i)\bi\.v\.(?!\S)
    

    (?!\S) means "Assert that the next character is not a non-whitespace character".

    This may sound a bit convoluted - why the double negative? Why not (?=\s) which means "Assert that the next character is a whitespace character"? Well, there is a subtle difference: The second version requires a whitespace character to be there; that means the regex would fail to match at the end of the string. The first regex handles that corner case as well.
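The difference between the two lookaheads can be checked with a short sketch, again in Python as an assumed host language; the sample strings and names are hypothetical:

```python
import re

# (?!\S): nothing non-whitespace may follow the final dot.
negative = re.compile(r"(?i)\bi\.v\.(?!\S)")
# (?=\s): a whitespace character must follow the final dot.
positive = re.compile(r"(?i)\bi\.v\.(?=\s)")

samples = ["start the i.v. now", "start the I.V.", "i.v.x"]

# Map each sample to (matched by negative lookahead, matched by positive lookahead).
results = {s: (bool(negative.search(s)), bool(positive.search(s))) for s in samples}

for text, (neg, pos) in results.items():
    print(f"{text!r}: (?!\\S) -> {neg}, (?=\\s) -> {pos}")
```

Only the second sample, where i.v. sits at the very end of the string, separates the two versions: the negative lookahead accepts it and the positive one does not.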

    If you generally want the concept of "word boundary" to mean "space-delimited", then you need to replace the first \b as well:

    (?i)(?<!\S)i\.v\.(?!\S)

    or the regex will match sam.i.v. which you don't seem to want it to.
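As a sketch of the space-delimited idea, the example below assumes a negative lookbehind (?<!\S) as the replacement for the leading \b, mirroring the (?!\S) lookahead at the end. Python is used only for illustration, and the names are hypothetical:

```python
import re

# Leading \b kept: a boundary exists between "." and "i", so "sam.i.v." matches.
boundary_start = re.compile(r"(?i)\bi\.v\.(?!\S)")

# Assumed replacement: (?<!\S) requires that no non-whitespace character
# precedes the match, so the word must start the string or follow whitespace.
space_delimited = re.compile(r"(?i)(?<!\S)i\.v\.(?!\S)")

embedded = "sam.i.v."
standalone = "take I.V. daily"

print(bool(boundary_start.search(embedded)))     # leading \b lets this through
print(bool(space_delimited.search(embedded)))    # lookbehind rejects it
print(space_delimited.search(standalone).group())
```

With both lookarounds in place, the word has to be surrounded by whitespace or by the start/end of the string on each side.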
