How to make word boundary \b not match on dashes

滥情空心 2021-01-05 04:05

I simplified my code to the specific problem I am having.

import re
pattern = re.compile(r'\bword\b')
result = pattern.sub(lambda x: "match", "-word-


        
3 Answers
  •  忘掉有多难
    2021-01-05 04:35

    Instead of word boundaries, you could match the character before and after the word with the groups (\s|^) and (\s|$).

    Breakdown: \s matches any whitespace character, which seems to be what you are trying to achieve, since you want to exclude the dashes. The ^ and $ ensure that the word is also matched when it is the first or last thing in the string (i.e. there is no character before or after it).

    Your code would become something like this:

    pattern = re.compile(r'(\s|^)(word)(\s|$)')
    result = pattern.sub(r"\1match\3", "-word- word")
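
To make the difference from the original \b pattern concrete, here is a small sketch that runs both on the same input. The function names and the sample strings other than "-word- word" are made up for illustration. The third pattern uses lookarounds, which is an alternative and not part of the answer above.

```python
import re

# Original pattern: "-" is a non-word character, so \b matches next to it.
boundary = re.compile(r'\bword\b')
# Pattern from the answer: whitespace or start/end of string on both sides.
spaced = re.compile(r'(\s|^)(word)(\s|$)')
# Alternative (an assumption, not from the answer): lookarounds that check
# the neighbours without consuming them.
lookaround = re.compile(r'(?<!\S)word(?!\S)')

def replace_boundary(text):
    return boundary.sub("match", text)

def replace_spaced(text):
    # \1 and \3 put back whatever was captured around the word.
    return spaced.sub(r"\1match\3", text)

def replace_lookaround(text):
    return lookaround.sub("match", text)

print(replace_boundary("-word- word"))    # -match- match
print(replace_spaced("-word- word"))      # -word- match
print(replace_spaced("word word"))        # match word
print(replace_lookaround("word word"))    # match match
```

One thing to be aware of: the grouped pattern consumes the whitespace on both sides of each match, so in "word word" the shared space is used up by the first match and the second occurrence is skipped. The lookaround variant does not have that limitation.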
    

    Because the delimiters are spelled out explicitly in the groups (here with \s), they can easily be replaced or extended. For example, if you wanted your words to be delimited by whitespace or commas, the pattern would become r'(,|\s|^)(word)(,|\s|$)'.
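
If the set of delimiters varies, the pattern can be built from a list. This is only a sketch: delimited_pattern and its parameters are hypothetical names, and it assumes the delimiters are literal strings rather than regex fragments.

```python
import re

def delimited_pattern(word, delimiters=(",",)):
    # Hypothetical helper: builds (d1|d2|\s|^)(word)(d1|d2|\s|$).
    # re.escape keeps characters such as "." or "|" from being
    # interpreted as regex syntax.
    extra = "".join(re.escape(d) + "|" for d in delimiters)
    before = "(" + extra + r"\s|^)"
    after = "(" + extra + r"\s|$)"
    return re.compile(before + "(" + re.escape(word) + ")" + after)

pattern = delimited_pattern("word", delimiters=(",",))
result = pattern.sub(r"\1match\3", "a,word,b -word- word")
print(result)  # a,match,b -word- match
```

Passing an empty tuple for delimiters gives back the whitespace-only pattern from the answer.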
