Python Regex Sub with Multiple Patterns

北城余情 posted on 2021-01-28 00:31:19

Question


I'm trying to match multiple patterns with one regex, using grouped alternatives in re.sub, and replace each matched value with an asterisk. The data file has a format similar to the string below. However, I get the desired result only for the first match; the subsequent matches consume parts of the string that I expected to be kept. Is there a better approach to get the desired output below?

    import re
    myString = '-fruit apple -number    123 -animal  cat  -name     bob'

    match = re.compile('(-fruit\s+)(\w+)|'
                       '(-animal\s+)(cat)|'
                       '(-name\s+)(bob)')
    print(match.sub('\g<1>*', myString))

Current Output:

-fruit * -number    123 *  *

Desired Output:

-fruit * -number    123 -animal  *  -name     *

Answer 1:


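Alternation does not reset group numbering, so your groups are numbered (1)(2)|(3)(4)|(5)(6). Your replacement reinserts only group 1, but the option prefix is captured by group 3 in the second alternative and by group 5 in the third, so those prefixes are dropped. Because groups that did not take part in a match are replaced with an empty string (Python 3.5 and later), you can simply reference all three prefix groups in the replacement: \g<1>\g<3>\g<5>*.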
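A minimal sketch of that fix, reusing the string and pattern from the question. The names `pattern` and `result` are chosen here for illustration, and the behaviour relies on Python 3.5 or later, where unmatched groups expand to an empty string in the replacement:

```python
import re

myString = '-fruit apple -number    123 -animal  cat  -name     bob'

# Groups are numbered 1..6 across the whole pattern;
# the option prefixes live in groups 1, 3 and 5.
pattern = re.compile(r'(-fruit\s+)(\w+)|'
                     r'(-animal\s+)(cat)|'
                     r'(-name\s+)(bob)')

# For any single match only one of groups 1, 3, 5 took part;
# the other two expand to '' (Python 3.5+).
result = pattern.sub(r'\g<1>\g<3>\g<5>*', myString)
print(result)
# -fruit * -number    123 -animal  *  -name     *
```

Only the replacement template differs in behaviour from the question's code; the raw-string prefixes leave the pattern itself unchanged.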

As a side note, I recommend using raw strings for regex patterns (r'pattern'), so you do not have to work out which backslashes need doubling (e.g. \\b).
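To illustrate why raw strings matter, here is a small sketch using \b, the word-boundary escape. The sample text and variable names are made up for this example:

```python
import re

text = 'cat concat'

# In a normal string literal '\b' is the backspace character (0x08),
# so this pattern looks for backspace + 'cat' + backspace.
plain_hits = re.findall('\bcat\b', text)

# In a raw string the backslash is preserved and the regex engine
# sees \b, the word-boundary assertion.
raw_hits = re.findall(r'\bcat\b', text)

print(plain_hits)  # []
print(raw_hits)    # ['cat']
```

Escapes such as \s and \g happen to survive in normal strings because Python does not recognise them as string escapes, but recent Python versions warn about such sequences, so the raw-string form is the safer habit.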



Source: https://stackoverflow.com/questions/51606032/python-regex-sub-with-multiple-patterns
