Python's re.split() not removing all matched characters

爱一瞬间的悲伤 2021-01-19 17:03

This is driving me absolutely nuts. I am positive that the entire date range at the start of the string is being matched by the regex. Yet, when I do re.split, an 8

1 answer
  • 2021-01-19 17:39

    From the docs for re.split:

    If capturing parentheses are used in pattern, then the text of all groups in the pattern are also returned as part of the resulting list.

    Your pattern does have a capturing group, and because that group is repeated, it keeps only the last thing it matched: the character 8. That's why 8 shows up in the result.
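
    To see this behavior in isolation, here is a minimal sketch. The question's string is truncated, so the value of a below is a hypothetical stand-in (any 21-character date range ending in 8 would do), and the capturing pattern is inferred by dropping the ?: from the fix that follows.

```python
import re

# Hypothetical input: the original string is not shown in full, so this
# 21-character date range ending in "8" is an assumption for illustration.
a = "08/27/2018-12/15/2018 Lecture Wednesday 01:30PM - 02:45PM, Room to be Announced"

# Assumed original pattern: a capturing group repeated 21 times.
capturing = r"([0-9]|\/|-){21}"

match = re.match(capturing, a)
print(match.group(0))  # the whole match covers the full date range
print(match.group(1))  # a repeated group retains only its final iteration

# re.split inserts the text of every capturing group between the pieces.
parts = re.split(capturing, a)
print(parts)
```

    The regex really does match the entire date range; the stray 8 is the group's text being added to the list, not a character the match missed.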

    You can use a non-capturing group instead:

    >>> b = r"(?:[0-9]|\/|-){21}"
               ^^ note these two characters added
    >>> re.split(b, a)
    ['', ' Lecture Wednesday 01:30PM - 02:45PM, Room to be Announced']
    

    Or you could put all the choices in a single character class, and not need a group at all:

    >>> b = r"[-/0-9]{21}"
    >>> re.split(b, a)
    ['', ' Lecture Wednesday 01:30PM - 02:45PM, Room to be Announced']
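
    As a rough check that the two rewrites behave the same, the sketch below runs both on a hypothetical stand-in for a (the real string is not shown in full). The maxsplit=1 and strip() step is an optional extra, not part of the original answer, for when only the text after the date range is wanted.

```python
import re

# Hypothetical input, assumed for illustration only.
a = "08/27/2018-12/15/2018 Lecture Wednesday 01:30PM - 02:45PM, Room to be Announced"

non_capturing = r"(?:[0-9]|\/|-){21}"  # group without capture
char_class = r"[-/0-9]{21}"            # single character class, no group

result_group = re.split(non_capturing, a)
result_class = re.split(char_class, a)
print(result_group == result_class)

# The leading '' is there because the match starts at index 0.
# maxsplit=1 stops after the first match; [-1] takes what follows it.
rest = re.split(char_class, a, maxsplit=1)[-1].strip()
print(rest)
```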
    