Splitting on regex without removing delimiters

Backend · Unresolved · 5 answers · 735 views
花落未央 2020-12-11 20:15

So, I would like to split this text into sentences.

s = "You! Are you Tom? I am Danny."

so I get:

["You!", "Are you To


        
5 answers
  •  难免孤独
    2020-12-11 20:34

    If you prefer to use the split method rather than match, one solution is to split on a capturing group:

    import re
    splitted = list(filter(None, re.split(r'(.*?[.!?])', s)))
    

    filter(None, ...) removes any empty strings from the result. Note that each sentence after the first keeps its leading space.

    This works even if there are no spaces between sentences, or if you need to catch a trailing sentence that ends with a different punctuation mark, such as a Unicode ellipsis (or has no punctuation at all).
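A minimal sketch of that behaviour, assuming Python 3 (where filter returns an iterator, hence the list call). The helper name split_keep_delims and the sample strings are made up for illustration:

```python
import re

def split_keep_delims(text):
    # The capturing group makes re.split return each matched sentence
    # instead of discarding it; filter(None, ...) drops the empty strings
    # that re.split leaves between consecutive matches.
    return list(filter(None, re.split(r'(.*?[.!?])', text)))

# No spaces between sentences, and a trailing sentence without punctuation.
print(split_keep_delims("You!Are you Tom?I am Danny"))

# With spaces, the leading space of each later sentence is kept.
print(split_keep_delims("You! Are you Tom? I am Danny."))
```

The trailing text survives because re.split always returns whatever follows the last match as the final element.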

    It is even possible to keep your regex as is (after correcting the escaping and adding parentheses):

    splitted = list(filter(None, re.split(r'([.!?])', s)))
    

    Then merge each even-indexed element (the sentence body) with the odd-indexed element that follows it (the punctuation), and strip the extra spaces.
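The merge step could be sketched as follows, assuming Python 3. The function name split_sentences is hypothetical, and this version skips filter(None, ...) before merging so that bodies and punctuation stay aligned by index; empty results are dropped at the end instead:

```python
import re

def split_sentences(text):
    # Splitting on a capturing group keeps the delimiters in the result:
    # even indexes hold sentence bodies, odd indexes hold the punctuation.
    parts = re.split(r'([.!?])', text)
    bodies = parts[0::2]
    # Pad with one empty mark so trailing text without punctuation is kept.
    marks = parts[1::2] + ['']
    merged = (body + mark for body, mark in zip(bodies, marks))
    # Strip surrounding spaces and drop anything that ends up empty.
    return [m.strip() for m in merged if m.strip()]

print(split_sentences("You! Are you Tom? I am Danny."))
# ['You!', 'Are you Tom?', 'I am Danny.']
```

Padding the punctuation list works because re.split with one capturing group always returns one more body than delimiters.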

    Python split() without removing the delimiter
