Splitting on regex without removing delimiters

花落未央 2020-12-11 20:15

So, I would like to split this text into sentences.

s = "You! Are you Tom? I am Danny."

so I get:

["You!", "Are you To


        
5 answers
  •  青春惊慌失措
    2020-12-11 20:48

    In a regex engine that can split on zero-length matches (Python's re.split can do so from version 3.7 onward), you could achieve this by matching an empty string preceded by one of the delimiters:

    (?<=[.!?])
    

    Demo: https://regex101.com/r/ZLDXr1/1

    Unfortunately, Python versions before 3.7 do not split on zero-length matches, so this pattern only works with re.split on 3.7 or later. The solution may still be useful in other languages that support lookbehinds.
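For reference, `re.split` on Python 3.7 and later does accept patterns that can match the empty string, so the lookbehind can be tried directly. The following is a minimal sketch under that version assumption. The variable names are illustrative, and the strip-and-filter step is an addition, needed because the raw pieces keep their leading spaces and the match at the end of the string leaves an empty piece.

```python
import re

s = "You! Are you Tom? I am Danny."

# Zero-width split point right after ., ! or ? (requires Python 3.7+).
raw_parts = re.split(r"(?<=[.!?])", s)

# The pieces keep their leading spaces, and the match at the end of the
# string leaves a trailing empty piece, so strip and drop the empties.
sentences = [part.strip() for part in raw_parts if part.strip()]
print(sentences)
```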

    However, based on your input/output samples, what you actually need is to split on whitespace that is preceded by one of the delimiters. So the regex would be:

    (?<=[.!?])\s+
    

    Demo: https://regex101.com/r/ZLDXr1/2

    Python demo: https://ideone.com/z6nZi5
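The whitespace-based pattern can be sketched in Python as follows. This is an illustrative snippet, not a copy of the linked demo, and the helper name `split_sentences` is hypothetical. Because each match consumes at least one whitespace character, it does not depend on zero-length split support.

```python
import re

def split_sentences(text):
    """Split on whitespace that follows ., ! or ?, keeping the delimiter."""
    # The lookbehind checks for the delimiter without consuming it,
    # so only the whitespace run is removed by the split.
    return re.split(r"(?<=[.!?])\s+", text)

print(split_sentences("You! Are you Tom? I am Danny."))
```

If the delimiters are not followed by whitespace, this pattern finds no split points and returns the text as a single item.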

    If the spaces are optional, the re.findall solution suggested by @Psidom is the best one, I believe.
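The referenced answer is not reproduced here, so the following is only a sketch of one possible `re.findall` approach. Both the pattern `[^.!?]+[.!?]` and the helper name `find_sentences` are assumptions for illustration. It matches each run of non-delimiter characters together with the delimiter that ends it, so it works whether or not spaces follow the delimiters.

```python
import re

def find_sentences(text):
    """Collect each run of non-delimiter characters plus its closing ., ! or ?."""
    matches = re.findall(r"[^.!?]+[.!?]", text)
    # Matches after the first may start with the separating space; strip it.
    return [m.strip() for m in matches]

print(find_sentences("You! Are you Tom? I am Danny."))
print(find_sentences("You!Are you Tom?I am Danny."))
```

Note that with this pattern any trailing text that lacks a closing delimiter is not returned.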
