matching any character including newlines in a Python regex subexpression, not globally

难免孤独 2020-12-01 15:47

I want to use re.MULTILINE but NOT re.DOTALL, so that I can have a regex that includes both an "any character" wildcard and the normal . wild

3 answers
  •  不思量自难忘°
    2020-12-01 16:09

    To match a newline, or "any symbol" without re.S/re.DOTALL, you may use any of the following:

    [\s\S]
    [\w\W]
    [\d\D]
    

    The main idea is that a shorthand class and its negation, placed together inside one character class, match every character in the input string, newlines included.
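As a minimal sketch of that idea, the snippet below mixes a normal . (which stops at line ends) with [\s\S] (which does not) in one pattern compiled with re.MULTILINE only. The sample text and the "title:" / "end" markers are made up for illustration and are not from the question.

```python
import re

# Hypothetical sample input, used only for this illustration.
text = "title: Hello\nline one\nline two\nend\n"

# re.MULTILINE is set, re.DOTALL is not:
#   (.*)        uses the normal dot, so it stays on the first line
#   ([\s\S]*?)  matches any character, newlines included
pattern = re.compile(r"^title: (.*)$([\s\S]*?)^end$", re.MULTILINE)
match = pattern.search(text)

title = match.group(1)  # text captured by the normal dot
body = match.group(2)   # text captured by the any-character class

# The same pattern with a plain dot in the second group cannot
# cross a line boundary, so it finds nothing.
dot_only = re.compile(r"^title: (.*)$(.*?)^end$", re.MULTILINE)
dot_match = dot_only.search(text)

print(repr(title))
print(repr(body))
print(dot_match)
```

Only the second group spans several lines, so the "any character" behaviour stays local to that subexpression instead of applying to every dot in the pattern.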

    Compared with (.|\s) and other alternation-based variations, the character class solution is much more efficient because it involves far less backtracking when used with a * or + quantifier. In a small example, (?:.|\n)+ takes 45 steps to complete, while [\s\S]+ takes just 2 steps.
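A rough way to check this in Python is sketched below. It confirms that the two patterns match the same text and then times them with timeit. The sample text and repetition counts are arbitrary assumptions. The step counts quoted above come from a regex debugger, not from this script, and absolute timings will vary with the machine and Python version.

```python
import re
import timeit

# Arbitrary multi-line sample text, assumed for this comparison.
text = "first line\nsecond line\nthird line\n" * 50

alternation = re.compile(r"(?:.|\n)+")  # dot-or-newline alternation
char_class = re.compile(r"[\s\S]+")     # any-character class

alt_match = alternation.match(text)
cls_match = char_class.match(text)

# Both patterns should consume the whole string.
same_result = alt_match.group(0) == cls_match.group(0) == text

# Time repeated matching; only the relative difference is of interest.
alt_time = timeit.timeit(lambda: alternation.match(text), number=200)
cls_time = timeit.timeit(lambda: char_class.match(text), number=200)

print(f"same result: {same_result}")
print(f"alternation: {alt_time:.4f}s, character class: {cls_time:.4f}s")
```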
