Python regex, matching pattern over multiple lines: why isn't this working?

渐次进展 2020-11-28 15:19

I know that for parsing I should ideally remove all spaces and line breaks, but I was just doing this as a quick fix for something I was trying and I can't figure out why its

2 answers
  • 2020-11-28 15:44

    Try re.findall(r"####(.*?)\s(.*?)\s####", string, re.DOTALL) (works with re.compile too, of course).

    This regexp returns a list of tuples, each containing the section number and the section content.

    For your example, this will return [('1', 'ttteest'), ('2', ' \n\nttest')].

    (BTW: your example won't run as posted. For multiline strings, use ''' or """.)
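A minimal sketch of the suggestion above, in case it helps to see it run. The sample text here is made up for illustration (it is not the asker's original input), and the variable names are assumptions too. It compares the same pattern with and without re.DOTALL:

```python
import re

# Hypothetical sample input: two sections delimited by "####".
text = "####1\nfirst line\nsecond line\n####\n####2\nother\n####"

pattern = r"####(.*?)\s(.*?)\s####"

# With re.DOTALL, "." also matches newlines, so a section body
# may span several lines.
with_dotall = re.findall(pattern, text, re.DOTALL)
print(with_dotall)
# [('1', 'first line\nsecond line'), ('2', 'other')]

# Without re.DOTALL, "." stops at each newline, so only the
# single-line section can be matched.
without_dotall = re.findall(pattern, text)
print(without_dotall)
# [('2', 'other')]
```

The lazy quantifiers (.*?) keep each match as short as possible, which is what stops the first section from swallowing the second.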

  • 2020-11-28 15:50

    Multiline doesn't mean that . will match a newline. It means that ^ and $ also match at the start and end of each line, not only at the start and end of the whole string.

    re.M re.MULTILINE

    When specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character '$' matches at the end of the string and at the end of each line (immediately preceding each newline). By default, '^' matches only at the beginning of the string, and '$' only at the end of the string and immediately before the newline (if any) at the end of the string.

    re.S or re.DOTALL makes . match even new lines.

    Source

    http://docs.python.org/
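To make the difference between the two flags concrete, here is a small sketch. The sample string and variable names are invented for illustration; only the re functions and flags come from the standard library:

```python
import re

# Hypothetical three-line sample string.
text = "first\nsecond\nthird"

# Default: ^ anchors only at the start of the whole string.
default_anchor = re.findall(r"^\w+", text)
print(default_anchor)      # ['first']

# re.MULTILINE: ^ also anchors right after each newline.
multiline_anchor = re.findall(r"^\w+", text, re.MULTILINE)
print(multiline_anchor)    # ['first', 'second', 'third']

# re.MULTILINE does not change ".": it still cannot cross a newline.
multiline_dot = re.findall(r"first.*third", text, re.MULTILINE)
print(multiline_dot)       # []

# re.DOTALL: "." matches newlines too, so the pattern spans all lines.
dotall_dot = re.findall(r"first.*third", text, re.DOTALL)
print(dotall_dot)          # ['first\nsecond\nthird']
```

The two flags are independent and can be combined with | (for example re.MULTILINE | re.DOTALL) when a pattern needs both behaviours.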
