Python Regex can't find substring but it should

情深已故 2021-01-25 20:49

I am trying to parse HTML with BeautifulSoup to extract the webpage title. Sometimes this fails because the website's markup is malformed, for example a bad end tag. W

2 Answers
  •  死守一世寂寞
    2021-01-25 21:01

    You should use the dotall flag to make the . match newline characters as well.

    import re
    result = re.search(r'<title>(.+?)</title>', html, re.DOTALL)
    

    As the documentation says:

    ...without this flag, '.' will match anything except a newline
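
To see the effect of the flag, here is a minimal sketch that runs the same search with and without re.DOTALL. The sample markup and the variable names (pattern, without_flag, with_flag, title) are made up for illustration and are not taken from the question; the pattern assumes a plain <title> tag with no attributes.

```python
import re

# Hypothetical sample page: the title text spans several lines.
html = """<html><head><title>
Example page
</title></head><body></body></html>"""

# Assumed pattern: a plain <title> tag with no attributes.
pattern = r"<title>(.+?)</title>"

# Without re.DOTALL, '.' cannot match the newline right after <title>,
# so the search finds nothing.
without_flag = re.search(pattern, html)

# With re.DOTALL, '.' also matches newlines, so the multi-line title is captured.
with_flag = re.search(pattern, html, re.DOTALL)

# Strip the surrounding newlines that were captured along with the text.
title = with_flag.group(1).strip() if with_flag else None

print(without_flag)  # None
print(title)         # Example page
```

The non-greedy .+? matters more once the flag is on: with '.' able to cross line breaks, a greedy .+ would run to the last </title> in the document rather than the first.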
