Extracting contents from specific meta tags that are not closed using BeautifulSoup

孤街浪徒 2020-12-28 09:34

I'm trying to parse out content from specific meta tags. Here's the structure of the meta tags. The first two are closed with a slash, but the rest don't have any clo

6 answers
  •  刺人心 (original poster)
    2020-12-28 10:06

    As suggested by ingo, you could use a less strict parser such as html5lib.

    from bs4 import BeautifulSoup
    soup3 = BeautifulSoup(page3, 'html5lib')
    

    Make sure the html5lib parser is installed on the system first (for example, via the python-html5lib package).
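
As a rough illustration of the suggestion above, here is a minimal sketch of pulling the content attribute out of meta tags once the page has been parsed with html5lib. The sample markup and the meta_content helper are assumptions made up for this example; they are not taken from the question, and the real page will have different tag names.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (not from the question): the first meta tag is
# self-closed, the other two have no closing slash or closing tag.
page3 = """
<html><head>
<meta name="title" content="Example title" />
<meta name="description" content="Example description">
<meta name="keywords" content="alpha, beta">
</head><body></body></html>
"""

# html5lib parses the way a browser does, so <meta> is treated as a void
# element whether or not it is explicitly closed.
soup3 = BeautifulSoup(page3, "html5lib")


def meta_content(soup, name):
    """Hypothetical helper: return the content of <meta name=...>, or None."""
    tag = soup.find("meta", attrs={"name": name})
    return tag.get("content") if tag is not None else None


description = meta_content(soup3, "description")
keywords = meta_content(soup3, "keywords")
print(description)
print(keywords)
```

The lookup goes through attrs= because name is also the first positional parameter of find(), so passing name="description" directly would be read as a tag name rather than an attribute filter.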
