Using BeautifulSoup to grab all the HTML between two tags

情深已故 2020-12-25 12:57

I have some HTML that looks like this:

Title

//a random amount of p/uls or tagless text

Next Title

4 Answers
  •  庸人自扰
    2020-12-25 13:30

    I have the same problem. I'm not sure whether there is a better solution, but what I've done is use a regular expression to find the positions of the two nodes I'm looking for. Once I have those, I slice out the HTML between the two indices and create a new BeautifulSoup object from it.

    Example:

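    import re
    from bs4 import BeautifulSoup

    # Assumption: the titles are <h1> elements; the tag names did not survive in this post.
    m = re.search(r'<h1>Title</h1>.*?<h1>', html, re.DOTALL)
    s = m.start()
    e = m.end() - len('<h1>')  # stop just before the next opening tag
    target_html = html[s:e]
    new_bs = BeautifulSoup(target_html, 'html.parser')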
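
The regex approach above fails with an `AttributeError` when the pattern does not match, and it finds nothing when the title is the last one on the page. A slightly more defensive variant might look like the sketch below. The names `slice_between`, `tag`, and `title` are hypothetical, and the `h1` default is an assumption about the markup, not something the question confirms.

```python
import re

def slice_between(html, tag="h1", title="Title"):
    """Return the HTML from the titled heading up to the next opening `tag`.

    Returns None if the heading is not found. If no later heading exists,
    the slice runs to the end of the document.
    """
    # Match the opening tag (with optional attributes), the title text,
    # and the closing tag.
    start_re = re.compile(
        r"<{0}\b[^>]*>\s*{1}\s*</{0}>".format(re.escape(tag), re.escape(title))
    )
    m = start_re.search(html)
    if m is None:
        return None
    # Look for the next opening tag of the same kind after the matched heading.
    next_re = re.compile(r"<{0}\b".format(re.escape(tag)))
    nxt = next_re.search(html, m.end())
    end = nxt.start() if nxt else len(html)
    return html[m.start():end]

sample = (
    "<h1>Title</h1><p>one</p>loose text<ul><li>two</li></ul>"
    "<h1>Next Title</h1><p>three</p>"
)
fragment = slice_between(sample)
```

The returned string can then be parsed on its own, for example with `BeautifulSoup(fragment, "html.parser")`. Like any regex over HTML, this is only a rough tool: it matches tag names case-sensitively and will be confused by headings that appear inside comments or scripts.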
