Using BeautifulSoup to grab all the HTML between two tags

情深已故 2020-12-25 12:57

I have some HTML that looks like this:

<h1>Title</h1>

//a random amount of p/uls or tagless text

<h1>Next Title</h1>

4 answers
  •  孤独总比滥情好
    2020-12-25 13:31

    Interesting question. There is no way to select it using the DOM alone. You'll have to loop through all the elements up to and including the first h1 and put them into intro = str(intro), then get everything up to the second h1 into chapter1. Then remove the intro from chapter1 using

    chapter = chapter1.replace(intro, '')
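
The loop described above might look roughly like the sketch below. It is only an illustration under assumptions: the sample markup in `html` is made up, the titles are assumed to be top-level `h1` elements that are siblings of the content between them, and the names `intro_parts`, `chapter1_parts`, and `h1_seen` are hypothetical helpers, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Made-up sample markup; the real page structure is an assumption.
html = (
    "<p>Preface</p>"
    "<h1>Title</h1>"
    "<p>First paragraph</p>loose text<ul><li>item</li></ul>"
    "<h1>Next Title</h1>"
    "<p>Later</p>"
)

soup = BeautifulSoup(html, "html.parser")

intro_parts = []     # everything up to and including the first h1
chapter1_parts = []  # everything up to (not including) the second h1
h1_seen = 0

for node in soup.children:
    # Text nodes may not carry a tag name, so fall back to None.
    is_h1 = getattr(node, "name", None) == "h1"
    if is_h1 and h1_seen == 1:
        break  # reached the second h1, stop collecting
    chapter1_parts.append(str(node))
    if h1_seen == 0:
        intro_parts.append(str(node))
    if is_h1:
        h1_seen += 1

intro = "".join(intro_parts)
chapter1 = "".join(chapter1_parts)

# Strip the intro so that only the content between the two h1 tags remains.
chapter = chapter1.replace(intro, "")
print(chapter)
```

Note that `str.replace` removes every occurrence of `intro`, not just the leading one. Since `chapter1` always starts with `intro` in this sketch, slicing with `chapter1[len(intro):]` would be a slightly safer way to get the same result.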
    
