Using BeautifulSoup to grab all the HTML between two tags

情深已故 2020-12-25 12:57

I have some HTML that looks like this:

<h1>Title</h1>
//a random amount of p/uls or tagless text
<h1> Next Title</h1>

4 Answers

庸人自扰 (OP)

2020-12-25 13:30
I have the same problem. I'm not sure whether there is a better solution, but what I've done is use a regular expression to find the start and end offsets of the two nodes I'm looking for. Once I have those, I slice out the HTML between the two offsets and create a new BeautifulSoup object from it.

Example:
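```
import re
from bs4 import BeautifulSoup

# Assumption: both titles are <h1> elements and `html` holds the page source as a string.
m = re.search(r'<h1>Title</h1>.*?<h1>', html, re.DOTALL)
s = m.start()
e = m.end() - len('<h1>')  # stop just before the next title's opening tag
target_html = html[s:e]
new_bs = BeautifulSoup(target_html, 'html.parser')
```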
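For anyone who wants to try this without a real page, here is a minimal self-contained sketch of the same regex-and-slice idea. The helper name `html_between`, its `tag` parameter (defaulting to `h1`, which is an assumption about the markup), and the `sample` string are all made up for illustration. It only handles headings with no attributes, and it returns `None` when the title or a following heading is not found.

```python
import re

def html_between(html, title, tag="h1"):
    """Return the HTML from the `title` heading up to (not including) the next `tag`."""
    open_tag = f"<{tag}>"  # assumption: headings carry no attributes
    # Match the starting heading literally, then lazily consume everything
    # up to the next opening tag of the same kind.
    pattern = re.escape(f"{open_tag}{title}</{tag}>") + r".*?" + re.escape(open_tag)
    m = re.search(pattern, html, re.DOTALL)
    if m is None:
        return None  # title not found, or no heading follows it
    # Drop the trailing opening tag that the pattern had to match.
    return html[m.start():m.end() - len(open_tag)]

# Hypothetical sample input mixing <p>, <ul>, and tagless text.
sample = (
    "<h1>Title</h1><p>one</p>loose text<ul><li>two</li></ul>"
    "<h1> Next Title</h1><p>three</p>"
)
fragment = html_between(sample, "Title")
print(fragment)
```

As in the snippet above, the slice starts at the first title itself, so the heading is part of the result. The returned string can then be handed to `BeautifulSoup(fragment, 'html.parser')` if you need a parsed tree.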