Remove all JavaScript tags and style tags from HTML with Python and the lxml module


南笙 2020-12-23 12:11

I am parsing an HTML document using the http://lxml.de/ library. So far I have figured out how to strip tags from an HTML document: In lxml, how do I remove a tag but retain

4 answers

夕颜 (OP)

2020-12-23 12:40
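You can also use the bs4 library for this purpose.
```
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_src, "lxml")  # html_src: the HTML string to clean
for tag in soup.find_all(['script', 'style']):
    tag.extract()  # detach each <script>/<style> element from the tree
```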
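Since the question asks about lxml itself, the same cleanup can probably be done without bs4. Here is a minimal sketch, assuming the input is an HTML string and using `lxml.etree.strip_elements`; the helper name `strip_script_and_style` and the `sample` document are made up for illustration and are not from the original post.

```python
from lxml import etree, html


def strip_script_and_style(html_src: str) -> str:
    """Return html_src with every <script> and <style> element removed."""
    root = html.fromstring(html_src)
    # with_tail=False keeps the text that follows each removed element
    etree.strip_elements(root, "script", "style", with_tail=False)
    return html.tostring(root, encoding="unicode")


# Hypothetical sample document, for illustration only
sample = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><p>Hello</p><script>alert(1)</script> world</body></html>"
)
cleaned = strip_script_and_style(sample)
print(cleaned)
```

Unlike `strip_tags`, which removes only the tag and keeps its contents, `strip_elements` drops the whole element including the script or CSS text inside it, which is usually what you want here.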