Python high memory usage with BeautifulSoup


后悔当初 2020-12-19 03:24

I am trying to process several web pages with BeautifulSoup4 in Python 2.7.3, but memory usage keeps growing after every parse.

This simplified code produces th

4 answers

一整个雨季 (original poster)

2020-12-19 04:18

Try Beautiful Soup's decompose() method, which destroys the tree, once you have finished working with each file.

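from bs4 import BeautifulSoup

def parse():
    # Read the file and close it promptly, even if parsing fails.
    with open("index.html", "r") as f:
        page = BeautifulSoup(f.read(), "lxml")
    try:
        pass  # page extraction goes here
    finally:
        # Destroy the tree so its memory can be reclaimed.
        page.decompose()

while True:
    parse()
    raw_input()  # Python 2; use input() on Python 3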
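
If several functions parse pages, one way to make sure decompose() is never forgotten is to wrap the parse in a context manager. The following is only a sketch: the names parsed and extract_title are hypothetical helpers, not part of Beautiful Soup, and the built-in "html.parser" is assumed here in place of "lxml" so that no extra dependency is needed.

```python
from contextlib import contextmanager

from bs4 import BeautifulSoup


@contextmanager
def parsed(markup, parser="html.parser"):
    """Yield a parsed tree and destroy it on exit (hypothetical helper)."""
    soup = BeautifulSoup(markup, parser)
    try:
        yield soup
    finally:
        # Runs even if the extraction code raises or returns early.
        soup.decompose()


def extract_title(markup):
    """Return the page title as plain text, or None if there is none."""
    with parsed(markup) as soup:
        tag = soup.find("title")
        # get_text() builds a new string, so the result does not
        # point back into the tree that is about to be destroyed.
        return tag.get_text() if tag is not None else None
```

Anything you want to keep should be copied out as plain strings or numbers before the with block ends, because the tree is no longer usable once it has been decomposed.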
