Python high memory usage with BeautifulSoup

Backend · Unresolved · 4 answers · 1169 views
Asked by 后悔当初 on 2020-12-19 03:24

I was trying to process several web pages with BeautifulSoup4 in Python 2.7.3, but after every parse the memory usage goes up and up.

This simplified code produces th

4 Answers
  •  心在旅途
    2020-12-19 03:59

    Garbage collection is probably viable, but a context manager seems to handle it pretty well for me, without any extra memory usage:

    from bs4 import BeautifulSoup as soup

    def parse():
        # The context manager closes the file, and the parsed tree is
        # bound only to a local, so it becomes collectable as soon as
        # parse() returns.
        with open('testque.xml') as fh:
            page = soup(fh.read())
    

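    For comparison, the garbage-collection route mentioned above could look something like the sketch below. The use of `decompose()` (which unlinks the tree's internal references) followed by an explicit `gc.collect()` is my own reading of that suggestion, not part of the original answer, and the inline markup is just a stand-in for the real file:

```python
import gc
from bs4 import BeautifulSoup

def parse(markup):
    # Parse, pull out what we need, then explicitly break the tree's
    # internal reference cycles so memory can be reclaimed promptly.
    tree = BeautifulSoup(markup, "html.parser")
    title = tree.title.string if tree.title else None
    tree.decompose()  # detaches and clears every node in the tree
    return title

print(parse("<html><head><title>hi</title></head></html>"))
gc.collect()  # optional: force a collection pass right away
```

    Whether the explicit `gc.collect()` is worth it depends on how tight the memory ceiling is; `decompose()` alone usually makes the tree collectable.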
    Also, though not entirely necessary: if you're using raw_input to let it loop while you test, I find this idiom quite useful:

    while not raw_input():
      parse()
    

    It'll keep looping every time you hit Enter, and stop as soon as you enter any non-empty string.
