lxml parser eats all memory

有刺的猬 2020-12-16 00:14

I'm writing a spider in Python, using the lxml library to parse HTML and the gevent library for async. I found that after it has been running for some time, the lxml parser starts eating up memory.

3 answers
  •  抹茶落季
    2020-12-16 00:59

    The issue seems to stem from libxml2, the C library that lxml relies on. Here is the first report: http://codespeak.net/pipermail/lxml-dev/2010-December/005784.html This bug isn't mentioned in either the lxml v2.3 bug-fix log or the libxml2 changelog.

    There is also a follow-up discussion here: https://bugs.launchpad.net/lxml/+bug/728924

    I tried to reproduce the issue but saw nothing abnormal. Anyone who can reproduce it could help clarify the problem.
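
For anyone trying to reproduce the growth described above, here is a minimal sketch of a measurement harness. It is not taken from the linked reports. The helper names (`peak_rss`, `measure_rss_growth`, `parse_with_lxml`) are made up for illustration, and it assumes a Unix-like system, since it uses the standard `resource` module. It reads peak RSS from the OS because memory allocated inside libxml2 is C-level and would not show up in Python-level tools such as `tracemalloc`.

```python
import gc
import resource


def peak_rss():
    """Peak resident set size of this process.

    Units are platform-dependent: kilobytes on Linux, bytes on macOS.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure_rss_growth(parse, payload, iterations=1000, sample_every=100):
    """Call parse(payload) repeatedly and sample peak RSS at intervals.

    Returns a list of (iteration, peak_rss) tuples.
    """
    samples = []
    for i in range(1, iterations + 1):
        parse(payload)  # result is discarded immediately so it can be freed
        if i % sample_every == 0:
            gc.collect()  # rule out Python garbage that is merely uncollected
            samples.append((i, peak_rss()))
    return samples


def parse_with_lxml(html):
    """Parse one HTML document with lxml (imported lazily; requires lxml)."""
    import lxml.html
    return lxml.html.fromstring(html)
```

With lxml installed, `measure_rss_growth(parse_with_lxml, page)` can be run on a saved copy of a page the spider fetched (`page` is a placeholder here). If the peak keeps climbing across samples even though every parsed tree is dropped, that points to a leak; if it levels off after the first few samples, it more likely reflects normal allocator behaviour. This sketch leaves gevent out, so a flat result does not rule out a problem that only appears under concurrent use.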
