Question
I have an Ubuntu machine running Python 2.7.6. When I try to use lxml, which was installed with pip, I get the following error:
Traceback (most recent call last):
  File "./export.py", line 44, in fetch_item
    root.append(elem)
  File "lxml.etree.pyx", line 742, in lxml.etree._Element.append (src/lxml/lxml.etree.c:44339)
  File "apihelpers.pxi", line 24, in lxml.etree._assertValidNode (src/lxml/lxml.etree.c:14127)
AssertionError: invalid Element proxy at 140443984439416
What does this mean, and how should I go about fixing this?
Answer 1:
I had the same issue in a multiprocessing context. It can be illustrated by the following snippet:
from multiprocessing import Pool

import lxml.html


def process(html):
    # Runs in the worker process: the element is valid here.
    tree = lxml.html.fromstring(html)
    body = tree.find('.//body')
    print(body)
    return body  # returning the element itself triggers the problem


def main():
    pool = Pool()
    result = pool.apply(process, ('<html><body/></html>',))
    print(type(result))
    print(result)  # fails in the calling process


if __name__ == '__main__':
    main()
Running it produces the following output:
<Element body at 0x7f9f690461d8>
<class 'lxml.html.HtmlElement'>
Traceback (most recent call last):
  File "test.py", line 18, in <module>
    main()
  File "test.py", line 14, in main
    print(result)
  File "src/lxml/lxml.etree.pyx", line 1142, in lxml.etree._Element.__repr__ (src/lxml/lxml.etree.c:54748)
  File "src/lxml/lxml.etree.pyx", line 992, in lxml.etree._Element.tag.__get__ (src/lxml/lxml.etree.c:53182)
  File "src/lxml/apihelpers.pxi", line 19, in lxml.etree._assertValidNode (src/lxml/lxml.etree.c:16856)
AssertionError: invalid Element proxy at 139697870845496
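The most obvious explanation, given that __repr__ works in the worker process and the return value does reach the calling process, is a deserialisation issue: the return value is pickled to cross the process boundary, and the element rebuilt in the calling process is apparently no longer backed by a valid underlying node. It can be solved, for example, by returning lxml.html.tostring(body), or any other pickle-able object.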
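A minimal sketch of that workaround, assuming the calling process is happy to re-parse the markup: the worker returns serialized bytes instead of the element, and the caller rebuilds an element from them. The helper name rebuild and the sample HTML are my own placeholders, not part of the original snippet.

```python
from multiprocessing import Pool

import lxml.html


def process(html):
    # Worker side: parse, locate the element, and return it as bytes.
    # Bytes pickle cleanly, unlike the lxml element itself.
    tree = lxml.html.fromstring(html)
    body = tree.find('.//body')
    return lxml.html.tostring(body)


def rebuild(serialized):
    # Caller side (hypothetical helper): re-parse the bytes into a fresh
    # element that is owned by the current process.
    doc = lxml.html.document_fromstring(serialized)
    return doc.find('.//body')


def main():
    pool = Pool(processes=1)
    try:
        serialized = pool.apply(
            process, ('<html><body><p>hello</p></body></html>',))
    finally:
        pool.close()
        pool.join()
    body = rebuild(serialized)
    print(type(body))
    print(body.text_content())


if __name__ == '__main__':
    main()
```

Re-parsing costs a little extra work in the caller. If only a few values are needed (text, attributes), returning those as plain strings or dicts from the worker avoids the second parse altogether.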
Source: https://stackoverflow.com/questions/29570715/how-to-fix-lxml-assertion-error