Python 2.7 on Google App Engine, cannot use lxml.etree

Anonymous (unverified), submitted on 2019-12-03 03:08:02

Question:

I've been trying to use html5lib with lxml on Python 2.7 in Google App Engine. But when I run the following code, it fails with "NameError: global name 'etree' is not defined". Is it not possible to use lxml.etree on Google App Engine, or am I missing something?

app.yaml

application: testsite
version: 1
runtime: python27
api_version: 1
threadsafe: false

handlers:
- url: /.*
  script: index.py

libraries:
- name: lxml
  version: "2.3"  # I thought this would allow me to use lxml.etree

index.py

from google.appengine.ext import webapp  # needed for webapp.WSGIApplication below
from testhandler import TestHandler

application = webapp.WSGIApplication([('/', TestHandler)], debug=True)

testhandler.py

import urllib2
import html5lib
from html5lib import treebuilders
try:
    from lxml import etree
    print("running with lxml.etree")
except ImportError:
    try:
        # Python 2.5
        import xml.etree.cElementTree as etree
        print("running with cElementTree on Python 2.5+")
    except ImportError:
        try:
            # Python 2.5
            import xml.etree.ElementTree as etree
            print("running with ElementTree on Python 2.5+")
        except ImportError:
            try:
                # normal cElementTree install
                import cElementTree as etree
                print("running with cElementTree")
            except ImportError:
                try:
                    # normal ElementTree install
                    import elementtree.ElementTree as etree
                    print("running with ElementTree")
                except ImportError:
                    print("Failed to import ElementTree from any known place")

from google.appengine.ext import webapp

class TestHandler(webapp.RequestHandler):
    def get(self):
        f = urllib2.urlopen("http://www.yahoo.com/").read()
        doc = html5lib.parse(f, treebuilder='lxml')
        elems = doc.xpath("//*[local-name() = 'a']")
        self.response.out.write(len(elems))

error

running with cElementTree on Python 2.5+
Status: 500 Internal Server Error
Content-Type: text/html; charset=utf-8
Cache-Control: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Content-Length: 769

<pre>Traceback (most recent call last):
  File "/usr/local/bin/google_appengine/google/appengine/ext/webapp/_webapp25.py", line 701, in __call__
    handler.get(*groups)
  File "/home/test/testhandler.py", line 38, in get
    parser = html5lib.HTMLParser(tree= treebuilders.getTreeBuilder('lxml'))
  File "/home/test/html5lib/html5parser.py", line 68, in __init__
    self.tree = tree(namespaceHTMLElements)
  File "/home/test/html5lib/treebuilders/etree_lxml.py", line 176, in __init__
    builder = etree_builders.getETreeModule(etree, fullTree=fullTree)
NameError: global name 'etree' is not defined
</pre>

Update

Nah, I tried several ways to create a doc object, but no luck. In one attempt I added the import "from lxml.html import document_fromstring", and that gives me this error.

Traceback (most recent call last):
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 4143, in _HandleRequest
    self._Dispatch(dispatcher, self.rfile, outfile, env_dict)
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 4049, in _Dispatch
    base_env_dict=env_dict)
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 616, in Dispatch
    base_env_dict=base_env_dict)
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 3120, in Dispatch
    self._module_dict)
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 3024, in ExecuteCGI
    reset_modules = exec_script(handler_path, cgi_path, hook)
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 2887, in ExecuteOrImportScript
    exec module_code in script_module.__dict__
  File "/home/yoo/eclipse_workspace/website_checker/src/index.py", line 5, in <module>
    from handlers.updatecheck import UpdateCheckHandler
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 1538, in Decorate
    return func(self, *args, **kwargs)
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 2503, in load_module
    return self.FindAndLoadModule(submodule, fullname, search_path)
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 1538, in Decorate
    return func(self, *args, **kwargs)
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 2375, in FindAndLoadModule
    description)
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 1538, in Decorate
    return func(self, *args, **kwargs)
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 2318, in LoadModuleRestricted
    description)
  File "/home/test/updatecheck.py", line 4, in <module>
    from lxml.html import document_fromstring
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 1538, in Decorate
    return func(self, *args, **kwargs)
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 2503, in load_module
    return self.FindAndLoadModule(submodule, fullname, search_path)
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 1538, in Decorate
    return func(self, *args, **kwargs)
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 2375, in FindAndLoadModule
    description)
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 1538, in Decorate
    return func(self, *args, **kwargs)
  File "/usr/local/bin/google_appengine/google/appengine/tools/dev_appserver.py", line 2318, in LoadModuleRestricted
    description)
  File "/usr/lib/python2.7/dist-packages/lxml/html/__init__.py", line 12, in <module>
    from lxml import etree
ImportError: cannot import name etree

According to the error, it seems App Engine doesn't allow me to load the etree module for some reason. I wanted to use XPath with lxml, but I can't spend much time figuring out what is going on here, and I don't know Python well enough either. So I'll try to find a way with the 'simpletree' version instead.

f = urllib2.urlopen("http://www.yahoo.com/").read()
p = html5lib.HTMLParser()
doc = p.parse(f)
# do something with doc.childNodes
self.response.out.write(len(doc.childNodes))

Not really a good solution, but at least it worked when I tested it on live Google App Engine.

Answer 1:

Have you installed lxml locally? I had the same error before - import failed. You can download lxml here: http://pypi.python.org/pypi/lxml/

lxml works with GAE, which is great. But there is a real lack of documentation or examples about that right now.
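To check whether lxml is actually importable in your local Python before involving App Engine at all, something like the sketch below may help. The helper name check_import is made up for this example; it only uses importlib.import_module from the standard library and reports the ImportError message instead of hiding it.

```python
import importlib


def check_import(module_name):
    """Try to import module_name and return (ok, detail) without raising."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # Keep the original message: it usually says which part is missing.
        return False, "%s: %s" % (module_name, exc)
    version = getattr(module, "__version__", "version not reported")
    return True, "%s: %s" % (module_name, version)


if __name__ == "__main__":
    # lxml.etree is a compiled extension, so it can fail even when the
    # pure-Python 'lxml' package directory itself is found.
    for name in ("lxml", "lxml.etree"):
        ok, detail = check_import(name)
        print(("OK      " if ok else "MISSING ") + detail)
```

If "lxml" is reported as OK but "lxml.etree" is not, the package is present but its compiled part could not be loaded, which would match the "cannot import name etree" message in the question.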



Answer 2:

I had this problem on Windows, and it is because the python27 distribution does not include lxml. You can use the easy_install script, but then you have to compile lxml from source, which gave me trouble.

I solved it using this post I found on the Google forums:

https://groups.google.com/forum/?fromgroups=#!topic/comp.lang.python/Q8YeOIbn5Ds

However, if you want to save yourself the pain of trying to get it to build from source, just install a precompiled binary, for instance the one available from: http://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml

Simply download the executable from the above web site and run the *.exe, and it installs all the necessary code.



Answer 3:

Try

import lxml

at the top of your testhandler
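A bare import at the top has the advantage that a failure shows up immediately as an ImportError. The first line of the error output in the question ("running with cElementTree on Python 2.5+") already shows that "from lxml import etree" failed in testhandler.py. A NameError like the one in the traceback is what you get when a module swallows a failed import and uses the name later. I can't confirm that this is what html5lib's lxml tree builder does in the version used above, so treat that as an assumption. The sketch below reproduces the pattern with a made-up module name, hypothetical_missing_backend, standing in for lxml.etree.

```python
# 'hypothetical_missing_backend' is a made-up name that stands in for a
# compiled module (such as lxml.etree) that cannot be loaded in this runtime.
try:
    import hypothetical_missing_backend as etree
except ImportError:
    pass  # failure is swallowed here, so the global 'etree' is never bound


def parse_unchecked():
    # The global 'etree' is looked up at call time, long after the import failed.
    return etree.fromstring("<a/>")


def describe_failure():
    """Call parse_unchecked() and report which exception the caller sees."""
    try:
        parse_unchecked()
    except NameError as exc:
        return "NameError: %s" % exc
    return "no error"


if __name__ == "__main__":
    print(describe_failure())
```

The caller sees a NameError about 'etree' rather than the original ImportError, so the real cause (the module not loading) is hidden until you try the import directly.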



Answer 4:

Install with pip: pip install lxml


