How to re-install lxml?

Posted by 狂风中的少年 on 2019-11-28 09:06:33
osa

I am using BeautifulSoup 4.3.2 and OS X 10.6.8. I also have a problem with improperly installed lxml. Here are some things that I found out:

First of all, check this related question: "Removed MacPorts, now Python is broken".

Now, in order to check which builders for BeautifulSoup 4 are installed, try

>>> import bs4
>>> bs4.builder.builder_registry.builders

If you don't see your favorite builder in that list, then it is not installed, and BeautifulSoup will raise an error such as "Couldn't find a tree builder..." when you ask for it.
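A script version of the same check can be handy when the interactive session is not available. The following is a minimal sketch, assuming Python 3; the helper name registered_builders is made up for illustration, and it reads the features attribute that bs4 tree builder classes normally carry.

```python
def registered_builders():
    """Return (class name, features) pairs; [] if bs4 cannot be imported."""
    try:
        from bs4.builder import builder_registry
    except ImportError:
        return []
    return [
        (builder.__name__, list(getattr(builder, "features", [])))
        for builder in builder_registry.builders
    ]


# Builders are listed in priority order, highest priority first.
for name, features in registered_builders():
    print(name, features)
```

If no class with "lxml" in its features shows up, bs4 could not load its lxml builder.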

Also, just because you can import lxml doesn't mean that everything is fine: the top-level package may import while its compiled lxml.etree module does not.

Try

>>> import lxml
>>> import lxml.etree
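When both imports succeed but something still seems off, it can help to see which copy of lxml is actually being loaded (for example one left over from MacPorts). This is a rough sketch, assuming Python 3; describe_lxml is a hypothetical helper name, while LXML_VERSION and LIBXML_VERSION are version tuples exposed by lxml.etree.

```python
import importlib


def describe_lxml():
    """Report where lxml and lxml.etree load from, or why they fail."""
    report = {}
    for name in ("lxml", "lxml.etree"):
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # a broken C extension may not raise plain ImportError
            report[name] = "FAILED: %s: %s" % (type(exc).__name__, exc)
            continue
        report[name] = str(getattr(module, "__file__", "(no __file__)"))
        # Version tuples are only present on lxml.etree, hence the hasattr check.
        for attr in ("LXML_VERSION", "LIBXML_VERSION"):
            if hasattr(module, attr):
                report[attr] = ".".join(str(part) for part in getattr(module, attr))
    return report


for key, value in describe_lxml().items():
    print("%s: %s" % (key, value))
```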

To understand what's going on, go to the bs4 installation and, if it is packed, unpack it (tar -xvzf for a source tarball; an .egg file is a zip archive, so use unzip for that). Look at the bs4.builder package. Inside it you should see files such as _lxml.py and _html5lib.py. So you can also try

>>> import bs4.builder._htmlparser
>>> import bs4.builder._lxml
>>> import bs4.builder._html5lib

If there is a problem, you will see why a particular module cannot be loaded. Notice how, at the end of builder/__init__.py, bs4 loads all those modules and silently ignores whatever fails to import:

# Builders are registered in reverse order of priority, so that custom
# builder registrations will take precedence. In general, we want lxml
# to take precedence over html5lib, because it's faster. And we only
# want to use HTMLParser as a last result.
from . import _htmlparser
register_treebuilders_from(_htmlparser)
try:
    from . import _html5lib
    register_treebuilders_from(_html5lib)
except ImportError:
    # They don't have html5lib installed.
    pass
try:
    from . import _lxml
    register_treebuilders_from(_lxml)
except ImportError:
    # They don't have lxml installed.
    pass
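
Because that loader swallows every ImportError, the reason a builder is missing never reaches you. Here is a small sketch of the same import-and-catch pattern that keeps the traceback instead of discarding it. It assumes Python 3, and import_errors is a hypothetical helper name, not part of bs4.

```python
import importlib
import traceback


def import_errors(module_names):
    """Map each module name to None if it imports, else to the traceback text."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception:
            results[name] = traceback.format_exc()
    return results


builder_modules = [
    "bs4.builder._htmlparser",
    "bs4.builder._html5lib",
    "bs4.builder._lxml",
]
for name, error in import_errors(builder_modules).items():
    print(name, "OK" if error is None else "FAILED")
    if error is not None:
        print(error)
```

The last line of each printed traceback usually names the module or shared library that is actually missing.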

If you are using Python 2.7 on Ubuntu/Debian, this worked for me:

$ sudo apt-get build-dep python-lxml
$ sudo pip install lxml 

Test it like:

mona@pascal:~/computer_vision/image_retrieval$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import lxml
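
A bare import only shows that the package can be found. A slightly stronger test is to parse something with BeautifulSoup and the lxml builder. This is a hedged sketch, assuming Python 3; lxml_parser_works is a made-up name, and the ValueError branch relies on bs4 raising FeatureNotFound (a ValueError subclass) or, in older releases, a plain ValueError when the builder is unavailable.

```python
def lxml_parser_works(markup="<p>hello</p>"):
    """Return True if BeautifulSoup can parse markup with the lxml builder."""
    try:
        from bs4 import BeautifulSoup
        soup = BeautifulSoup(markup, "lxml")
    except ImportError:
        return False  # bs4 itself is not installed
    except ValueError:
        return False  # bs4 could not find a tree builder for "lxml"
    return soup.p is not None and soup.p.get_text() == "hello"


print("lxml builder usable:", lxml_parser_works())
```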

FWIW, I ran into a similar problem (Python 3.6, OS X 10.12.6) and was able to solve it simply by reinstalling lxml (the first command only shows that I was working in a conda environment):

$ source activate ml-general
$ pip uninstall lxml
$ pip install lxml

I tried more complicated things first, because BeautifulSoup was working correctly with an identical command through Jupyter+iPython, but not through PyCharm's terminal in the same virtualenv. Simply reinstalling lxml as above solved the problem.
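
When the same code behaves differently in two front ends, one thing worth checking is whether both really run the same interpreter and pick up the same copy of lxml. The sketch below assumes Python 3.4 or later (for importlib.util.find_spec); environment_report is a hypothetical helper. Run it in each environment and compare the output.

```python
import importlib.util
import sys


def environment_report(package="lxml"):
    """Describe the running interpreter and where `package` would load from."""
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        spec = None
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # None means the package is not visible to this interpreter.
        "package_origin": spec.origin if spec is not None else None,
    }


for key, value in environment_report().items():
    print("%s: %s" % (key, value))
```

If the two reports differ, running pip as `python -m pip` with the interpreter shown under "executable" makes sure the reinstall targets that environment.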

On Debian/Ubuntu, install it with apt-get:

$ sudo apt-get install python3-lxml

For Mac OS X, a MacPorts port of lxml is available. Try something like:

$ sudo port install py27-lxml

http://lxml.de/installation.html may be helpful.
