lxml | 易学教程

Move an entire element in with lxml.etree

阅读更多关于 Move an entire element in with lxml.etree

问题 Within lxml, is it possible, given an element, to move the entire thing elsewhere in the xml document without having to read all of it's children and recreate it? My best example would be changing parents. I've rummaged around the docs a bit but haven't had much luck. Thanks in advance! 回答1: .append , .insert and other operations do that by default >>> from lxml import etree >>> tree = etree.XML('<a><b><c/></b><d><e><f/></e></d></a>') >>> node_b = tree.xpath('/a/b')[0] >>> node_d = tree.xpath

LXML: Cannot import etree

阅读更多关于 LXML: Cannot import etree

问题 I went to this page and downloaded the tar file : http://pypi.python.org/pypi/lxml/2.3.4#downloads I then copied the lxml folder to my Python26/Lib folder. Now, when i go to the interpreter and type from lxml import etree i get the error: cannot import etree . Does someone know what is going wrong? I am running windows. 回答1: Simply unzipping the archive and moving it is not how one should install Python packages. You usually run python setup.py install from the folder; but on Windows there

What is the deal about https when using lxml?

阅读更多关于 What is the deal about https when using lxml?

问题 I am using lxml to parse html files given urls. For example: link = 'https://abc.com/def' htmltree = lxml.html.parse(link) My code is working well for most of the cases, the ones with http:// . However, I found for every https:// url, lxml simply gets an IOError . Does anyone know the reason? And possibly, how to correct this problem? BTW, I want to stick to lxml than switch to BeautifulSoup given I've already got a quick finished programme. 回答1: I don't know what's happening, but I get the

Python: Injecting HTML content into a tag using `lxml.html`

阅读更多关于 Python: Injecting HTML content into a tag using `lxml.html`

问题 I'm using the lxml.html library to parse an HTML document. I located a specific tag, that I call content_tag , and I want to change its content (i.e. the text between <div> and </div> ,) and the new content is a string with some html in it, say it's 'Hello <b>world!</b>' . How do I do that? I tried content_tag.text = 'Hello <b>world!</b>' but then it escapes all the html tags, replacing < with < etc. I want to inject the text without escaping any HTML. How can I do that? 回答1: This is one way:

Pip is already installed: but I am getting no module named lxml

阅读更多关于 Pip is already installed: but I am getting no module named lxml

问题 I have installed lxml with pip. But when I run a script that uses lxml, I get "no module named lxml." Why might this be? How do I fix it? (venv)prompt$ sudo pip install lxml Requirement already satisfied (use --upgrade to upgrade): lxml in /usr/local/lib/python2.7/dist-packages Cleaning up... (venv)prompt$ python scripts/pyftp.py Traceback (most recent call last): File "scripts/pyftp.py", line 5, in <module> from lxml import etree ImportError: No module named lxml 回答1: It looks like you are

How to iterate over GraphML file with lxml

阅读更多关于 How to iterate over GraphML file with lxml

问题 I have the following GraphML file 'mygraph.gml' that I want to parse with a simple python script: This represents a simple graph with 2 nodes "node0", "node1" and an edge between them <?xml version="1.0" encoding="UTF-8"?> <graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd"> <key id="name" for="node" attr.name="name" attr

Can I parse xpath using python, selenium and lxml?

阅读更多关于 Can I parse xpath using python, selenium and lxml?

问题 So I have been trying to figure our how to use BeautifulSoup and did a quick search and found lxml can parse the xpath of an html page. I would LOVE if I could do that but the tutorial isnt that intuitive. I know how to use Firebug to grab the xpath and was curious if anyone has use lxml and can explain how I can use it to parse specific xpath's, and print them.. say 5 per line..or if it's even possible?! Selenium is using Chrome and loads the page properly, just need help moving forward.

upper case html tags encoded in lxml

阅读更多关于 upper case html tags encoded in lxml

问题 I am parsing an html file using lxml.html....The html file contains tags with small case letters and also large case letters. A part of my code is shown below: response = urllib2.urlopen(link) html = response.read().decode('cp1251') content_html = etree.HTML(html_1) first_link_xpath = content_html.xpath('//TR') print (first_link_xpath) A small part of my HTML file is shown below: <TR> <TR vAlign="top" align="left"> <!--<TD><B onmouseover="tips.Display('Metadata_WEB', event)" onmouseout="tips

python setuptool how can I add dependency for libxml2-dev and libxslt1-dev?

阅读更多关于 python setuptool how can I add dependency for libxml2-dev and libxslt1-dev?

问题 My application needs lxml >= 2.1, but to install lxml its requied to install libxml2-dev libxslt1-dev else it raises error while installing the lxml, is there a way that using python setup tool I can give this as dependency in my setup.py.... 回答1: Not really ... setuptools only handle dependencies on package wich belongs already to pypi. So if you want these kind of dependencies, i think that you have to select the packaging technology brought by your favorite distribution. But, you can

How to use a variable in LXML XPath Expression

阅读更多关于 How to use a variable in LXML XPath Expression

问题 I'm using Python 3.3 in eclipse with PyDev plugin on Windows 7. I need to parse an XML file using XPath and LXML. If I use a static XPath expression it works but I need to use a variable one but when I use a variable in the expression it doesn't work. If I use this code: xml = etree.parse(fullpath).getroot() tree = etree.ElementTree(xml) nsmap = {'xis' : 'http://www.xchanging.com/ACORD4ALLEDI/1', 'ns' : 'http://www.ACORD.org/standards/Jv-Ins-Reinsurance/1' } p = tree.xpath('//xis:Line',