lxml | 易学教程

Use a single site package (as exception) for a virtualenv


Question: In a virtualenv, how can I ignore the no-site-packages rule for a single package? Some background: I use virtualenv for my deployments, but these have taken a lot longer since I started using lxml. Compiling it takes up to 15 minutes each time I reinstall for a new virtualenv. Can I make some sort of exception for lxml and use the global site package? Is there any safer or more reliable option than just copying it into the new virtualenv? Answer 1: Short answer: no, but you can do something else to

how to make python request.get wait a few seconds?


Question: I wanted to get some experience with HTML crawling, so I wanted to see if I could grab some values from the following site: http://www.iex.nl/Aandeel-Koers/11890/Royal-Imtech/koers.aspx This site shows the price of Imtech shares. If you take a look at the site, you see there is one number shown in bold; this is the price of the share. As you may have seen, this price changes, and that's okay. I only want the value at the time I run my script. But if you reload the page,

Parse paragraphs from HTML using lxml


Question: I am new to lxml and want to extract <p>PARAGRAPHS</p> and <li>PARAGRAPHS</li> from a given URL and use them for further steps. I followed an example from a post and tried the following code with no luck:

    html = lxml.html('http://www.google.com/intl/en/about/corporate/index.html')
    url = 'http://www.google.com/intl/en/about/corporate/index.html'
    print html.parse.xpath('//p/text()')

I tried to look into the examples in lxml.html, but didn't find any example using a URL. Could you give me any

Python 3.4 : How to do xml validation


Question: I'm trying to do XML validation against an XSD in Python. I was successful using the lxml package, but the problem started when I tried to port my code to Python 3.4. I tried to install lxml for version 3.4. It looks like my enterprise Linux doesn't play very well with lxml. pip installation:

    pip install lxml
    Collecting lxml
    Downloading lxml-3.4.4.tar.gz (3.5MB)
    100% |################################| 3.5MB 92kB/s
    Installing collected packages: lxml
    Running setup.py install for lxml
    Successfully

lxml build on Solaris 10


Question: Can you please help and advise on a problem with the Python 2.6.6 and lxml build on Solaris 10? Installation instructions: www.sunfreeware.com/download.html Direct link to the file: http://www.sunfreeware.com/ftp/pub/freeware/sparc/10/lxml-2.2.8-sol10-sparc-local.gz

    [rainier]/usr/apps/openet/bmsystest/relAuto/RAP_SW> python
    Python 2.6.6 (r266:84292, Oct 12 2010, 15:25:47) [C] on sunos5
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import lxml
    >>> from lxml import etree

gcc Internal error on lxml installation CentOS


Question: I am having some trouble installing lxml on CentOS 6. I have tried the solutions to some similar questions, such as "pip install lxml error" and "Setup.py: install lxml with Python2.6 on CentOS", but these did not work. How do I install it correctly? After issuing pip install lxml, the log looks like this:

    Downloading/unpacking lxml
    Running setup.py egg_info for package lxml
    /usr/lib64/python2.6/distutils/dist.py:266: UserWarning: Unknown distribution option: 'bugtrack_url'
    warnings.warn(msg)
    Building

Extract value from element when second namespace is used in lxml


Question: I am able to extract values from elements (using lxml in Python 2.7) when one namespace is used. However, I can't figure out how to extract values when a second namespace is used. I want to extract the value within //cc-cpl:MainClosedCaption/Id, but I keep getting lxml.etree.XPathEvalError: Invalid expression errors. To be specific, the value I'm trying to extract from my sample XML is urn:uuid:6ca58b51-9116-4131-8652-feaed20dca0d Here's a snippet of the XML (from a Digital Cinema Package): <
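
The excerpt above is cut off before its XML, so the following is only a minimal sketch of one common approach to querying across two namespaces, not the answer from the original thread. The sample document, both namespace URIs, and the `cpl`/`cc` prefixes are hypothetical stand-ins; only the element names and the UUID come from the question. It assumes `Id` lives in the document's default namespace, in which case it needs a prefix in the query too. The import falls back to the standard library's ElementTree, which shares the `find()` API used here.

```python
try:
    from lxml import etree
except ImportError:  # fallback: the stdlib module offers the same find() API
    import xml.etree.ElementTree as etree

# Hypothetical sample: both namespace URIs are placeholders, not the real
# Digital Cinema Package namespaces.
SAMPLE = """
<CompositionPlaylist xmlns="http://example.com/ns/cpl"
                     xmlns:cc-cpl="http://example.com/ns/cc-cpl">
  <cc-cpl:MainClosedCaption>
    <Id>urn:uuid:6ca58b51-9116-4131-8652-feaed20dca0d</Id>
  </cc-cpl:MainClosedCaption>
</CompositionPlaylist>
""".strip()

# Prefixes in this map are local to the query; they only need to point at
# the same URIs the document uses.
NSMAP = {
    "cpl": "http://example.com/ns/cpl",    # the document's default namespace
    "cc": "http://example.com/ns/cc-cpl",  # the second namespace
}

root = etree.fromstring(SAMPLE)

# <Id> inherits the default namespace, so it is qualified as well.
caption_id = root.find(".//cc:MainClosedCaption/cpl:Id", namespaces=NSMAP)
print(caption_id.text)
```

Every element in the path carries a prefix that is registered in the map passed to the query; an unregistered prefix or an unqualified child of a namespaced document is a frequent reason such lookups fail or return nothing.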

How would you give Chrome's version of a webpage to python?


Question: I'm trying to make it easy for users to input numbers from a web page. The easiest thing I can imagine would be for them to provide a URL and an XPath associated with that number. My code could then go grab the numbers. The concept of an XPath isn't well known (to non-coders), but it's trivial to find an XPath using Chrome's Inspect and Developer tools. So that's great. The problem is that XPaths from Chrome and Firefox won't always get you a working XPath for use in an HTML parser as

Getting XML attribute value with lxml module


Question: How can I get the value of an attribute in an XML file with the lxml module? My XML looks like this:

    <process>
      <name>somename</name>
      <statistics>
        <stats param='someparam'>
          <value>0.456</value>
          <real_value>0.4</real_value>
        </stats>
        <stats ...> . . . </stats>
      </statistics>
    </process>

I want to get the value 0.456 from the value attribute. I'm iterating through the attributes and getting the text, but I'm not sure that this is the best way of doing this: for attribute in root.iter('statistics'): for stats in
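
In the XML quoted above, `value` is a child element of `stats` while `param` is an attribute, so the two are read differently. The following is a minimal sketch under that reading, not the answer from the original thread: the sample is a well-formed copy of the question's XML with the elided second `stats` entry left out, and the `readings` list is a name introduced here for illustration.

```python
try:
    from lxml import etree
except ImportError:  # fallback: the stdlib module offers the same calls
    import xml.etree.ElementTree as etree

# Well-formed copy of the question's sample (the elided entries are omitted).
SAMPLE = """
<process>
  <name>somename</name>
  <statistics>
    <stats param="someparam">
      <value>0.456</value>
      <real_value>0.4</real_value>
    </stats>
  </statistics>
</process>
""".strip()

root = etree.fromstring(SAMPLE)

readings = []  # hypothetical container: (param attribute, value element text)
for stats in root.iter("stats"):
    param = stats.get("param")       # attribute lookup
    value = stats.findtext("value")  # text of the <value> child element
    readings.append((param, float(value)))

print(readings)
```

Iterating directly over `stats` avoids the nested loop through `statistics`, and `findtext()` returns the child's text in one call.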

Python lxml - using the xml:lang attribute to retrieve an element


Question: I have some XML which has multiple elements with the same name, but each is in a different language, for example: <Title xml:lang="FR" type="main">Les Tudors</Title> <Title xml:lang="DE" type="main">Die Tudors</Title> <Title xml:lang="IT" type="main">The Tudors</Title> Normally, I'd retrieve an element using its attributes as follows: titlex = info.find('.//xmlns:Title[@someattribute=attributevalue]', namespaces=nsmap) If I try to do this with [@xml:lang="FR"] (for example), I get the
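
The excerpt stops before the error message, so the following is only a sketch of one way to select by `xml:lang` that sidesteps the prefix inside the path expression, not the answer from the original thread. Parsers expose `xml:lang` under the reserved XML namespace `http://www.w3.org/XML/1998/namespace`, so the attribute can be compared directly. The `Titles` wrapper element is hypothetical, and the sample assumes the `Title` elements are not in a document namespace, unlike the question's real file.

```python
try:
    from lxml import etree
except ImportError:  # fallback: the stdlib module offers the same calls
    import xml.etree.ElementTree as etree

# The xml: prefix is always bound to this reserved namespace URI.
XML_NS = "http://www.w3.org/XML/1998/namespace"
LANG_ATTR = "{%s}lang" % XML_NS

# Hypothetical wrapper element around the question's three titles.
SAMPLE = """
<Titles>
  <Title xml:lang="FR" type="main">Les Tudors</Title>
  <Title xml:lang="DE" type="main">Die Tudors</Title>
  <Title xml:lang="IT" type="main">The Tudors</Title>
</Titles>
""".strip()

root = etree.fromstring(SAMPLE)

# Filter in Python on the fully qualified attribute name.
french_titles = [t for t in root.iter("Title") if t.get(LANG_ATTR) == "FR"]
print(french_titles[0].text)
```

If the real elements are namespaced, the tag passed to `iter()` would need the same `{uri}Title` form.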