lxml

How can I set up lxml and pypy on Yosemite?

左心房为你撑大大i submitted on 2019-12-20 05:19:06
Question: I wanted to do some learning with lxml and pypy, so I decided to get it set up on my Yosemite Mac. But after three days of trying, I still haven't been able to try lxml, because I can't get my setup right. Here's what I've done:
Did a clean install of homebrew and xcode-select --install:
proix:~ user$ brew --version
0.9.5
proix:~ user$ gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 6.0 (clang

XML pretty print fails in Python lxml

一个人想着一个人 submitted on 2019-12-20 03:42:09
Question: I am trying to read, modify, and write an XML file with lxml 4.1.1 in Python 2.7.6. My code:
import lxml.etree as et
fn_xml_in = 'in.xml'
parser = et.XMLParser(remove_blank_text=True)
xml_doc = et.parse(fn_xml_in, parser)
xml_doc.getroot().find('b').append(et.Element('c'))
xml_doc.write('out.xml', method='html', pretty_print=True)
The input file in.xml looks like this: <a> <b/> </a>
And the produced output file out.xml: <a> <b><c></c></b> </a>
Or when I set remove_blank_text=True: <a><b><c>

lxml.etree insert elements into element.text

。_饼干妹妹 submitted on 2019-12-20 03:38:04
Question: I have strings that have empty xml elements in them, like this:
>>> s = """fizz buzz <pb n="44"/> bananas"""
These strings have been assigned to xml elements using the etree.SubElement method:
>>> from lxml import etree as et
>>> root = et.Element('root')
>>> txt = et.SubElement(root, 'text')
>>> txt.text = s
>>> et.dump(root)
<root>
  <text>fizz buzz &lt;pb n="44"/&gt; bananas</text>
</root>
Fiddling about with re.split() and etree's text and tail I can insert a subelement <pb n="44"/> where I want
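The text/tail approach mentioned in the question can be sketched roughly as follows. This is a minimal illustration, not the asker's code: it assumes the string contains exactly one `<pb n="..."/>` marker, and the regular expression and variable names (`before`, `after`, `pb`) are made up for the example. The fallback import is also an assumption, added only so the sketch runs where lxml is not installed.

```python
import re

try:
    from lxml import etree as et
except ImportError:
    # Assumption: the stdlib ElementTree API is close enough for this sketch.
    import xml.etree.ElementTree as et

s = 'fizz buzz <pb n="44"/> bananas'

root = et.Element('root')
txt = et.SubElement(root, 'text')

# Split once around the marker; the capture group keeps the value of n.
before, n, after = re.split(r'<pb n="([^"]*)"/>', s, maxsplit=1)

txt.text = before                    # text before the child lives in parent.text
pb = et.SubElement(txt, 'pb', n=n)   # a real <pb> element instead of escaped text
pb.tail = after                      # text after the child lives in child.tail

print(et.tostring(root).decode())
```

Strings with several markers would need a loop that sets the tail of each new element in turn; that case is not covered here.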

lxml unicode entity parse problems

我是研究僧i submitted on 2019-12-20 03:33:11
Question: I'm using lxml as follows to parse an exported XML file from another system:
xmldoc = open(filename)
etree.parse(xmldoc)
But I'm getting:
lxml.etree.XMLSyntaxError: Entity 'eacute' not defined, line 4495, column 46
Obviously it's having problems with unicode entity names, but how would I get round this? Via open() or parse()?
Edit: I had forgotten to include my DTD in the same folder. It's there now and has the following declaration: <!ENTITY eacute "é"> and is referred to (and always was)

lxml error “IOError: Error reading file” when parsing facebook mobile in a python scraper script

坚强是说给别人听的谎言 submitted on 2019-12-20 02:45:23
Question: I use a modified script from the post "Logging into facebook with python":
#!/usr/bin/python2 -u
# -*- coding: utf8 -*-
facebook_email = "YOUR_MAIL@DOMAIN.TLD"
facebook_passwd = "YOUR_PASSWORD"
import cookielib, urllib2, urllib, time, sys
from lxml import etree
jar = cookielib.CookieJar()
cookie = urllib2.HTTPCookieProcessor(jar)
opener = urllib2.build_opener(cookie)
headers = { "User-Agent" : "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_0 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko)

python lxml findall with multiple namespaces

偶尔善良 submitted on 2019-12-20 02:37:08
Question: I'm trying to parse an XML document with multiple namespaces with lxml, and I'm stuck on getting the findall() method to return something. My XML:
<MeasurementRecords xmlns="http://www.company.com/common/rsp/2012/07" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.company.com/common/rsp/2012/07 RSP_EWS_V1.6.xsd">
  <HistoryRecords>
    <ValueItemId>100_0000100004_3788_Resource-0.customId_WSx Data Precip Type</ValueItemId>
    <List>
      <HistoryRecord>
        <Value>60</Value>
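A common reason findall() returns nothing on a document like this is that every element sits in the default namespace, so an unprefixed path does not match. A minimal sketch, assuming a cut-down version of the document above (the shortened XML, the `rsp` prefix, and the variable names are illustrative choices, and the fallback import is only there so the example runs without lxml):

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: stdlib ElementTree accepts the same findall() arguments.
    import xml.etree.ElementTree as etree

# Hypothetical, shortened document using the same default namespace.
xml = b'''<MeasurementRecords xmlns="http://www.company.com/common/rsp/2012/07">
  <HistoryRecords>
    <List>
      <HistoryRecord>
        <Value>60</Value>
      </HistoryRecord>
    </List>
  </HistoryRecords>
</MeasurementRecords>'''

root = etree.fromstring(xml)

# Map a prefix of our own choosing to the document's default namespace URI.
ns = {'rsp': 'http://www.company.com/common/rsp/2012/07'}

unprefixed = root.findall('.//HistoryRecord')               # no namespace given
records = root.findall('.//rsp:HistoryRecord', namespaces=ns)
values = [r.findtext('rsp:Value', namespaces=ns) for r in records]

print(len(unprefixed), values)
```

The prefix in the mapping does not have to match anything in the file; only the namespace URI matters.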

lxml.etree._Element.append() from a loop not working as expected

坚强是说给别人听的谎言 submitted on 2019-12-20 02:24:26
Question: I would like to know why in this code append() seems to work from inside the loop, but the resulting xml displays the modification from only the last iteration, while remove() works as expected. This is an overly simplified example; I'm working with big chunks of data, and need to append the same subtree to many different parents.
from lxml import etree
xml = etree.fromstring('<tree><fruit id="1"></fruit><fruit id="2"></fruit></tree>')
sub = etree.fromstring('<apple/>')
for i, item in
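The loop body is cut off above, so the following is only a sketch under the assumption that it appends `sub` to each `fruit` element. In lxml an element has a single parent, so appending the same element again moves it, which would leave it only under the last parent. Appending a copy per parent avoids that. The fallback import is an assumption added so the example runs without lxml, and `counts` is a name introduced for the check at the end.

```python
import copy

try:
    from lxml import etree
except ImportError:
    # Assumption: stdlib fallback, where copying is equally valid.
    import xml.etree.ElementTree as etree

xml = etree.fromstring('<tree><fruit id="1"></fruit><fruit id="2"></fruit></tree>')
sub = etree.fromstring('<apple/>')

for item in xml.findall('fruit'):
    # Each parent receives its own independent copy of the subtree.
    item.append(copy.deepcopy(sub))

# Count the <apple> children under every <fruit>.
counts = [len(item.findall('apple')) for item in xml.findall('fruit')]
print(counts)
```

For large subtrees the copies cost memory, which may matter with the big chunks of data mentioned in the question.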

Scrapy: Unable to create a project

坚强是说给别人听的谎言 submitted on 2019-12-20 01:59:45
Question: I had issues installing scrapy with respect to lxml, but then I found some information on stackoverflow. Based on that information I did a sudo easy_install lxml. Despite some errors, I think scrapy got installed. The reason I came to that judgement is that in the REPL I could do the following:
Python 2.7.5 (default, Jul 28 2013, 07:27:04)
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from scrapy import *
>>>
But when I try

from scrapy.selector import selector error

拟墨画扇 submitted on 2019-12-20 01:10:16
Question: I am unable to do the following:
from scrapy.selector import Selector
The error is:
File "/Desktop/KSL/KSL/spiders/spider.py", line 1, in <module>
from scrapy.selector import Selector
ImportError: cannot import name Selector
It is as if lxml is not installed on my machine, but it is. Also, I thought this was a default module built into scrapy. Maybe not? Thoughts?
Answer 1: Try importing HtmlXPathSelector instead.
from scrapy.selector import HtmlXPathSelector
And then use the .select() method to parse out