lxml | 易学教程

Python: Modifying an XML File

I'm stuck. I've written code that looks for a specific index in an XML file, but when it finds that index it doesn't create a new XML file containing just that index and the constant parameters. Instead it returns an error:

    ...rba_u_xml.py", line 29, in <module>
        ObjectDictionary.remove(Variable)
      File "C:\Python27\lib\xml\etree\ElementTree.py", line 337, in remove
        self._children.remove(element)
    ValueError: list.remove(x): x not in list

This is my code:

    import xml.etree.ElementTree as ET
    tree = ET.parse('master.xml')
    root = tree.getroot()
    s = input('Insert a number of index and add quotes(") befor and after: ')
    i = int
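
One common cause of this ValueError is calling remove() on an element that is not the direct parent of the node being removed. Whether that is the cause here cannot be confirmed, because the script is cut off. The sketch below assumes it is, and uses a hypothetical stand-in document (the real layout of master.xml is not shown) with Variable elements nested one level down.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for master.xml; the real file's structure is not shown.
SAMPLE = """
<ObjectDictionary>
  <Group>
    <Variable index="1"/>
    <Variable index="2"/>
  </Group>
</ObjectDictionary>
"""

root = ET.fromstring(SAMPLE)
wanted = "2"  # the index to keep (assumed to be stored in an "index" attribute)

# Element.remove() only accepts direct children, so record each node's parent.
parent_of = {child: parent for parent in root.iter() for child in parent}

# Copy the matches into a list first so the tree is not mutated while iterating.
for variable in list(root.iter("Variable")):
    if variable.get("index") != wanted:
        parent_of[variable].remove(variable)

remaining = [v.get("index") for v in root.iter("Variable")]
print(remaining)
```

After pruning, the tree can be written out with ET.ElementTree(root).write(...) under whatever file name is needed.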

Writing a custom XML file for the Wordpress Importer using lxml

Okay, so here is my current situation: my knowledge of XML and lxml isn't very good yet, since I have rarely used XML files until now. So please tell me if something in my approach is really stupid. ;-) I want to feed my WordPress installation a custom XML file using the WordPress Importer. The default format can be seen here: XML File. Now, there are some tags that look like this: <wp:author>. I am not one hundred percent sure, but as far as I learned today, the wp: part of the tag is the namespace. When I tried to use lxml to create those tags, I did this: author = etree.Element("wp:author") This
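
Prefixed tags such as wp:author are normally created from the namespace URI rather than from the literal prefix. The following is a minimal sketch using the standard library's ElementTree as a stand-in; lxml.etree accepts the same "{uri}tag" notation and additionally takes an nsmap argument for choosing the prefix. The namespace URI and the author_login child are assumptions for illustration, so check the xmlns:wp value in your own export file.

```python
import xml.etree.ElementTree as ET

# Assumed URI for the "wp" prefix; verify it against the xmlns:wp declaration
# in the export file you are imitating.
WP_NS = "http://wordpress.org/export/1.2/"

# Ask the serializer to write this URI with the "wp" prefix.
ET.register_namespace("wp", WP_NS)

channel = ET.Element("channel")
# "{uri}localname" is how a namespaced tag is spelled in the ElementTree API.
author = ET.SubElement(channel, "{%s}author" % WP_NS)
login = ET.SubElement(author, "{%s}author_login" % WP_NS)  # illustrative child
login.text = "admin"

xml_text = ET.tostring(channel, encoding="unicode")
print(xml_text)
```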

lxml not getting installed on AWS Elasticbeanstalk instance

Question: I used the lxml module in my code to parse AWS responses. Locally it works fine, but when I deploy to an AWS Elastic Beanstalk instance, it throws errors related to lxml. I tried these solutions: I added lxml to requirements.txt, and it failed. I logged in to the AWS instance and tried to install it directly, and it failed. I put the line below in .ebextensions/02_python.config: 09_lxml: command: "wget http://lxml.de/files/lxml-3.3.4.tgz && tar -xzvf lxml-3.3.4.tgz && cd lxml-3.3.4 && /opt/python/run/venv/bin

Python lxml: insert text at given position relatively to subelements

Question: I'd like to build the following XML element (in order to customize figure number formatting):

    <figcaption> <span class="fignum">Figura 1.2</span> - Description of figure. </figcaption>

but I don't know how to specify the position of the text. In fact, if I create the subelement before setting the text,

    import lxml.etree as et
    fc = et.Element("figcaption")
    fn = et.SubElement(fc, "span", {'class':'fignum'})
    fn.text = "Figure 1.2"
    fc.text = " - Description of figure."

I get an undesired result (text is
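
In the ElementTree data model, text that follows a child element's closing tag is stored on that child's tail attribute, not on the parent's text. A minimal sketch of that idea follows, using the standard library module as a stand-in; lxml.etree exposes the same text and tail attributes. This is one way to get the layout asked for, not necessarily the answer given in the original thread.

```python
import xml.etree.ElementTree as et

fc = et.Element("figcaption")
fn = et.SubElement(fc, "span", {"class": "fignum"})
fn.text = "Figura 1.2"

# .text would place the string before <span>; .tail places it right after </span>.
fn.tail = " - Description of figure."

result = et.tostring(fc, encoding="unicode")
print(result)
```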

Installing OpenERP 7.0 with pip on CentOS 6.4

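I ran into a few installation problems while tinkering with OpenERP, so I am recording them here. Enter the virtual environment with mkvirtualenv openerp, download the tar package from the official site, and install it with pip install -e XXX. A few problems come up, though:

1. The default package index often drops the connection, so I switched to a mirror in China; both Douban and v2ex work. Create the file pip.conf under ~/.pip/ (if it does not exist yet) and put the following in it:
   [global]
   index-url = http://pypi.v2ex.com/simple
2. lxml keeps failing to compile. It needs libxslt-devel and libxml2-devel, but installing libxslt-devel pulls in libxml2-devel automatically as a dependency:
   yum install libxslt-devel
   pip install lxml
3. python-ldap fails to compile. After some digging, it turns out to be based on openldap, not ldap:
   sudo yum install python-devel
   sudo yum install openldap-devel
   pip install python-ldap
4. Create the PostgreSQL database:
   (OPENERP)[quanpower@Y400 .pip]$ su postgres
   Password:
   (OPENERP)[postgres@Y400 .pip]$ psql
   could not change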

Preserving XML attribute order?

Question: I know this question has been asked in the past, but those threads are all a few years old. I am wondering whether any changes have been made to Python modules such as lxml, minidom, or etree that allow attribute order in XML files to be preserved without patching. I need the order preserved because the program I am supplying the files to relies on it. If there are no updates, what's the easiest way to implement this? Answer 1: The insignificance of attribute ordering is not a
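
As a quick way to check the behaviour on a given interpreter, the sketch below sets attributes in a deliberately non-alphabetical order and serializes the element. It assumes Python 3.8 or later, where the standard library's xml.etree.ElementTree writes attributes in insertion order instead of sorting them; older versions will print them sorted. The attribute names are made up for illustration.

```python
import xml.etree.ElementTree as ET

item = ET.Element("item")
# Insert attributes in an order that differs from alphabetical order.
item.set("zeta", "1")
item.set("alpha", "2")
item.set("mid", "3")

# Assumes Python 3.8+, where serialization keeps the insertion order.
serialized = ET.tostring(item, encoding="unicode")
print(serialized)
```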

lipo: can't figure out the architecture type of: /var/folders/

Question: I tried installing lxml on Mac OS X Snow Leopard and keep getting the error: lipo: can't figure out the architecture type of: /var/folders/ I installed Xcode with 10.4 SDK support and changed gcc from 4.2 to 4.0.1. Any clues? Python 2.6.1 with Leopard 1.6.7..

    running install
    running bdist_egg
    running egg_info
    writing src/lxml.egg-info/PKG-INFO
    writing top-level names to src/lxml.egg-info/top_level.txt
    writing dependency_links to src/lxml.egg-info/dependency_links.txt
    reading manifest file 'src

python alexa result parsing with lxml.etree

I am using the Alexa API from AWS, but I find it difficult to parse the result to get what I want. The Alexa API returns an object tree of type <type 'lxml.etree._ElementTree'>. I use this code to print the tree:

    from lxml import etree
    root = tree.getroot()
    print etree.tostring(root)

I get the XML below:

<aws:UrlInfoResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/"><aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11"><aws:OperationRequest><aws:RequestId>ccf3f263-ab76-ab63-db99-244666044e85</aws:RequestId></aws:OperationRequest><aws:UrlInfoResult><aws:Alexa> <aws:ContentData> <aws:DataUrl type=
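
Note that the response binds the aws prefix to two different URIs: one on the outer element and another from aws:Response inward. Lookups therefore have to use the URI that is in scope for the element being searched for. The sketch below works on a shortened copy of the response, limited to the part quoted in full, and uses the standard library's ElementTree as a stand-in; lxml's find() takes the same namespaces mapping. The "awis" key is a local alias chosen for this example, not something the API defines.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the quoted response, limited to the part shown in full.
SAMPLE = (
    '<aws:UrlInfoResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">'
    '<aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">'
    '<aws:OperationRequest>'
    '<aws:RequestId>ccf3f263-ab76-ab63-db99-244666044e85</aws:RequestId>'
    '</aws:OperationRequest>'
    '</aws:Response>'
    '</aws:UrlInfoResponse>'
)

root = ET.fromstring(SAMPLE)

# The alias on the left is arbitrary; only the URI has to match the document.
NS = {"awis": "http://awis.amazonaws.com/doc/2005-07-11"}

request_id = root.find(".//awis:RequestId", NS)
print(request_id.text)
```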

Regular Expressions to parse template tags in XML

Question: I need to parse some XML to pull out embedded template tags for further parsing. I can't seem to bend Python's regular expressions to do what I want, though. In English: when a template tag is contained anywhere in the row, remove all the XML for that specific row and leave only the template tag in its place. I put together a test case to demonstrate. Here's the original XML: <w:tbl> <w:tr> <w:tc><w:t>Header 1</w:t></w:tc> <w:tc><w:t>Header 2</w:t></w:tc> <w:tc><w:t

Is there a way to recover iterparse on invalid Char values?

I'm using lxml's iterparse to parse some big XML files (3-5 GB). Since some of these files contain invalid characters, an lxml.etree.XMLSyntaxError is thrown. When using lxml.etree.parse I can provide a parser that recovers from invalid characters: parser = lxml.etree.XMLParser(recover=True) root = lxml.etree.parse(open("myMalformed.xml"), parser) Is there a way to get the same functionality for iterparse? Edit: Encoding is not an issue here. There are invalid characters in these XML files which can be sanitized by defining an XMLParser with recover=True. Since I need to use iterparse for this, I can't