lxml

Python: Modifying an XML File

﹥>﹥吖頭↗ submitted on 2019-12-02 10:11:45
I'm stuck. I've written code that looks for a specific index in an XML file, but when it finds that index, it won't create a new XML file containing just that index and the constant parameters. It returns an error: ...rba_u_xml.py", line 29, in <module> ObjectDictionary.remove(Variable) File "C:\Python27\lib\xml\etree\ElementTree.py", line 337, in remove self._children.remove(element) ValueError: list.remove(x): x not in list This is my code: import xml.etree.ElementTree as ET tree = ET.parse('master.xml') root = tree.getroot() s = input('Insert a number of index and add quotes(") befor and after: ') i = int
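The full script is cut off above, so the following is only a guess at the cause. Element.remove() removes direct children only, so calling it on an ancestor of the target raises this kind of ValueError. The sketch below uses a hypothetical document (the Device and Group elements and the index attribute are invented for illustration) and removes each unwanted element through its real parent.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: Device, Group and the index attribute are
# invented for illustration and are not taken from master.xml.
SAMPLE = """
<Device>
  <ObjectDictionary>
    <Group>
      <Variable index="1000"/>
      <Variable index="1001"/>
    </Group>
  </ObjectDictionary>
</Device>
"""

root = ET.fromstring(SAMPLE)
object_dictionary = root.find("ObjectDictionary")
wanted = "1001"

# object_dictionary.remove(variable) would raise ValueError here, because
# each Variable is a child of Group, not a direct child of ObjectDictionary.
# Instead, visit every element and remove unwanted children from it.
for parent in list(object_dictionary.iter()):
    # Copy the child list so that removing while looping is safe.
    for child in list(parent):
        if child.tag == "Variable" and child.get("index") != wanted:
            parent.remove(child)

remaining = [v.get("index") for v in root.iter("Variable")]
print(remaining)
```

After the loop, the tree can be written out with ET.ElementTree(root).write(...) as usual.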

Writing a custom XML file for the Wordpress Importer using lxml

吃可爱长大的小学妹 submitted on 2019-12-02 09:53:55
Okay, so here is my current situation: my knowledge of XML and lxml isn't very good yet, since I have rarely used XML files until now. So please tell me if something in my approach is really stupid. ;-) I want to feed my WordPress installation a custom XML file using the WordPress Importer. The default format can be seen here: XML File. Some of the tags look like this: <wp:author>. I am not a hundred percent sure, but as far as I learned today, the wp: part of the tag is the namespace. When I tried to use lxml to create those tags, I did this: author = etree.Element("wp:author") This
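lxml generally rejects a colon inside a plain tag name. The usual approach is to pass the full namespace URI in Clark notation ("{uri}tag") and map the prefix with nsmap. The sketch below shows this. The namespace URI is an assumption and should be copied from the xmlns:wp declaration of a real export file, and the author_login child is only an illustrative element.

```python
from lxml import etree

# Assumed namespace URI for the "wp" prefix; copy the real value from the
# xmlns:wp attribute of your own WordPress export file.
WP_NS = "http://wordpress.org/export/1.2/"
NSMAP = {"wp": WP_NS}

# Declaring the prefix on the root lets all descendants reuse it.
channel = etree.Element("channel", nsmap=NSMAP)

# Clark notation: "{namespace-uri}local-name".
author = etree.SubElement(channel, "{%s}author" % WP_NS)
login = etree.SubElement(author, "{%s}author_login" % WP_NS)
login.text = "admin"

xml = etree.tostring(channel, pretty_print=True).decode()
print(xml)
```

The serialized output uses the wp: prefix because nsmap associates it with the URI; the prefix itself never appears in the tag name passed to lxml.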

lxml not getting installed on AWS Elasticbeanstalk instance

拥有回忆 submitted on 2019-12-02 09:48:16
Question: I used the lxml module in my code to parse an AWS response. Locally it works great, but when I deploy to an AWS Elastic Beanstalk instance, it throws errors about lxml. I tried these solutions: I added lxml to requirements.txt, and it failed. I logged in to the AWS instance and tried to install it directly, and it failed. I put the line below in .ebextensions/02_python.config: 09_lxml: command: "wget http://lxml.de/files/lxml-3.3.4.tgz && tar -xzvf lxml-3.3.4.tgz && cd lxml-3.3.4 && /opt/python/run/venv/bin

Python lxml: insert text at given position relatively to subelements

假装没事ソ submitted on 2019-12-02 07:30:01
Question: I'd like to build the following XML element (in order to customize figure number formatting): <figcaption> <span class="fignum">Figura 1.2</span> - Description of figure. </figcaption> but I don't know how to specify the position of the text. In fact, if I create the subelement before creating the text, import lxml.etree as et fc = et.Element("figcaption") fn = et.SubElement(fc, "span", {'class':'fignum'}) fn.text = "Figure 1.2" fc.text = " - Description of figure." I get an undesired result (text is
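In lxml (and in the standard ElementTree API), text that follows a child element belongs to that child's .tail attribute, not to the parent's .text. A minimal sketch that reuses the names from the question:

```python
import lxml.etree as et

fc = et.Element("figcaption")
fn = et.SubElement(fc, "span", {"class": "fignum"})
fn.text = "Figure 1.2"

# .tail holds the text that comes right after the closing </span> tag;
# fc.text would instead be placed before the <span> element.
fn.tail = " - Description of figure."

result = et.tostring(fc, encoding="unicode")
print(result)
# <figcaption><span class="fignum">Figure 1.2</span> - Description of figure.</figcaption>
```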

Installing OpenERP 7.0 with pip on CentOS 6.4

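有些话、适合烂在心里 submitted on 2019-12-02 07:26:50
While tinkering with OpenERP I ran into a few installation problems, so I'm noting them down here. Enter the virtual environment with mkvirtualenv openerp, download the tarball from the official site, and install it with pip install -e XXX. A few problems come up along the way, though:
1. The package index connection often drops, so I switched to a mirror in China; both Douban and v2ex work. Create the file pip.conf under ~/.pip/ (if it does not exist yet) and fill in the following: [global] index-url = http://pypi.v2ex.com/simple
2. lxml keeps failing to compile. It needs libxslt-devel and libxml2-devel, but installing libxslt-devel automatically pulls in libxml2-devel as a dependency: yum install libxslt-devel pip install lxml
3. python-ldap fails to compile. After looking it up, it turns out that it is built on openldap, not ldap: sudo yum install python-devel sudo yum install openldap-devel pip install python-ldap
4. Create the PostgreSQL database: (OPENERP)[quanpower@Y400 .pip]$ su postgres Password: (OPENERP)[postgres@Y400 .pip]$ psql could not change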

Preserving XML attribute order?

本秂侑毒 submitted on 2019-12-02 07:21:03
Question: I know this question has been asked before, but those posts all date back a few years. I am wondering whether any changes have been made to Python modules such as lxml, minidom, or etree that allow us to preserve the attribute order in XML files without patching. I need the order to be preserved because the program I am supplying the files to relies on it. If there are no updates, what's the easiest way to implement this? Answer 1: The insignificance of attribute ordering is not a
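As a point of reference, and assuming Python 3.8 or newer, the standard library's xml.etree.ElementTree serializes attributes in the order they were parsed or set rather than sorting them. The element and attribute names below are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Attributes deliberately written in non-alphabetical order.
source = '<item zeta="1" alpha="2" mid="3" />'
element = ET.fromstring(source)

# A newly set attribute is appended after the existing ones.
element.set("beta", "4")

output = ET.tostring(element, encoding="unicode")
print(output)
```

On older interpreters the serializer sorted attributes by name, so this sketch does not apply there.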

lipo: can't figure out the architecture type of: /var/folders/

…衆ロ難τιáo~ submitted on 2019-12-02 07:09:24
Question: I tried installing lxml on Mac OS X Snow Leopard and keep getting the error: lipo: can't figure out the architecture type of: /var/folders/ I did install Xcode with 10.4 SDK support, and I changed gcc 4.2 to 4.0.1. Any clues? Python 2.6.1 with Leopard 1.6.7.. running install running bdist_egg running egg_info writing src/lxml.egg-info/PKG-INFO writing top-level names to src/lxml.egg-info/top_level.txt writing dependency_links to src/lxml.egg-info/dependency_links.txt reading manifest file 'src

python alexa result parsing with lxml.etree

半世苍凉 submitted on 2019-12-02 06:19:35
I am using the Alexa API from AWS, but I find it difficult to parse the result to get what I want. The Alexa API returns an object tree of type <type 'lxml.etree._ElementTree'>. I use this code to print the tree: from lxml import etree root = tree.getroot() print etree.tostring(root) I get the XML below: <aws:UrlInfoResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/"><aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11"><aws:OperationRequest><aws:RequestId>ccf3f263-ab76-ab63-db99-244666044e85</aws:RequestId></aws:OperationRequest><aws:UrlInfoResult><aws:Alexa> <aws:ContentData> <aws:DataUrl type=
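Note that the response rebinds the aws prefix to a second URI on the inner Response element, so lookups have to use the namespace URI that is in scope for the element being searched. The sketch below works on a trimmed copy of the visible part of the response (the closing tags are added here so that it parses) and uses a prefix name of its own choosing, "a", for the query.

```python
from lxml import etree

# Trimmed copy of the response shown above; only the visible
# OperationRequest part is kept and the closing tags are added.
SAMPLE = (
    b'<aws:UrlInfoResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">'
    b'<aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">'
    b'<aws:OperationRequest>'
    b'<aws:RequestId>ccf3f263-ab76-ab63-db99-244666044e85</aws:RequestId>'
    b'</aws:OperationRequest>'
    b'</aws:Response>'
    b'</aws:UrlInfoResponse>'
)

root = etree.fromstring(SAMPLE)

# The prefix used in the query is arbitrary; only the URI has to match.
# RequestId lives under the inner Response, so it uses the awis URI.
ns = {"a": "http://awis.amazonaws.com/doc/2005-07-11"}
request_id = root.find(".//a:RequestId", namespaces=ns).text
print(request_id)
```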

Regular Expressions to parse template tags in XML

倖福魔咒の submitted on 2019-12-02 05:10:34
Question: I need to parse some XML to pull out embedded template tags for further parsing. I can't seem to bend Python's regular expressions to do what I want, though. In English: when a template tag is contained anywhere in the row, remove all the XML for that specific row and leave only the template tag in its place. I put together a test case to demonstrate. Here's the original XML: <!-- regex_trial.xml --> <w:tbl> <w:tr> <w:tc><w:t>Header 1</w:t></w:tc> <w:tc><w:t>Header 2</w:t></w:tc> <w:tc><w:t

Is there a way to recover iterparse on invalid Char values?

删除回忆录丶 submitted on 2019-12-02 04:43:06
I'm using lxml's iterparse to parse some big XML files (3-5 GB). Since some of these files contain invalid characters, an lxml.etree.XMLSyntaxError is thrown. When using lxml.etree.parse, I can provide a parser that recovers from invalid characters: parser = lxml.etree.XMLParser(recover=True) root = lxml.etree.parse(open("myMalformed.xml"), parser) Is there a way to get the same functionality for iterparse? Edit: Encoding is not an issue here. There are invalid characters in these XML files, which can be sanitized by defining an XMLParser with recover=True. Since I need to use iterparse for this, I can't
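Recent lxml releases accept a recover keyword on iterparse itself, which may be enough here. The sketch below is a minimal, hedged illustration: the helper name iter_end_tags is hypothetical, the sample input is well-formed, and how much content survives after a genuinely invalid character depends on the underlying libxml2 version.

```python
import io

from lxml import etree


def iter_end_tags(source):
    """Yield the tag of every element as its end event is reported."""
    # recover=True asks the parser to continue past errors where it can.
    for _event, element in etree.iterparse(source, events=("end",), recover=True):
        yield element.tag
        # Release the element's content to keep memory use low on big files.
        element.clear()


# Small well-formed sample; a real file path could be passed instead.
sample = io.BytesIO(b"<root><a>1</a><b>2</b></root>")
tags = list(iter_end_tags(sample))
print(tags)
```

It is worth checking the results against a known-bad file before relying on this, since recovery can silently drop content.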