elementtree

The element tree xml

倖福魔咒の posted on 2019-12-10 22:22:00
Question: I can't figure out why I get an error while trying to reach the timestamp. XML format (some attributes left out): EDIT: this is the actual type of the XML file. <mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.10/ http://www.mediawiki.org/xml/export-0.10.xsd" version="0.10" xml:lang="en"> <siteinfo> <sitename>Wikipedia</sitename> <dbname>enwiki</dbname> <base>https://en

python ElementTree the text of element who has a child

守給你的承諾、 posted on 2019-12-10 21:47:17
Question: When I try to read the text of an element that has a child, it gives None. See the XML (say test.xml): <?xml version="1.0"?> <data> <test><ref>MemoryRegion</ref> abcd</test> </data> and the Python code that wants to read 'abcd': import xml.etree.ElementTree as ET tree = ET.parse('test.xml') root = tree.getroot() print root.find("test").text When I run this script, it gives None rather than abcd. How can I read abcd under this condition? Answer 1: Use the Element.tail attribute: >>> import xml.etree
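The behaviour described above can be reproduced with a short sketch. It assumes Python 3 and parses the XML from an inline string rather than test.xml; the variable names are illustrative only. In ElementTree, text that follows a child's closing tag is stored on that child as its tail, not on the parent's text.

```python
import xml.etree.ElementTree as ET

# Same document as in the question, inlined for a self-contained example.
xml_text = """<?xml version="1.0"?>
<data>
<test><ref>MemoryRegion</ref> abcd</test>
</data>"""

root = ET.fromstring(xml_text)
test = root.find("test")

# .text only covers text before the first child, so it is None here.
print(test.text)

# Text after </ref> belongs to the <ref> element as its .tail.
ref = test.find("ref")
print(ref.tail.strip())

# itertext() yields every text and tail value in document order.
full_text = "".join(test.itertext())
print(full_text)
```

If the element may contain several children, joining itertext() is usually simpler than collecting each tail by hand.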

Python ElementTree: How to add SubElement at VERY specific position?

十年热恋 posted on 2019-12-10 21:44:38
Question: I want to add a subelement to an XML file, but in a very specific position, not appended to the end. The standard way is: subi = ET.SubElement(root[0][0], 'subi') which is fine. But let's say root[0][0] already has two children, accessible via root[0][0][0] and root[0][0][1], and I want "subi" to become the new middle child, root[0][0][1], making the original second child the third child, root[0][0][2]. Is there a way to do that? (My experiences with life and nature would say no
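One possible approach, sketched here under the assumption that the standard-library xml.etree.ElementTree is in use: create the element separately and place it with Element.insert(index, element) instead of SubElement, which always appends. The parent and child tag names below are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical parent with two existing children.
parent = ET.fromstring("<parent><first/><second/></parent>")

# Build the new element on its own, then insert it at index 1.
subi = ET.Element("subi")
parent.insert(1, subi)

tags = [child.tag for child in parent]
print(tags)  # ['first', 'subi', 'second']
```

The original second child moves to index 2, as the question asks.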

Extracting page titles and contributors from MediaWiki XML

China☆狼群 posted on 2019-12-10 21:34:42
Question: I have a very large (7GB) MediaWiki XML dump, which consists of records of each change made to each page of the wiki. I am trying to record which users have contributed to each page, and so I want to extract that from the XML. The XML looks something like: <mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/"> <page> <title>Unique Page title</title> <id>11</id> <restrictions>sysop</restrictions> <revision> <id>11</id> <timestamp>2005-10-26T02:23:03Z</timestamp> <contributor> <ip
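For a dump of this size, a streaming parse is one option. The sketch below uses ElementTree.iterparse and is only an illustration. The excerpt is cut off inside the contributor element, so the sketch assumes each contributor holds either a username or an ip child. The function name contributors_by_page and the sample data are invented for the example.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace taken from the root element shown in the question.
NS = "{http://www.mediawiki.org/xml/export-0.3/}"


def contributors_by_page(source):
    """Map each page title to the set of contributor names or IPs."""
    result = defaultdict(set)
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != NS + "page":
            continue
        title = elem.findtext(NS + "title")
        for contributor in elem.iter(NS + "contributor"):
            # Assumption: a contributor has either <username> or <ip>.
            name = (contributor.findtext(NS + "username")
                    or contributor.findtext(NS + "ip"))
            if name:
                result[title].add(name)
        # Drop the finished page's children to limit memory use.
        elem.clear()
    return dict(result)


# Invented sample data; a real dump would be opened as a file instead.
sample = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <page>
    <title>Unique Page title</title>
    <revision>
      <timestamp>2005-10-26T02:23:03Z</timestamp>
      <contributor><ip>127.0.0.1</ip></contributor>
    </revision>
    <revision>
      <contributor><username>ExampleUser</username></contributor>
    </revision>
  </page>
</mediawiki>"""

pages = contributors_by_page(io.BytesIO(sample))
print(pages)
```

Processing on the end event of each page keeps only one page's revisions in memory at a time, though the cleared page elements still remain attached to the root.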

python etree with xpath and namespaces with prefix

左心房为你撑大大i posted on 2019-12-10 21:31:19
Question: I can't find information on how to parse my XML with a namespace. I have this XML: <par:Request xmlns:par="http://somewhere.net/actual"> <par:actual>blabla</par:actual> <par:documentType>string</par:documentType> </par:Request> And I tried to parse it: dom = ET.parse(u'C:\\filepath\\1.xml') rootxml = dom.getroot() for subtag in rootxml.xpath(u'//par:actual'): #do something print(subtag) And I got an exception, because it doesn't know about the namespace prefix. Is there a best way to solve that problem, counting
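A common way to handle this is to pass a prefix-to-URI mapping along with the query. The sketch below uses the standard-library xml.etree.ElementTree and findall, which is a substitution for the xpath() call in the question (that method belongs to lxml). The XML is inlined instead of being read from the file path.

```python
import xml.etree.ElementTree as ET

# Same document as in the question, inlined for a self-contained example.
xml_text = """<par:Request xmlns:par="http://somewhere.net/actual">
<par:actual>blabla</par:actual>
<par:documentType>string</par:documentType>
</par:Request>"""

root = ET.fromstring(xml_text)

# The prefix used in the query is resolved through this mapping; it does
# not have to match the prefix used inside the document.
namespaces = {"par": "http://somewhere.net/actual"}

matches = root.findall(".//par:actual", namespaces)
for subtag in matches:
    print(subtag.tag, subtag.text)
```

With lxml, the equivalent is to keep the xpath() call and supply the same dictionary through its namespaces keyword argument.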

changing element namespace in lxml

浪子不回头ぞ posted on 2019-12-10 19:48:59
Question: With lxml, I am not sure how to properly remove the namespace of an existing element and set a new one. For instance, I'm parsing this minimal XML file: <myroot xmlns="http://myxml.com/somevalue"> <child1>blabla</child1> <child2>blablabla</child2> </myroot> ... and I'd like it to become: <myroot xmlns="http://myxml.com/newvalue"> <child1>blabla</child1> <child2>blablabla</child2> </myroot> With lxml: from lxml import etree as ET tree = ET.parse('myfile.xml') root = tree.getroot() If I inspect
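As a rough illustration of the goal, the sketch below rewrites the namespace part of every tag. It deliberately uses the standard-library xml.etree.ElementTree rather than lxml, so it is a substitute technique and not the lxml answer; lxml also tracks namespace declarations per element, which this sketch does not address. The constant names are made up.

```python
import xml.etree.ElementTree as ET

OLD_NS = "http://myxml.com/somevalue"
NEW_NS = "http://myxml.com/newvalue"

xml_text = """<myroot xmlns="http://myxml.com/somevalue">
<child1>blabla</child1>
<child2>blablabla</child2>
</myroot>"""

root = ET.fromstring(xml_text)

# Tags are stored in Clark notation, "{uri}localname", so changing the
# namespace means rewriting the "{uri}" part of each tag.
old_prefix = "{" + OLD_NS + "}"
for elem in root.iter():
    if elem.tag.startswith(old_prefix):
        elem.tag = "{" + NEW_NS + "}" + elem.tag[len(old_prefix):]

# Ask the serializer to emit NEW_NS as the default namespace.
# Note that this registration is global to the module.
ET.register_namespace("", NEW_NS)
print(ET.tostring(root, encoding="unicode"))
```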

Python + Expat: Error on &#0; entities

◇◆丶佛笑我妖孽 posted on 2019-12-10 14:19:41
Question: I have written a small function which uses ElementTree and XPath to extract the text contents of certain elements in an XML file: #!/usr/bin/env python2.5 import doctest from xml.etree import ElementTree from StringIO import StringIO def parse_xml_etree(sin, xpath): """ Takes as input a stream containing XML and an XPath expression. Applies the XPath expression to the XML and returns a generator yielding the text contents of each element returned. >>> parse_xml_etree( ... StringIO('<test>

SQLAlchemy TypeDecorator doesn't work

不打扰是莪最后的温柔 posted on 2019-12-10 13:55:42
Question: I'm using XML in my PostgreSQL database and I need a custom type that can handle XML data in SQLAlchemy. So I made an XMLType class that communicates with xml.etree, but it doesn't work as I wished. Here's the code that I wrote: import xml.etree.ElementTree as etree class XMLType(sqlalchemy.types.TypeDecorator): impl = sqlalchemy.types.UnicodeText type = etree.Element def get_col_spec(self): return 'xml' def bind_processor(self, dialect): def process(value): if value is not None: return etree.dump

How to parse file XML by lxml, get element & attribute?

匆匆过客 posted on 2019-12-10 11:48:35
Question: I have an XML description like this: <Car xmlns="http://example.com/vocab/xml/cars#"> <dateStarted>{{date_started|escape}}</dateStarted> <dateSold>{{date_sold|escape}}</dateSold> <name type="{{name_type}}" abbrev="{{name_abbrev}}" value="{{name_value}}" >{{name|escape}}</name> <brandName type="{{brand_name_type}}" abbrev="{{brand_name_abbrev}}" value="{{brand_name_value}}" >{{brand_name|escape}}</brandName> <maxspeed> <value>{{speed_value}}</value> <unit type="{{speed_unit_type}}" value="{

Converting an xml doc into a specific dot-expanded json structure

偶尔善良 posted on 2019-12-10 10:08:25
Question: I have the following XML document: <Item ID="288917"> <Main> <Platform>iTunes</Platform> <PlatformID>353736518</PlatformID> </Main> <Genres> <Genre FacebookID="6003161475030">Comedy</Genre> <Genre FacebookID="6003172932634">TV-Show</Genre> </Genres> <Products> <Product Country="CA"> <URL>https://itunes.apple.com/ca/tv-season/id353187108?i=353736518</URL> <Offers> <Offer Type="HDBUY"> <Price>3.49</Price> <Currency>CAD</Currency> </Offer> <Offer Type="SDBUY"> <Price>2.49</Price> <Currency>CAD<