elementtree | 易学教程

The element tree xml


Question: I can't figure out why I get an error while trying to reach the timestamp. XML format (some attributes left out): EDIT: this is the actual type of the XML file. <mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.10/ http://www.mediawiki.org/xml/export-0.10.xsd" version="0.10" xml:lang="en"> <siteinfo> <sitename>Wikipedia</sitename> <dbname>enwiki</dbname> <base>https://en

python ElementTree the text of element who has a child


Question: When I try to read the text of an element that has a child, I get None. Here is the XML (say test.xml): <?xml version="1.0"?> <data> <test><ref>MemoryRegion</ref> abcd</test> </data> and the Python code that tries to read 'abcd': import xml.etree.ElementTree as ET tree = ET.parse('test.xml') root = tree.getroot() print root.find("test").text When I run this script, it prints None rather than abcd. How can I read abcd in this situation? Answer 1: Use the Element.tail attribute: >>> import xml.etree
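The answer above is cut off, so here is a minimal sketch of what it points at, assuming the standard-library xml.etree.ElementTree and the same document parsed from a string rather than from test.xml. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Same document as in the question, parsed from a string instead of test.xml.
xml_text = """<?xml version="1.0"?>
<data>
<test><ref>MemoryRegion</ref> abcd</test>
</data>"""

root = ET.fromstring(xml_text)
test = root.find("test")
ref = test.find("ref")

# test.text only covers text before the first child, so it is None here.
before_child = test.text
# Text that follows </ref> is stored on the child as its tail.
after_child = ref.tail.strip()
# itertext() yields every text and tail inside <test> in document order.
all_text = "".join(test.itertext())

print(before_child, after_child, all_text)
```

If the goal is all the text inside the element regardless of children, itertext() is usually the simpler choice.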

Python ElementTree: How to add SubElement at VERY specific position?


Question: I want to add a subelement to an XML file, but in a very specific position, not appended to the end. The standard way is: subi = ET.SubElement(root[0][0], 'subi') which is fine. But let's say root[0][0] already has two children, accessible via root[0][0][0] and root[0][0][1], and I want "subi" to become the new middle child, root[0][0][1], making the original second child become the third child, root[0][0][2]. Is there a way to do that? (My experiences with life and nature would say no
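The excerpt ends before any answer, so the following is only one possible approach, assuming the standard-library ElementTree: build the element separately and place it with Element.insert(index, element). The parent and child tag names here are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical parent that already has two children.
parent = ET.fromstring("<parent><first/><second/></parent>")

# Create the element on its own, then insert it at index 1
# so it lands between the two existing children.
subi = ET.Element("subi")
parent.insert(1, subi)

order = [child.tag for child in parent]
print(order)
```

ET.SubElement always appends, which is why the element is created with ET.Element first in this sketch.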

Extracting page titles and contributors from MediaWiki XML


Question: I have a very large (7GB) MediaWiki XML dump, which consists of records of each change made to each page of the Wiki. I am trying to record which users have contributed to each page, and so I want to extract that from the XML. The XML looks something like: <mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/"> <page> <title>Unique Page title</title> <id>11</id> <restrictions>sysop</restrictions> <revision> <id>11</id> <timestamp>2005-10-26T02:23:03Z</timestamp> <contributor> <ip
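For a file this large, a streaming parse is one option worth considering. The sketch below uses xml.etree.ElementTree.iterparse and is not taken from the original thread. The function name contributors_by_page, the sample data, and the username element are assumptions; only title, revision, contributor and the start of an ip element appear in the excerpt.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace copied from the excerpt, in ElementTree's {uri}tag form.
NS = "{http://www.mediawiki.org/xml/export-0.3/}"


def contributors_by_page(source):
    """Stream a dump and map each page title to the set of its contributors."""
    result = defaultdict(set)
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != NS + "page":
            continue
        title = elem.findtext(NS + "title")
        for contributor in elem.iter(NS + "contributor"):
            # 'username' is an assumed alternative to the 'ip' element.
            name = (contributor.findtext(NS + "username")
                    or contributor.findtext(NS + "ip"))
            if name:
                result[title].add(name)
        # Drop the finished page's children so memory use stays small.
        elem.clear()
    return dict(result)


# Made-up sample in the shape of the excerpt; a real dump would be a file path.
sample = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
<page><title>Unique Page title</title><id>11</id>
<revision><id>11</id><timestamp>2005-10-26T02:23:03Z</timestamp>
<contributor><ip>127.0.0.1</ip></contributor></revision>
<revision><id>12</id><timestamp>2005-10-27T02:23:03Z</timestamp>
<contributor><username>ExampleUser</username></contributor></revision>
</page></mediawiki>"""

pages = contributors_by_page(io.BytesIO(sample))
print(pages)
```

Each page is still held in memory until its closing tag arrives, so a page with a very long revision history would need clearing at the revision level instead.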

python etree with xpath and namespaces with prefix


Question: I can't find information on how to parse my XML with a namespace. I have this XML: <par:Request xmlns:par="http://somewhere.net/actual"> <par:actual>blabla</par:actual> <par:documentType>string</par:documentType> </par:Request> And I tried to parse it: dom = ET.parse(u'C:\\filepath\\1.xml') rootxml = dom.getroot() for subtag in rootxml.xpath(u'//par:actual'): #do something print(subtag) And I got an exception, because it doesn't know about the namespace prefix. Is there a best way to solve that problem, counting
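The question calls xpath(), which is an lxml method. As a hedged illustration of the general idea, mapping the prefix to its URI explicitly, here is a sketch using the standard-library ElementTree instead, where findall() accepts a prefix-to-URI dictionary. With lxml, the same dictionary can be passed to xpath() through its namespaces keyword.

```python
import xml.etree.ElementTree as ET

# Document from the question, parsed from a string for the example.
xml_text = """<par:Request xmlns:par="http://somewhere.net/actual">
<par:actual>blabla</par:actual>
<par:documentType>string</par:documentType>
</par:Request>"""

root = ET.fromstring(xml_text)

# The prefix used in the query is local to this dict; only the URI must
# match the one declared in the document.
ns = {"par": "http://somewhere.net/actual"}
texts = [el.text for el in root.findall(".//par:actual", ns)]
print(texts)
```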

changing element namespace in lxml


Question: With lxml, I am not sure how to properly remove the namespace of an existing element and set a new one. For instance, I'm parsing this minimal XML file: <myroot xmlns="http://myxml.com/somevalue"> <child1>blabla</child1> <child2>blablabla</child2> </myroot> ... and I'd like it to become: <myroot xmlns="http://myxml.com/newvalue"> <child1>blabla</child1> <child2>blablabla</child2> </myroot> With lxml: from lxml import etree as ET tree = ET.parse('myfile.xml') root = tree.getroot() If I inspect

Python + Expat: Error on  entities


Question: I have written a small function that uses ElementTree and XPath to extract the text contents of certain elements in an XML file: #!/usr/bin/env python2.5 import doctest from xml.etree import ElementTree from StringIO import StringIO def parse_xml_etree(sin, xpath): """ Takes as input a stream containing XML and an XPath expression. Applies the XPath expression to the XML and returns a generator yielding the text contents of each element returned. >>> parse_xml_etree( ... StringIO('<test>

SQLAlchemy TypeDecorator doesn't work


Question: I'm using XML in my PostgreSQL database and I need a custom type that can handle XML data in SQLAlchemy. So I made an XMLType class that communicates with xml.etree, but it doesn't work as I wished. Here's the code that I wrote: import xml.etree.ElementTree as etree class XMLType(sqlalchemy.types.TypeDecorator): impl = sqlalchemy.types.UnicodeText type = etree.Element def get_col_spec(self): return 'xml' def bind_processor(self, dialect): def process(value): if value is not None: return etree.dump

How to parse file XML by lxml, get element & attribute?


Question: I have an XML description like this: <Car xmlns="http://example.com/vocab/xml/cars#"> <dateStarted>{{date_started|escape}}</dateStarted> <dateSold>{{date_sold|escape}}</dateSold> <name type="{{name_type}}" abbrev="{{name_abbrev}}" value="{{name_value}}" >{{name|escape}}</name> <brandName type="{{brand_name_type}}" abbrev="{{brand_name_abbrev}}" value="{{brand_name_value}}" >{{brand_name|escape}}</brandName> <maxspeed> <value>{{speed_value}}</value> <unit type="{{speed_unit_type}}" value="{

Converting an xml doc into a specific dot-expanded json structure


Question: I have the following XML document: <Item ID="288917"> <Main> <Platform>iTunes</Platform> <PlatformID>353736518</PlatformID> </Main> <Genres> <Genre FacebookID="6003161475030">Comedy</Genre> <Genre FacebookID="6003172932634">TV-Show</Genre> </Genres> <Products> <Product Country="CA"> <URL>https://itunes.apple.com/ca/tv-season/id353187108?i=353736518</URL> <Offers> <Offer Type="HDBUY"> <Price>3.49</Price> <Currency>CAD</Currency> </Offer> <Offer Type="SDBUY"> <Price>2.49</Price> <Currency>CAD<