Musing over a recently asked question, I started to wonder if there is a really simple way to deal with XML documents in Python. A pythonic way, if you will.
<If you don't mind using a 3rd party library, then BeautifulSoup will do almost exactly what you ask for:
>>> from BeautifulSoup import BeautifulStoneSoup
>>> soup = BeautifulStoneSoup('''<snip>''')
>>> soup.xml_api_reply.weather.current_conditions.temp_f['data']
u'68'
I highly recommend lxml.etree and xpath to parse and analyse your data. Here is a complete example. I have truncated the xml to make it easier to read.
import lxml.etree
s = """<?xml version="1.0" encoding="utf-8"?>
<xml_api_reply version="1">
<weather module_id="0" tab_id="0" mobile_row="0" mobile_zipped="1" row="0" section="0" >
<forecast_information>
<city data="Mountain View, CA"/> <forecast_date data="2010-06-23"/>
</forecast_information>
<forecast_conditions>
<day_of_week data="Sat"/>
<low data="59"/>
<high data="75"/>
<icon data="/ig/images/weather/partly_cloudy.gif"/>
<condition data="Partly Cloudy"/>
</forecast_conditions>
</weather>
</xml_api_reply>"""
tree = lxml.etree.fromstring(s)
for weather in tree.xpath('/xml_api_reply/weather'):
print weather.find('forecast_information/city/@data')[0]
print weather.find('forecast_information/forecast_date/@data')[0]
print weather.find('forecast_conditions/low/@data')[0]
print weather.find('forecast_conditions/high/@data')[0]
If you haven't already, I'd suggest looking into the DOM API for Python. DOM is a pretty widely used XML interpretation system, so it should be pretty robust.
It's probably a little more complicated than what you describe, but that comes from its attempts to preserve all the information implicit in XML markup rather than from bad design.
lxml has been mentioned. You might also check out lxml.objectify for some really simple manipulation.
>>> from lxml import objectify
>>> tree = objectify.fromstring(your_xml)
>>> tree.weather.attrib["module_id"]
'0'
>>> tree.weather.forecast_information.city.attrib["data"]
'Mountain View, CA'
>>> tree.weather.forecast_information.postal_code.attrib["data"]
'94043'
I found the following python-simplexml module, which in the attempts of the author to get something close to SimpleXML from PHP is indeed a small wrapper around ElementTree
. It's under 100 lines but seems to do what was requested:
>>> import SimpleXml
>>> x = SimpleXml.parse(urllib.urlopen('http://www.google.com/ig/api?weather=94043'))
>>> print x.weather.current_conditions.temp_f['data']
58
I believe that the built in python xml module will do the trick. Look at "xml.parsers.expat"
xml.parsers.expat