Read GPX using Python ElementTree.register_namespace?

Question

I have been beating my head against the wall for some time now. According to the documentation, this should be simple. All I want to do is read a GPX file. However, GPX files use namespaces liberally, which makes sense in theory, but I cannot get ElementTree to find any elements in them. Here is the code I am trying to use:

def loadGpx(self, sourceFile):
    ElementTree.register_namespace('gpx', 'http://www.topografix.com/GPX/1/1')
    eTree = ElementTree.ElementTree()
    eTree.parse(sourceFile)

    print eTree.findall('wpt')

I want to pull out the waypoints from a GPX file like this:

<?xml version="1.0" encoding="utf-8"?>
<gpx creator="Garmin Desktop App" version="1.1" 
    xsi:schemaLocation="http://www.topografix.com/GPX/1/1 
                    http://www.topografix.com/GPX/1/1/gpx.xsd 
                    http://www.garmin.com/xmlschemas/WaypointExtension/v1 
                    http://www8.garmin.com/xmlschemas/WaypointExtensionv1.xsd 
                    http://www.garmin.com/xmlschemas/TrackPointExtension/v1 
                    http://www.garmin.com/xmlschemas/TrackPointExtensionv1.xsd 
                    http://www.garmin.com/xmlschemas/GpxExtensions/v3 
                    http://www8.garmin.com/xmlschemas/GpxExtensionsv3.xsd 
                    http://www.garmin.com/xmlschemas/ActivityExtension/v1 
                    http://www8.garmin.com/xmlschemas/ActivityExtensionv1.xsd 
                    http://www.garmin.com/xmlschemas/AdventuresExtensions/v1 
                    http://www8.garmin.com/xmlschemas/AdventuresExtensionv1.xsd" 
    xmlns="http://www.topografix.com/GPX/1/1" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xmlns:wptx1="http://www.garmin.com/xmlschemas/WaypointExtension/v1" 
    xmlns:gpxtrx="http://www.garmin.com/xmlschemas/GpxExtensions/v3" 
    xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1" 
    xmlns:gpxx="http://www.garmin.com/xmlschemas/GpxExtensions/v3" 
    xmlns:abp="http://www.garmin.com/xmlschemas/ActivityExtension/v1" 
    xmlns:adv="http://www.garmin.com/xmlschemas/AdventuresExtensions/v1">

    <metadata>
        <link href="http://www.garmin.com">
          <text>Garmin International</text>
        </link>
        <time>2012-01-17T03:21:12Z</time>
        <bounds maxlat="45.708811283111572" maxlon="-121.3884991966188" 
                minlat="45.407062936574221" minlon="-121.54939779080451" />
    </metadata>

  <wpt lat="45.708682453259826" lon="-121.51224257424474">
    <time>2012-01-06T19:00:02Z</time>
    <name>1-State and First, start MHL</name>
    <sym>Bike Trail</sym>
    <extensions>
      <gpxx:WaypointExtension>
        <gpxx:DisplayMode>SymbolAndName</gpxx:DisplayMode>
      </gpxx:WaypointExtension>
    </extensions>
  </wpt>

  <wpt lat="45.615267734974623" lon="-121.43857721239328">
    <time>2012-01-07T15:38:14Z</time>
    <name>10-Right at fork staying on Huskey Rd</name>
    <sym>Bike Trail</sym>
    <extensions>
      <gpxx:WaypointExtension>
        <gpxx:DisplayMode>SymbolAndName</gpxx:DisplayMode>
      </gpxx:WaypointExtension>
    </extensions>
  </wpt>

</gpx>

Granted, it will take more than just print eTree.findall('wpt'), but I have worked with XML before, so the rest is easy once I get that far. This namespace thing, though, is driving me nuts.

Thank you in advance.

Answer 1:

register_namespace() controls the prefixes used when serializing XML, but it does not affect parsing.
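
To make that distinction concrete, here is a minimal sketch, assuming Python 3 and a tiny made-up GPX string (the `doc` value is hypothetical, not the file from the question). It suggests that the parser stores every tag in `{namespace-uri}localname` form whether or not a prefix is registered, and that the registered prefix only appears in the output of `tostring()`.

```python
from xml.etree import ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"

# Hypothetical, cut-down document used only for this illustration.
doc = '<gpx xmlns="%s"><wpt lat="1.0" lon="2.0"/></gpx>' % GPX_NS

ET.register_namespace("gpx", GPX_NS)
root = ET.fromstring(doc)

# Parsing is unaffected by the registration: tags are "{uri}localname".
root_tag = root.tag
unqualified = root.findall("wpt")             # matches nothing
qualified = root.findall("{%s}wpt" % GPX_NS)  # matches the waypoint

# Serialization is where the registered prefix shows up.
serialized = ET.tostring(root).decode("ascii")

print(root_tag)
print(len(unqualified))
print(len(qualified))
print(serialized)
```

Note that `register_namespace()` changes a module-wide registry, so the prefix applies to everything serialized afterwards in the same process.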

With ElementTree, do it like this:

from xml.etree import ElementTree as ET

tree = ET.parse("gpx.xml")
for elem in tree.findall("{http://www.topografix.com/GPX/1/1}wpt"):
    print(elem)  # parenthesized form works on both Python 2 and 3

Resulting output:

<Element '{http://www.topografix.com/GPX/1/1}wpt' at 0x201c550>
<Element '{http://www.topografix.com/GPX/1/1}wpt' at 0x201c730>
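
Going one step further than printing the elements, the following is a rough sketch of reading each waypoint's coordinates and name with the same qualified-name approach. The `sample` string is a cut-down stand-in for the file in the question, and the `GPX` prefix constant and `waypoints` list are names chosen here for illustration.

```python
from xml.etree import ElementTree as ET

# "{uri}" prefix to prepend to every GPX element name.
GPX = "{http://www.topografix.com/GPX/1/1}"

# Cut-down stand-in for the file in the question.
sample = """<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1">
  <wpt lat="45.708682453259826" lon="-121.51224257424474">
    <name>1-State and First, start MHL</name>
  </wpt>
  <wpt lat="45.615267734974623" lon="-121.43857721239328">
    <name>10-Right at fork staying on Huskey Rd</name>
  </wpt>
</gpx>"""

root = ET.fromstring(sample)

waypoints = []
for wpt in root.findall(GPX + "wpt"):
    # lat/lon are unprefixed attributes, so they carry no namespace.
    lat = float(wpt.get("lat"))
    lon = float(wpt.get("lon"))
    # Child elements inherit the default namespace and must be qualified.
    name = wpt.findtext(GPX + "name")
    waypoints.append((name, lat, lon))

for entry in waypoints:
    print(entry)
```

The default namespace declared with `xmlns=` applies to elements only, which is why `lat` and `lon` are read with plain attribute names.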

With lxml, you can also use this:

from lxml import etree

NSMAP = {"gpx": "http://www.topografix.com/GPX/1/1"}

tree = etree.parse("gpx.xml")
for elem in tree.findall("gpx:wpt", namespaces=NSMAP):
    print(elem)  # parenthesized form works on both Python 2 and 3
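
The prefix-map style is not limited to lxml. As a hedged sketch, assuming Python 3 (where the `namespaces` argument of `findall()` and `findtext()` is documented for the standard library), the same lookup can be written with `xml.etree.ElementTree`. The `sample` string is a cut-down stand-in for the question's file, and the `gpx` prefix in `NSMAP` is an arbitrary local choice that need not match anything in the document.

```python
from xml.etree import ElementTree as ET

# The prefix is local to this mapping; only the URI has to match the file.
NSMAP = {"gpx": "http://www.topografix.com/GPX/1/1"}

# Cut-down stand-in for the file in the question.
sample = """<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1">
  <wpt lat="45.708682453259826" lon="-121.51224257424474">
    <name>1-State and First, start MHL</name>
  </wpt>
  <wpt lat="45.615267734974623" lon="-121.43857721239328">
    <name>10-Right at fork staying on Huskey Rd</name>
  </wpt>
</gpx>"""

root = ET.fromstring(sample)

# "gpx:wpt" is expanded to "{uri}wpt" using NSMAP before matching.
names = [
    wpt.findtext("gpx:name", namespaces=NSMAP)
    for wpt in root.findall("gpx:wpt", namespaces=NSMAP)
]

for name in names:
    print(name)
```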

Answer 2:

Why don't you just use an existing GPX library?

Shameless plug: with gpxpy (https://github.com/tkrajina/gpxpy), parsing waypoints from your file works perfectly:

import gpxpy

gpx_sample = """...your GPX sample here..."""

gpx = gpxpy.parse(gpx_sample)

for wpt in gpx.waypoints:
    print("%s %s" % (wpt.latitude, wpt.longitude))  # works on Python 2 and 3

Even if you don't want to use the library, you can check its code to see how it parses the XML file.

Source: https://stackoverflow.com/questions/18071387/read-gpx-using-python-elementtree-register-namespace

Tags

python-2.7

elementtree

gpx