Question
I'm trying to write simple Python (3.2) code that reads XML files, makes some corrections, and stores them back. However, when the file is written, ElementTree adds an ns0 namespace prefix to every tag. For example:
<ns0:trk>
  <ns0:name>ACTIVE LOG</ns0:name>
  <ns0:trkseg>
    <ns0:trkpt lat="38.5" lon="-120.2">
      <ns0:ele>6.385864</ns0:ele>
      <ns0:time>2011-12-10T17:46:30Z</ns0:time>
    </ns0:trkpt>
    <ns0:trkpt lat="40.7" lon="-120.95">
      <ns0:ele>5.905273</ns0:ele>
      <ns0:time>2011-12-10T17:46:51Z</ns0:time>
    </ns0:trkpt>
    <ns0:trkpt lat="43.252" lon="-126.453">
      <ns0:ele>7.347168</ns0:ele>
      <ns0:time>2011-12-10T17:52:28Z</ns0:time>
    </ns0:trkpt>
  </ns0:trkseg>
</ns0:trk>
The code snippet is below:
def parse_gpx_data(gpxdata, tzname=None, npoints=None, filter_window=None,
                   output_file_name=None):
    ET = load_xml_library()

    def find_trksegs_or_route(etree, ns):
        trksegs = etree.findall('.//' + ns + 'trkseg')
        if trksegs:
            return trksegs, "trkpt"
        else:  # try to display route if track is missing
            rte = etree.findall('.//' + ns + 'rte')
            return rte, "rtept"

    # try GPX10 namespace first
    try:
        element = ET.XML(gpxdata)
    except ET.ParseError as v:
        row, column = v.position
        print("error on row %d, column %d: %s" % (row, column, v))
        return  # nothing was parsed, so there is nothing to process

    print(ET.tostring(element))
    trksegs, pttag = find_trksegs_or_route(element, GPX10)
    NS = GPX10
    if not trksegs:  # try GPX11 namespace otherwise
        trksegs, pttag = find_trksegs_or_route(element, GPX11)
        NS = GPX11
    if not trksegs:  # try without any namespace
        trksegs, pttag = find_trksegs_or_route(element, "")
        NS = ""

    # Store the results if requested
    if output_file_name:
        ET.register_namespace('', GPX11)
        ET.register_namespace('', GPX10)
        ET.ElementTree(element).write(output_file_name, xml_declaration=True)

    return
I have tried using register_namespace, but with no positive result.
Did anything change in this respect in ElementTree 1.3?
Answer 1:
To avoid the ns0 prefix, the default namespace should be set before reading the XML data.
ET.register_namespace('', "http://www.topografix.com/GPX/1/1")
ET.register_namespace('', "http://www.topografix.com/GPX/1/0")
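A minimal sketch of this approach with the standard-library xml.etree.ElementTree. The sample GPX string and the variable names are made up for illustration. One caveat, based on how the standard-library registry behaves: it keeps a single URI per prefix, so calling register_namespace twice with the empty prefix leaves only the last URI registered. The sketch therefore registers only the namespace the document actually uses.

```python
import xml.etree.ElementTree as ET

GPX11 = "http://www.topografix.com/GPX/1/1"

# Hypothetical sample document standing in for real GPX data.
gpxdata = (
    '<gpx xmlns="http://www.topografix.com/GPX/1/1">'
    '<trk><name>ACTIVE LOG</name></trk>'
    '</gpx>'
)

# Map the GPX 1.1 namespace to the empty (default) prefix.
ET.register_namespace('', GPX11)

root = ET.fromstring(gpxdata)
output = ET.tostring(root, encoding='unicode')
print(output)
```

If a program has to handle both GPX 1.0 and GPX 1.1 files, one option is to detect which namespace the parsed document uses and register only that one before writing.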
Answer 2:
You need to register all your namespaces.
For example, if your input XML looks like this:
<Capabilities xmlns="http://www.opengis.net/wmts/1.0"
xmlns:ows="http://www.opengis.net/ows/1.1"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:gml="http://www.opengis.net/gml"
xsi:schemaLocation="http://www.opengis.net/wmts/1.0 http://schemas.opengis.net/wmts/1.0/wmtsGetCapabilities_response.xsd"
version="1.0.0">
Then you have to register all the namespaces, i.e. every attribute that starts with xmlns, like this:
ET.register_namespace('', "http://www.opengis.net/wmts/1.0")
ET.register_namespace('ows', "http://www.opengis.net/ows/1.1")
ET.register_namespace('xlink', "http://www.w3.org/1999/xlink")
ET.register_namespace('xsi', "http://www.w3.org/2001/XMLSchema-instance")
ET.register_namespace('gml', "http://www.opengis.net/gml")
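Rather than typing each declaration by hand, the prefixes can be collected from the document itself. The sketch below assumes the standard-library ElementTree and uses the start-ns events of iterparse, which yield (prefix, uri) pairs. The helper name register_all_namespaces and the trimmed sample document are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample: a trimmed version of the Capabilities header.
xml_text = (
    '<Capabilities xmlns="http://www.opengis.net/wmts/1.0" '
    'xmlns:ows="http://www.opengis.net/ows/1.1" version="1.0.0">'
    '<ows:Title>demo</ows:Title>'
    '</Capabilities>'
)

def register_all_namespaces(source):
    """Register every prefix/URI pair declared in the document."""
    declared = {}
    # Each 'start-ns' event carries a (prefix, uri) tuple; the default
    # namespace is reported with an empty prefix.
    for _event, (prefix, uri) in ET.iterparse(source, events=('start-ns',)):
        declared[prefix] = uri
        ET.register_namespace(prefix, uri)
    return declared

declared = register_all_namespaces(io.BytesIO(xml_text.encode('utf-8')))
root = ET.fromstring(xml_text)
output = ET.tostring(root, encoding='unicode')
print(output)
```

Note that register_namespace raises ValueError for prefixes of the form ns0, ns1, and so on, so a document that already uses such prefixes needs separate handling.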
Answer 3:
This answer really helped me avoid the ns0 issue. I am converting GPX from GPaws (when it works) to KML (for Google Maps), and my code wasn't working until I set the default namespace like this:
ET.register_namespace("","http://www.opengis.net/kml/2.2")
Answer 4:
It seems that you have to declare your namespace, meaning that you need to change the first line of your XML from:
<ns0:trk>
to something like:
<ns0:trk xmlns:ns0="uri:">
Once you have done that, you will no longer get ParseError: for unbound prefix: ..., and:
elem.tag = elem.tag[len('{uri:}'):]
will remove the namespace.
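Applied to a whole tree, the same idea might look like the sketch below. It assumes the standard-library ElementTree; the helper name strip_namespaces is hypothetical, and it only rewrites element tags, not namespaced attributes.

```python
import xml.etree.ElementTree as ET

# Sample input with the prefix declared, as suggested above.
xml_text = (
    '<ns0:trk xmlns:ns0="uri:">'
    '<ns0:name>ACTIVE LOG</ns0:name>'
    '</ns0:trk>'
)

def strip_namespaces(root):
    """Remove the '{uri}' part from every element tag, in place."""
    for elem in root.iter():
        # Comments and processing instructions have non-string tags.
        if isinstance(elem.tag, str) and elem.tag.startswith('{'):
            elem.tag = elem.tag.split('}', 1)[1]
    return root

root = strip_namespaces(ET.fromstring(xml_text))
output = ET.tostring(root, encoding='unicode')
print(output)
```

Keep in mind that this drops the namespace from the saved file entirely, which may not be acceptable for consumers that expect namespaced GPX.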
Answer 5:
If you print the root, you will see something like this: <Element '{http://www.host.domain/path/to/your/xml/namespace}RootTag' at 0x0000000000558DB8>
So, to avoid the ns0 prefix, you have to change the default namespace before parsing the XML data, as below:
ET.register_namespace('', "http://www.host.domain/path/to/your/xml/namespace")
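When the namespace URI is not known in advance, it can be read from the root tag, which has the form {uri}RootTag. The sketch below assumes the standard-library ElementTree; the helper name namespace_of and the sample document are hypothetical. In that implementation the registry appears to be consulted when the tree is serialized, so the sketch registers after parsing. Registering earlier, as the answer suggests, works as well.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document using the placeholder namespace above.
xml_text = (
    '<RootTag xmlns="http://www.host.domain/path/to/your/xml/namespace">'
    '<Child>text</Child>'
    '</RootTag>'
)

def namespace_of(tag):
    """Return the URI part of a '{uri}local' tag, or '' if there is none."""
    if tag.startswith('{'):
        return tag[1:].split('}', 1)[0]
    return ''

root = ET.fromstring(xml_text)
uri = namespace_of(root.tag)
if uri:
    # Use the root element's namespace as the default for output.
    ET.register_namespace('', uri)

output = ET.tostring(root, encoding='unicode')
print(output)
```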
Source: https://stackoverflow.com/questions/8983041/saving-xml-files-using-elementtree