How can I remove the whitespaces and line breaks in an XML string in Python 2.6? I tried the following packages:
etree: This snippet keeps the original whitespaces:<
Here's something quick I came up with because I didn't want to use lxml:
from xml.dom import minidom
from xml.dom.minidom import Node
def remove_blanks(node):
for x in node.childNodes:
if x.nodeType == Node.TEXT_NODE:
if x.nodeValue:
x.nodeValue = x.nodeValue.strip()
elif x.nodeType == Node.ELEMENT_NODE:
remove_blanks(x)
xml = minidom.parse('file.xml')
remove_blanks(xml)
xml.normalize()
with file('file.xml', 'w') as result:
result.write(xml.toprettyxml(indent = ' '))
Which I really only needed to re-indent the XML file with otherwise broken indentation. It doesn't respect the preserve
directive, but, honestly, so do so many other software dealing with XMLs, that it's rather a funny requirement :) Also, you'd be able to easily add that sort of functionality to the code above (just check for space
attribute, and don't recure if its value is 'preserve'.)