When creating an XML file with Python\'s etree, if we write to the file an empty tag using SubElement
, I get:
Adding an empty text
is another option:
etree.SubElement(parent, 'child_tag_name').text=''
But note that this will change not only the representation but also the structure of the document: i.e. child_el.text
will be ''
instead of None
.
Oh, and like Martijn said, try to use better libraries.
Paraphrasing the code, the version of ElementTree.py
I use contains the following in a _write
method:
write('<' + tagname)
...
if node.text or len(node): # this line is literal
write('>')
...
write('</%s>' % tagname)
else:
write(' />')
To steer the program counter I created the following:
class AlwaysTrueString(str):
def __nonzero__(self): return True
true_empty_string = AlwaysTrueString()
Then I set node.text = true_empty_string
on those ElementTree nodes where I want an open-close tag rather than a self-closing one.
By "steering the program counter" I mean constructing a set of inputs—in this case an object with a somewhat curious truth test—to a library method such that the invocation of the library method traverses its control flow graph the way I want it to. This is ridiculously brittle: in a new version of the library, my hack might break—and you should probably treat "might" as "almost guaranteed". In general, don't break abstraction barriers. It just worked for me here.
This was directly solved in Python 3.4. From then, the write
method of xml.etree.ElementTree.ElementTree
has the short_empty_elements
parameter which:
controls the formatting of elements that contain no content. If True (the default), they are emitted as a single self-closed tag, otherwise they are emitted as a pair of start/end tags.
More details in the xml.etree documentation.
If you have sed available, you could pipe the output of your python script to
sed -e "s/<\([^>]*\) \/>/<\1><\/\1>/g"
Which will find any occurence of <Tag />
and replace it by <Tag></Tag>
As of Python 3.4, you can use the short_empty_elements
argument for both the tostring() function and the ElementTRee.write() method:
>>> from xml.etree import ElementTree as ET
>>> ET.tostring(ET.fromstring('<mytag/>'), short_empty_elements=False)
b'<mytag></mytag>'
In older Python versions, (2.7 through to 3.3), as a work-around you can use the html
method to write out the document:
>>> from xml.etree import ElementTree as ET
>>> ET.tostring(ET.fromstring('<mytag/>'), method='html')
'<mytag></mytag>'
Both the ElementTree.write()
method and the tostring()
function support the method
keyword argument.
On even earlier versions of Python (2.6 and before) you can install the external ElementTree library; version 1.3 supports that keyword.
Yes, it sounds a little weird, but the html
output mostly outputs empty elements as a start and end tag. Some elements still end up as empty tag elements; specifically <link/>
, <input/>
, <br/>
and such. Still, it's that or upgrade your Fortran XML parser to actually parse standards-compliant XML!