My input file is actually multiple XML files appended into one file. (It's from Google Patents.) It has the structure below:
I don't know minidom, or much about XML parsing in general, but I have used XPath to parse XML/HTML, e.g. with the lxml module.
You can find some XPath examples here: http://www.w3schools.com/xpath/xpath_examples.asp
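
As a rough sketch of how this could work on a file of concatenated documents: split the text wherever a new XML declaration starts, parse each piece on its own, then query it with a path expression. This assumes every document begins with `<?xml`, which may not hold for your file. The `split_documents` helper, the sample data, and the `patent`/`title` element names are all made up for illustration. The sketch uses the standard library's `xml.etree.ElementTree`, which supports only a limited XPath subset, so that it runs without extra installs; with lxml you would parse with `lxml.etree.fromstring` and call `.xpath()` on the result for full XPath 1.0.

```python
import xml.etree.ElementTree as ET


def split_documents(text, marker="<?xml"):
    """Yield each XML document from a string of concatenated documents.

    Assumes every document starts with an XML declaration (hypothetical
    helper, not part of any library).
    """
    for chunk in text.split(marker):
        # Skip the empty piece before the first declaration.
        if chunk.strip():
            yield marker + chunk


# Made-up sample standing in for the real concatenated file.
sample = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<patent><title>First</title></patent>\n'
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<patent><title>Second</title></patent>\n'
)

titles = []
for doc in split_documents(sample):
    # Parse from bytes so the encoding declaration is honoured.
    root = ET.fromstring(doc.encode("utf-8"))
    # ".//title" selects every <title> element below the root.
    titles.extend(el.text for el in root.findall(".//title"))

print(titles)
```

For a very large file you would probably want to read and split it in a streaming fashion rather than loading the whole text into memory first.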