Question
How can I sort all tags in a multi-gigabyte XML file alphabetically, with identically named tags further sorted by their attributes? All the methods suggested in related questions fail on data this large.
I'm looking for existing tools for Windows or Linux.
Answer 1:
If you are using XSLT to do the sorting, you can use the streamable subset of XSLT with a streaming-enabled processor such as Saxon. Saxon in streaming mode can handle gigabytes of input XML data.
The Saxon website has very detailed documentation about streaming XSLT templates.
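To illustrate the streaming idea outside XSLT, here is a minimal Python sketch. It is not Saxon's API and not a streaming stylesheet; it uses the standard library's `xml.etree.ElementTree.iterparse` instead. It assumes the large file is a flat sequence of record elements under one root, and the names `iter_sorted_records` and `record_tag` are made up for this example. Each record is sorted by tag name and then by attributes, as the question asks, but the records themselves keep their original order (reordering them would need an external sort on top of this).

```python
import io
import xml.etree.ElementTree as ET


def sort_key(elem):
    # Order by tag name first, then by the sorted (name, value) attribute pairs.
    return (elem.tag, sorted(elem.attrib.items()))


def sort_children(elem):
    # Recursively reorder the children of elem in place.
    for child in elem:
        sort_children(child)
    elem[:] = sorted(elem, key=sort_key)


def iter_sorted_records(source, record_tag):
    # Stream the input so that only one record subtree is held at a time.
    # Assumption: record_tag does not also occur nested inside a record.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            sort_children(elem)
            elem.tail = None  # the tail may be incomplete at this point
            yield ET.tostring(elem, encoding="unicode")
            root.clear()  # drop processed records to keep memory bounded


if __name__ == "__main__":
    demo = b'<root><doc id="2"><b/><a y="1"/><a x="1"/></doc></root>'
    for record in iter_sorted_records(io.BytesIO(demo), "doc"):
        print(record)
```

Memory use is bounded by the largest single record, not by the file size, which is the property that makes streaming workable for multi-gigabyte inputs.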
Answer 2:
The original goal was to compare two extremely large XML files that contained similar data in a different order. I ended up splitting each XML file into logical chunks: each file contained thousands of processed documents, and I used the csplit utility to write each document to a separate file. I then compared each pair of equally sized documents from the two XML files (luckily, no two documents within a single XML file had the same size).
Not a perfect solution, but it worked within reasonable time and space constraints.
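The pairing-by-size step could be sketched roughly as follows in Python. This is an illustration, not the script actually used: it assumes the chunks from each XML file already sit in two separate directories, and the function names `index_by_size` and `compare_chunks` are invented here. Sizes that are missing on one side or occur more than once are reported as unmatched rather than guessed at.

```python
import filecmp
from collections import defaultdict
from pathlib import Path


def index_by_size(directory):
    # Map file size in bytes -> list of files in the directory with that size.
    by_size = defaultdict(list)
    for path in Path(directory).iterdir():
        if path.is_file():
            by_size[path.stat().st_size].append(path)
    return by_size


def compare_chunks(dir_a, dir_b):
    a, b = index_by_size(dir_a), index_by_size(dir_b)
    report = {"same": [], "different": [], "unmatched": []}
    for size in sorted(set(a) | set(b)):
        files_a, files_b = a.get(size, []), b.get(size, [])
        if len(files_a) == 1 and len(files_b) == 1:
            # shallow=False forces a byte-by-byte comparison.
            equal = filecmp.cmp(files_a[0], files_b[0], shallow=False)
            key = "same" if equal else "different"
            report[key].append((files_a[0], files_b[0]))
        else:
            # No counterpart, or an ambiguous size: needs manual inspection.
            report["unmatched"].extend(files_a + files_b)
    return report
```

Matching on size alone only works when sizes are unique within each file, which is the lucky condition mentioned above; hashing a normalized form of each chunk would be a sturdier key if that condition does not hold.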
Source: https://stackoverflow.com/questions/9095653/sort-multigigabyte-xml-file