I have tried many of the Perl XML Parsers. I was quite interested in the Sablotron Parser, but it is such a pain to install on a Windows box. Currently I have started usin
I'll offer one that SHOULD NOT be used: XML::Parser.
It automatically expands HTML entities to their UTF-8 equivalents, and the option to disable this behavior does not work on the most characteristic entity of all, &amp;.
Additionally, its XMLDecl handler will interpret and display the standalone attribute in the <?xml ?> declaration block as "standalone"="1", which is absolutely incorrect; it should be "standalone"="yes".
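
For anyone who wants to see both complaints in a few lines: XML::Parser is a wrapper around the expat library, so here is a minimal sketch using Python's standard xml.parsers.expat binding to that same library. This is not XML::Parser itself, and I am assuming the behavior described above comes from expat rather than from the Perl layer. The helper name inspect and the sample document are made up for illustration.

```python
import xml.parsers.expat


def inspect(doc):
    """Parse doc with expat; return (standalone flag, joined character data)."""
    seen = {"standalone": None, "text": []}

    def on_xml_decl(version, encoding, standalone):
        # expat hands over an integer here, not the literal "yes"/"no" text
        seen["standalone"] = standalone

    def on_char_data(data):
        # character data may arrive in several chunks, so collect them
        seen["text"].append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.CharacterDataHandler = on_char_data
    parser.Parse(doc, True)
    return seen["standalone"], "".join(seen["text"])


# Hypothetical sample document, only for illustration
DOC = '<?xml version="1.0" standalone="yes"?><note>fish &amp; chips</note>'

standalone, text = inspect(DOC)
print(standalone)  # 1, even though the source says "yes"
print(text)        # fish & chips (the &amp; reference is already expanded)
```

If you need to write the declaration back out, you would have to map the flag to "yes" or "no" yourself, and re-escape & when serializing the text.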