Does anyone have/make/sell an error tolerant XML reader for .NET?
Yeah, I know, XML isn't designed to have errors in it and should be rejected if it's not valid... blah blah. But sadly the real world is imperfect and developers do make mistakes, and I still want to be able to read their feeds even if I'm missing the odd element here or there because it wasn't encoded properly or had some other error in it. So please, no answers saying "fix the source" or "reject it".
So, does anyone have a component that can recover and handle common mistakes in XML files?
Look at HTML parsers, because HTML is almost XML and HTML parsers are built to cope with malformed markup.
It's precisely because the real world is imperfect that XML is so widely used. What would be the functional specification for an error-tolerant XML parser? It's an open-ended problem. It's hard enough to parse all variations of well-formed XML without trying to second-guess all possible errors.
[... Waits for downvote.]
Run the XML through Beautiful Soup (a Python library) first. It repairs many common markup errors, so the cleaned output should then parse correctly.
For the specific case of an RSS feed and the specific case of individual corrupt item entries, you can use XmlTextReader to manually read in each item separately, handling the XmlException for invalid items. When an exception occurs, you'll need to use a new reader instance, as the original reader is hosed. You'll still have to have valid <item> and </item> tags to identify each item, but you'll be able to recover from corrupt data within each item.
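A minimal sketch of that per-item approach, under some assumptions: the class name TolerantFeedReader, the method ReadItems, and the sample feed are made up for illustration, and the items are located with a regular expression rather than by walking the feed with one outer reader. Each matched item gets its own XmlTextReader, so an XmlException in one item does not affect the rest.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;
using System.Xml;
using System.Xml.Linq;

public static class TolerantFeedReader
{
    // Matches one <item ...>...</item> block. Assumes the item tags themselves
    // are intact and that items are neither nested nor self-closing.
    private static readonly Regex ItemPattern =
        new Regex(@"<item\b[^>]*>.*?</item\s*>", RegexOptions.Singleline);

    // Returns the items that parsed; messages for the ones that did not
    // are appended to 'errors'.
    public static List<XElement> ReadItems(string feedText, List<string> errors)
    {
        var items = new List<XElement>();
        foreach (Match match in ItemPattern.Matches(feedText))
        {
            // A fresh reader per item: a reader that has thrown cannot be reused.
            using (var reader = new XmlTextReader(new StringReader(match.Value))
                   { DtdProcessing = DtdProcessing.Prohibit })
            {
                try
                {
                    items.Add(XElement.Load(reader));
                }
                catch (XmlException ex)
                {
                    // Corrupt item: record the problem and move on to the next one.
                    errors.Add(ex.Message);
                }
            }
        }
        return items;
    }

    public static void Main()
    {
        // Hypothetical feed: the second item contains an unescaped ampersand.
        const string feed =
            "<rss><channel>" +
            "<item><title>Good</title></item>" +
            "<item><title>Bad & unescaped</title></item>" +
            "<item><title>Also good</title></item>" +
            "</channel></rss>";

        var errors = new List<string>();
        foreach (XElement item in ReadItems(feed, errors))
        {
            Console.WriteLine((string)item.Element("title"));
        }
        Console.WriteLine(errors.Count + " item(s) skipped");
    }
}
```

One limitation of parsing each item in isolation: an item that uses a namespace prefix declared on the root element (for example dc:creator) will fail as an undeclared prefix. Passing an XmlParserContext with the feed's namespaces to the reader is one way around that.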
Yes, I know it's an old question, but I was recently looking for a tolerant XML parser and found the following: XmlParser.
A Roslyn-inspired full-fidelity XML parser with no dependencies and a simple Visual Studio XML language service.
The parser produces a full-fidelity syntax tree, meaning every character of the source text is represented in the tree. The tree covers the entire source text. The parser has no dependencies and can easily be made portable.
You can add it to your project as a NuGet package. I tried this parser and it can read any XML file.
Source: https://stackoverflow.com/questions/4081425/error-tolerant-xml-reader