Error tolerant XML reader

Submitted by a 夏天 on 2019-11-28 01:59:16

Question


Does anyone have/make/sell an error tolerant XML reader for .NET?

Yeah, I know, XML isn't designed to have errors in it and should be rejected if it's not valid... blah blah. But sadly the real world is imperfect and developers do make mistakes. I still want to be able to read their feeds, even if I miss the odd element here or there because it wasn't encoded properly or had some other error in it. So please, no "fix the source" or "reject it" answers.

So, does anyone have a component that can recover and handle common mistakes in XML files?


Answer 1:


Look into HTML parsers, because HTML is almost XML and HTML parsers are built to tolerate malformed markup.
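
To make the idea concrete, here is a minimal sketch using Python's standard-library html.parser rather than any .NET component. The class name LenientCollector, the helper collect_fields, and the sample markup are all made up for illustration. It only shows that an HTML-style parser keeps going over an unescaped ampersand and an unclosed tag instead of throwing.

```python
from html.parser import HTMLParser


class LenientCollector(HTMLParser):
    """Hypothetical helper: records (tag, text) pairs and never raises on bad markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._stack = []   # currently open tags, innermost last
        self.fields = []   # collected (tag, text) pairs

    def handle_starttag(self, tag, attrs):
        self._stack.append(tag)

    def handle_endtag(self, tag):
        # Close everything up to the matching open tag; ignore stray closers.
        if tag in self._stack:
            while self._stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack:
            self.fields.append((self._stack[-1], text))


def collect_fields(markup):
    parser = LenientCollector()
    parser.feed(markup)
    parser.close()
    return parser.fields


# Two typical feed mistakes: a bare "&" and a <link> that is never closed.
broken = "<item><name>Fish & chips</name><link>http://example.com/1</item>"
fields = collect_fields(broken)
print(fields)
```

One caveat with this route: html.parser lowercases tag names, so it loses the case sensitivity that XML element names have. It is better suited to salvaging text from a feed than to round-tripping the document.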




Answer 2:


It's precisely because the real world is imperfect that XML is so widely used. What would be the functional specification for an error-tolerant XML parser? It's an open-ended problem. It's hard enough to parse all variations of well-formed XML without trying to second-guess all possible errors.

[... Waits for downvote.]




Answer 3:


Run the XML through Beautiful Soup (a Python library for parsing malformed markup) first. It repairs many common errors, so the cleaned output can then be parsed normally.




Answer 4:


For the specific case of an RSS feed and the specific case of individual corrupt item entries, you can use XmlTextReader to manually read in each item separately, handling the XmlException for invalid items. When an Exception occurs, you'll need to use a new Reader instance, as the original Reader is hosed. You'll still have to have valid <item> and </item> tags to identify each item, but you'll be able to recover from corrupt data within each item.
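
The same per-item recovery idea can be sketched in Python. This is an analogue of the approach, not XmlTextReader code. The function name read_items, the regular expression, and the sample feed are assumptions made for illustration. Each <item>...</item> span is cut out and parsed separately, so a parse error in one item does not affect the others.

```python
import re
import xml.etree.ElementTree as ET

# Relies on intact <item> and </item> tags, as the answer above requires.
ITEM_RE = re.compile(r"<item\b.*?</item>", re.DOTALL)


def read_items(feed_text):
    """Parse each item on its own; return (parsed_items, rejected_fragments)."""
    good, bad = [], []
    for match in ITEM_RE.finditer(feed_text):
        fragment = match.group(0)
        try:
            # A fresh parse per fragment plays the role of a new reader instance.
            good.append(ET.fromstring(fragment))
        except ET.ParseError as exc:
            bad.append((fragment, str(exc)))
    return good, bad


feed = """<rss><channel>
<item><title>First</title></item>
<item><title>Broken & unescaped</title></item>
<item><title>Third</title></item>
</channel></rss>"""

good, bad = read_items(feed)
titles = [item.findtext("title") for item in good]
print(titles, len(bad))
```

A limitation of this sketch: items that use namespace prefixes declared on the root element (for example dc:creator) will fail to parse as standalone fragments, so the declarations would have to be carried into each fragment first.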




Answer 5:


Yes, I know it's an old question, but I was recently looking for a tolerant XML parser and found the following: XmlParser.

A Roslyn-inspired full-fidelity XML parser with no dependencies and a simple Visual Studio XML language service.

The parser produces a full-fidelity syntax tree, meaning every character of the source text is represented in the tree. The tree covers the entire source text. The parser has no dependencies and can easily be made portable.

You can add it to your project as a NuGet package. I tried this parser, and it was able to read every XML file I gave it.



Source: https://stackoverflow.com/questions/4081425/error-tolerant-xml-reader
