How to prevent System.Xml.XmlException: Invalid character in the given encoding

心在旅途 2020-11-30 11:35

I have a Windows desktop app written in C# that loops through a bunch of XML files stored on disk and created by a third-party program. Almost all the files are loaded and proce

4 answers
  •  时光说笑
    2020-11-30 11:44

    XmlDocument loads the whole file in one go, so as soon as it runs into a character that isn't valid in the declared encoding, it aborts the entire load. If you want to process what you can and skip or log the duff bits, look at XmlTextReader. An XmlTextReader created over a FileStream reads one node at a time, so it also uses a lot less memory. You could even get clever, split the work up and parallelise the processing.

    When I've hit this, the culprits have been things like accented characters: graves, acutes, umlauts and such.

    I don't have any automated process for this, so usually I just load the file in Visual Studio and edit out the bad characters until there are no squigglies left. The theory is sound though.
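The node-at-a-time idea from this answer could be sketched roughly as below. This is only an illustration under some assumptions: `TolerantXmlScan` and `ReadWhatWeCan` are hypothetical names, the sample bytes are made up (a Latin-1 `é`, byte 0xE9, inside a file that declares UTF-8), and it uses `XmlReader.Create` rather than constructing `XmlTextReader` directly, since that factory is what Microsoft recommends for new code. Both read one node at a time. Note that once the `XmlException` is thrown the reader is in an error state and cannot continue, so this processes everything up to the bad byte and logs where it stopped; it does not skip past it.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class TolerantXmlScan
{
    // Hypothetical helper: reads nodes one at a time and returns how many
    // were read before the input ended or an XmlException was thrown.
    public static int ReadWhatWeCan(Stream input, Action<string> log)
    {
        int nodesRead = 0;
        using (XmlReader reader = XmlReader.Create(input))
        {
            try
            {
                while (reader.Read())
                {
                    nodesRead++;
                    if (reader.NodeType == XmlNodeType.Element)
                    {
                        log("element: " + reader.Name);
                    }
                }
            }
            catch (XmlException ex)
            {
                // The reader cannot resume after this; we only record
                // where the file went bad so it can be fixed or reported.
                log("stopped at line " + ex.LineNumber +
                    ", position " + ex.LinePosition + ": " + ex.Message);
            }
        }
        return nodesRead;
    }

    static void Main()
    {
        // Made-up sample: declares UTF-8 but contains a lone 0xE9 byte,
        // which is how 'é' is stored in Latin-1 and is invalid UTF-8 here.
        byte[] head = Encoding.ASCII.GetBytes(
            "<?xml version=\"1.0\" encoding=\"utf-8\"?>" +
            "<items><item>ok</item><item>caf");
        byte[] tail = Encoding.ASCII.GetBytes("</item></items>");

        using (var ms = new MemoryStream())
        {
            ms.Write(head, 0, head.Length);
            ms.WriteByte(0xE9);
            ms.Write(tail, 0, tail.Length);
            ms.Position = 0;

            int n = ReadWhatWeCan(ms, Console.WriteLine);
            Console.WriteLine("nodes read before stopping: " + n);
        }
    }
}
```

For real files you would pass a `FileStream` (for example from `File.OpenRead`) instead of the in-memory stream, and do your actual per-node processing where the sketch only logs element names.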
