I'm trying to send an XML document over the wire but receiving the following exception:
"MY LONG EMAIL STRING" was specified for the 'Body' element. ---
I'm on the receiving end of @parapurarajkumar's solution, where the illegal characters are being properly loaded into XmlDocument, but breaking XmlWriter when I'm trying to save the output.
My Context
I'm looking at exception/error logs from the website using Elmah. Elmah returns the state of the server at the time of the exception, in the form of a large XML document. For our reporting engine I pretty-print the XML with XmlWriter.
During a website attack, I noticed that some of the XML documents weren't parsing, and I was receiving the exception "'.', hexadecimal value 0x00, is an invalid character."
NON-RESOLUTION: I converted the document to a byte[] and scanned it for 0x00 bytes so I could strip them, but found none.
When I scanned the XML document itself, I found the following:
...
There was the NUL byte, encoded as an HTML entity!
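A minimal sketch of why the byte scan came up empty, assuming the entity is spelled `&#x0;` (the sample string below is made up for illustration). In the raw text the NUL exists only as ordinary characters; it becomes a real U+0000 only once the document is parsed.

```csharp
using System;
using System.Xml;

class NulEntityDemo
{
    static void Main()
    {
        // Hypothetical sample; "&#x0;" is an assumed spelling of the NUL entity.
        string raw = "<root>a&#x0;b</root>";

        // The raw text holds no actual NUL character, so scanning for 0x00 finds nothing.
        Console.WriteLine(raw.IndexOf('\0')); // -1

        // Loading expands the entity; after this the NUL lives inside the document's text node.
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(raw);
        Console.WriteLine(doc.DocumentElement.InnerText.IndexOf('\0'));
    }
}
```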
RESOLUTION: To fix this, I replaced the entity in the raw string before loading it into my XmlDocument, because loading turns the entity into an actual NUL character, which is much harder to sanitize out of the object. Here's my entire process:
XmlDocument xml = new XmlDocument();
details.Xml = details.Xml.Replace("&#x0;", "[0x00]"); // "&#x0;" is the NUL entity (match the spelling used in your data); in my case I wanted to see it, otherwise just replace with ""
xml.LoadXml(details.Xml);
string formattedXml = null;
// I stuff this all in a helper function, but put it in-line for this example
StringBuilder sb = new StringBuilder();
XmlWriterSettings settings = new XmlWriterSettings {
    OmitXmlDeclaration = true,
    Indent = true,
    IndentChars = "\t",
    NewLineHandling = NewLineHandling.None,
};
using (XmlWriter writer = XmlWriter.Create(sb, settings)) {
    xml.Save(writer);
    formattedXml = sb.ToString();
}
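The literal Replace above only catches one spelling of the entity. If the attacker's payload might use other spellings (`&#0;`, `&#x00;`, an uppercase `X`), a regex can cover them all. This is only a sketch: `NulEntityScrubber` and `Scrub` are hypothetical names, not part of my original code.

```csharp
using System;
using System.Text.RegularExpressions;

static class NulEntityScrubber
{
    // Matches decimal (&#0;, &#00;) and hex (&#x0;, &#X00;) references to U+0000.
    // The uppercase X is not valid XML, but HTML encoders may emit it.
    private static readonly Regex NulReference =
        new Regex(@"&#(?:0+|[xX]0+);", RegexOptions.CultureInvariant);

    // Replaces every NUL reference with `replacement` (empty by default).
    // A MatchEvaluator is used so "$" in the replacement is taken literally.
    public static string Scrub(string xml, string replacement = "")
    {
        return NulReference.Replace(xml, m => replacement);
    }
}
```

It would slot in before LoadXml, for example `details.Xml = NulEntityScrubber.Scrub(details.Xml, "[0x00]");`. References to other characters, such as `&#10;` or `&#x0A;`, are left untouched.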
LESSON LEARNED: if your incoming data is HTML-encoded on entry, sanitize for illegal bytes by looking for their HTML entity form, not just the raw byte.
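NUL was the only offender in my logs, but the same lesson applies to any numeric entity that decodes to a character XML 1.0 forbids (most control characters below 0x20, for instance). The following is a rough generalization under that assumption, not what I actually ran; `IllegalReferenceScrubber` is a hypothetical name, and the legal ranges come from the XML 1.0 `Char` production.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class IllegalReferenceScrubber
{
    // Any numeric character reference: hex (&#x1F;) or decimal (&#31;).
    private static readonly Regex NumericReference = new Regex(
        @"&#(?:[xX](?<hex>[0-9a-fA-F]+)|(?<dec>[0-9]+));",
        RegexOptions.CultureInvariant);

    // XML 1.0 Char: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    private static bool IsLegalXmlCodePoint(long cp)
    {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Keeps legal references as-is; swaps illegal or unparseable ones for `replacement`.
    public static string Scrub(string xml, string replacement = "")
    {
        return NumericReference.Replace(xml, m =>
        {
            bool isHex = m.Groups["hex"].Success;
            string digits = isHex ? m.Groups["hex"].Value : m.Groups["dec"].Value;
            NumberStyles style = isHex ? NumberStyles.AllowHexSpecifier : NumberStyles.None;

            long codePoint;
            bool parsed = long.TryParse(digits, style, CultureInfo.InvariantCulture, out codePoint);
            return parsed && IsLegalXmlCodePoint(codePoint) ? m.Value : replacement;
        });
    }
}
```

Note that this works on the raw string, so it will also rewrite matching text inside CDATA sections, where `&#x0;` would have been harmless literal text.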