I'm trying to read a file (not XML, but the structure is similar), but I'm getting this exception:
'┴', hexadecimal value 0x15, is an invalid character. Li
This XML is pretty bad.
There is a bare <Segment>0000016125 in there which, while not technically illegal (the number is a text node), is just kind of odd.
More importantly, the <Control> element contains invalid characters without an XML CDATA section.
You can normalize the XML manually, or do it in C# via string manipulation, a regex, or something similar.
In your simple example, only the <Control> element has invalid characters, so it is relatively simple to fix: wrap its content in a CDATA section using the string.Replace() method, to make it look like this:
<Control><![CDATA[0003┴300000┴English(U.S.)PORTUGUESE┴┴bla.000┴webgui\messages\xsl\en\blabla\blabla.xlf]]></Control>
Then you can load the good XML into your XDocument using XDocument.Parse(string xml):
// requires: using System.Xml.Linq;
string badXml = @"
<temproot>
<Codepage>UTF16</Codepage>
<Segment>0000016125
<Control>0003┴300000┴English(U.S.)PORTUGUESE┴┴bla.000┴webgui\messages\xsl\en\blabla\blabla.xlf</Control>
<Source>To blablablah the firewall to blablablah local IP address. </Source>
<Target>Para blablablah a uma blablablah local específico. </Target>
</Segment>
</temproot>";
// assuming only the <Control> element has the invalid characters
string goodXml = badXml
    .Replace("<Control>", "<Control><![CDATA[")
    .Replace("</Control>", "]]></Control>");
XDocument xDoc = XDocument.Parse(goodXml);
xDoc.Declaration = new XDeclaration("1.0", "utf-16", "yes");
// do stuff with xDoc
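One caveat worth checking: a CDATA section only stops markup characters such as < and & from being parsed. It does not exempt characters that fall outside the XML 1.0 character range. If the file really contains the control character U+0015 that the exception reports (rather than a printable box-drawing glyph), the CDATA wrapper alone may still fail to parse. A minimal sketch of a fallback, assuming it is acceptable to swap such characters for a visible placeholder before parsing; the XmlSanitizer class, the ReplaceInvalidXmlChars helper, and the '|' placeholder are hypothetical names chosen for illustration:

```csharp
using System;
using System.Text;
using System.Xml;
using System.Xml.Linq;

static class XmlSanitizer
{
    // Hypothetical helper: replaces every character that XML 1.0 does not
    // allow with a placeholder, leaving valid surrogate pairs untouched.
    public static string ReplaceInvalidXmlChars(string text, char placeholder = '|')
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (XmlConvert.IsXmlChar(c))
            {
                sb.Append(c);
            }
            else if (i + 1 < text.Length && char.IsSurrogatePair(text, i))
            {
                // Keep a well-formed surrogate pair as a unit.
                sb.Append(c).Append(text[++i]);
            }
            else
            {
                sb.Append(placeholder);
            }
        }
        return sb.ToString();
    }

    static void Main()
    {
        // \u0015 stands in for the separator reported by the exception.
        string raw = "<Control>0003\u0015300000</Control>";
        string clean = ReplaceInvalidXmlChars(raw);
        XDocument doc = XDocument.Parse(clean);
        Console.WriteLine(doc.Root.Value); // prints 0003|300000
    }
}
```

Replacing rather than deleting keeps the field boundaries inside <Control> recoverable, for example with a later Split('|'). Pick a placeholder that cannot occur in the real data.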