Special characters with XDocument

情歌与酒 2021-01-23 09:57

I'm trying to read a file (not XML, but with a similar structure), but I'm getting this exception:

'┴', hexadecimal value 0x15, is an invalid character. Li


        
1 answer
  •  半阙折子戏
    2021-01-23 10:07

    This XML is pretty bad:

    1. You have 0000016125 in there which, while not technically illegal (it is a Text node), is just kind of odd.
    2. One of your elements contains invalid characters, and they are not wrapped in an XML CDATA section.

    You can normalize the XML manually, or do it in C# via string manipulation, a regular expression, or something similar.
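
    The string-manipulation route could look roughly like the sketch below. It targets the case where the file really contains raw control characters such as 0x15, which XML 1.0 does not allow anywhere in a document, CDATA sections included. It assumes the offending characters can simply be dropped. XmlSanitizer and StripInvalidXmlChars are hypothetical names; XmlConvert.IsXmlChar and char.IsSurrogatePair are existing .NET methods.

```csharp
using System.Text;
using System.Xml;

// Hypothetical helper, not part of the original answer.
public static class XmlSanitizer
{
    // Returns a copy of the input without characters that XML 1.0 forbids.
    public static string StripInvalidXmlChars(string text)
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (char.IsSurrogatePair(text, i))
            {
                // A valid surrogate pair encodes a code point above U+FFFF,
                // which XML allows: keep both halves and skip the second one.
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            else if (XmlConvert.IsXmlChar(c))
            {
                sb.Append(c);
            }
            // Anything else (for example 0x15) is dropped.
        }
        return sb.ToString();
    }
}
```

    The cleaned string can then be passed to XDocument.Parse. If the dropped characters act as field separators in your data, appending a visible replacement character instead of dropping them may be the better choice.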

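    In your simple example, only one element has invalid characters, so it is relatively simple to fix: use the string.Replace() method to wrap that element's content in a CDATA section, so that it looks like this (elem is a placeholder for the element's actual name):

        <elem><![CDATA[0003┴300000┴English(U.S.)PORTUGUESE┴┴bla.000┴webgui\messages\xsl\en\blabla\blabla.xlf]]></elem>
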

    Then you can load the good XML into your XDocument using XDocument.Parse(string xml):

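    // NOTE: the element names (root, encoding, unit, elem, source, target) and the
    // nesting are placeholders; substitute the names and structure from your own file
    string badXml = @"
        <root>
            <encoding>UTF16</encoding>
            <unit>0000016125
                <elem>0003┴300000┴English(U.S.)PORTUGUESE┴┴bla.000┴webgui\messages\xsl\en\blabla\blabla.xlf</elem>
                <source>To blablablah the   firewall to blablablah local IP address.    </source>
                <target>Para blablablah a uma blablablah local específico.  </target>
            </unit>
        </root>";

    // assuming only the <elem> element has the invalid characters,
    // wrap its content in a CDATA section
    string goodXml = badXml
        .Replace("<elem>", "<elem><![CDATA[")
        .Replace("</elem>", "]]></elem>");

    XDocument xDoc = XDocument.Parse(goodXml);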
    xDoc.Declaration = new XDeclaration("1.0", "utf-16", "yes");
    
    // do stuff with xDoc
    
