I am working with some XML that holds strings like:
This is a string
Some of the strings that I am passing to the
Another way to remove incorrect XML chars in C# is using XmlConvert.IsXmlChar (Available since .NET Framework 4.0)
public static string RemoveInvalidXmlChars(string content)
{
return new string(content.Where(ch => System.Xml.XmlConvert.IsXmlChar(ch)).ToArray());
}
or you may check that all characters are XML-valid:
public static bool CheckValidXmlChars(string content)
{
return content.All(ch => System.Xml.XmlConvert.IsXmlChar(ch));
}
.Net Fiddle
For example, the vertical tab symbol (\v
) is not valid for XML, it is valid UTF-8, but not valid XML 1.0, and even many libraries (including libxml2) miss it and silently output invalid XML.