I'm trying to read the data from an XML file into a single data structure (such as XmlDocument), validating it against the XSD that the file itself references. I have a solution, but it requires two passes through the file, and I'm wondering if there's a single-pass solution.
MyBooks.xml:
<Books xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
xsi:noNamespaceSchemaLocation='books.xsd' id='999'>
<Book>Book A</Book>
<Book>Book B</Book>
</Books>
Books.xsd:
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
elementFormDefault='qualified'
attributeFormDefault='unqualified'>
<xs:element name='Books'>
<xs:complexType>
<xs:sequence>
<xs:element name='Book' type='xs:string' />
</xs:sequence>
<xs:attribute name='id' type='xs:unsignedShort' use='required' />
</xs:complexType>
</xs:element>
</xs:schema>
Let's say MyBooks.xml and Books.xsd are in the same directory.
Validate:
//Given a filename pointing to the XML file
var settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessInlineSchema;
settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessSchemaLocation;
settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
settings.CloseInput = true;
settings.ValidationEventHandler += new ValidationEventHandler(ValidationCB);
//eg:
//private static void ValidationCB(object sender, ValidationEventArgs args)
//{ throw new ApplicationException(args.Message); }
using(var reader = XmlReader.Create(filename, settings))
{ while(reader.Read()) ; }
Read into XmlDocument:
XmlDocument x = new XmlDocument();
x.Load(filename);
Sure, I could collect the nodes myself while the XmlReader is reading, but I'd rather not if I can avoid it. Any suggestions?
Thanks in advance
You're very close. Pass a validating reader to XmlDocument.Load, so that validation happens while the document loads, in a single pass. As long as your ValidationEventHandler records errors rather than throwing, validation errors will not stop the document from loading.
These are the high-level steps I usually follow in a ValidateXml helper function. It all starts with a compiled XmlSchemaSet (the file name is passed in as well, since the reader below needs it):
public bool ValidateXml(string filename, XmlSchemaSet xset)
I set the reader settings (which you did, too):
XmlReaderSettings settings = new XmlReaderSettings { ValidationType = ValidationType.Schema, Schemas = xset, ConformanceLevel = ConformanceLevel.Document };
settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
// SmartValidationHandler is my own helper, not part of .NET; substitute any class that records validation events.
XsdUtils.Utils.SmartValidationHandler svh = new XsdUtils.Utils.SmartValidationHandler(Paschi.Xml.DefaultResolver.Instance);
settings.ValidationEventHandler += svh.ValidationCallbackOne;
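SmartValidationHandler itself isn't shown here, so below is a minimal sketch of what such a collector might look like. The class name CollectingValidationHandler and its members are hypothetical stand-ins, not the actual helper used above; the only requirement is a method matching the ValidationEventHandler delegate.

```csharp
using System.Collections.Generic;
using System.Xml.Schema;

// Hypothetical stand-in for SmartValidationHandler:
// it records validation events instead of throwing on the first one.
public sealed class CollectingValidationHandler
{
    private readonly List<ValidationEventArgs> _events = new List<ValidationEventArgs>();

    // Every warning and error reported so far, in document order.
    public IReadOnlyList<ValidationEventArgs> Events
    {
        get { return _events; }
    }

    // True once at least one event with Error severity has been reported.
    public bool HasError { get; private set; }

    // Matches the ValidationEventHandler delegate, so it can be attached with +=.
    public void OnValidationEvent(object sender, ValidationEventArgs e)
    {
        _events.Add(e);
        if (e.Severity == XmlSeverityType.Error)
        {
            HasError = true;
        }
    }
}
```

It would be attached with settings.ValidationEventHandler += handler.OnValidationEvent; and inspected after the load completes.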
Then I get a reader:
XmlReader xvr = XmlReader.Create(filename, settings);
Then I load the document through that reader, which validates it as it is read:
XmlDocument xdoc = new XmlDocument();
xdoc.Load(xvr);
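The validation handler now holds the results. One more thing I check is that the loaded document element actually has a corresponding global element declaration in the XML schema set. If the schema set has nothing for the root element's namespace, the validator typically raises only warnings rather than errors, so the document would otherwise appear valid.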
XmlQualifiedName qn = XmlQualifiedName.Empty;
if (xdoc.DocumentElement != null)
{
if (string.IsNullOrEmpty(xdoc.DocumentElement.NamespaceURI))
{
qn = new XmlQualifiedName(xdoc.DocumentElement.LocalName);
}
else
{
qn = new XmlQualifiedName(xdoc.DocumentElement.LocalName, xdoc.DocumentElement.NamespaceURI);
}
}
return !(svh.HasError || qn.IsEmpty || (!xset.GlobalElements.Contains(qn)));
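Since the question asks for the schema to be picked up from the xsi:noNamespaceSchemaLocation hint rather than from a pre-built XmlSchemaSet, here is a rough sketch that combines the question's ProcessSchemaLocation flag with the load-through-a-validating-reader step above. SinglePassLoader and LoadValidated are hypothetical names. Setting XmlResolver explicitly is an assumption on my part: some runtimes appear not to resolve external schema locations unless a resolver is supplied, and on the classic .NET Framework it should be harmless.

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.Schema;

public static class SinglePassLoader
{
    // Loads filename into an XmlDocument and validates it in the same pass.
    // The schema is located through the xsi:schemaLocation or
    // xsi:noNamespaceSchemaLocation hint inside the document itself.
    public static XmlDocument LoadValidated(string filename, out List<string> problems)
    {
        // A local list is used because lambdas cannot capture out parameters.
        var collected = new List<string>();

        var settings = new XmlReaderSettings
        {
            ValidationType = ValidationType.Schema,
            // Assumption: supply a resolver explicitly so the schema hint can be fetched.
            XmlResolver = new XmlUrlResolver()
        };
        settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessSchemaLocation;
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;

        // Record each event instead of throwing, so loading continues past errors.
        settings.ValidationEventHandler +=
            (sender, e) => collected.Add(e.Severity + ": " + e.Message);

        var doc = new XmlDocument();
        using (XmlReader reader = XmlReader.Create(filename, settings))
        {
            doc.Load(reader); // reading and validation happen together here
        }

        problems = collected;
        return doc;
    }
}
```

A relative hint such as 'books.xsd' is resolved against the location of the XML file, so the two files only need to sit side by side. On a case-sensitive file system the hint has to match the schema's file name exactly. The root-element check from the answer can still be applied afterwards if you also keep a schema set around.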
Source: https://stackoverflow.com/questions/9806346/single-pass-read-and-validate-xml-vs-referenced-xsd-in-c-sharp