Question
Given the following source code:
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

namespace TheXMLGames
{
    class Program
    {
        static void Main(string[] args)
        {
            XmlReaderSettings settings = new XmlReaderSettings {
                Async = false,
                ConformanceLevel = ConformanceLevel.Fragment,
                DtdProcessing = DtdProcessing.Ignore,
                ValidationFlags = XmlSchemaValidationFlags.None,
                ValidationType = ValidationType.None,
                XmlResolver = null,
            };
            string head = File.ReadAllText("sample.xml");
            Stream stringStream = GenerateStreamFromString(head);
            // Variant 1
            //XmlReader reader = XmlReader.Create(stringStream);
            // Variant 2
            //XmlReader reader = XmlReader.Create(stringStream, settings);
            // Variant 3
            XmlTextReader reader = new XmlTextReader(stringStream);
            while (reader.Read())
                if (reader.NodeType != XmlNodeType.Whitespace)
                    Console.WriteLine(reader.Name + ": " + reader.Value);
            // No Variant gets here without an exception,
            // but that's not the point!
            Console.ReadKey();
        }

        public static Stream GenerateStreamFromString(string s)
        {
            MemoryStream stream = new MemoryStream();
            StreamWriter writer = new StreamWriter(stream);
            writer.Write(s);
            writer.Flush();
            stream.Position = 0;
            return stream;
        }
    }
}
sample.xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE TestingFacility >
<TestingFacility id="MACHINE_2015-11-11T11_11_11" version="2015-11-11">
<Program>
<Title>title</Title>
<Steps>16</Steps>
</Program>
<Calibration>
<Current offset="0" gain="111.11" />
<Voltage offset="0" gain="111.11" />
</Calibration>
<Info type="Facilityname" value="MACHINE" />
<Info type="Hardwareversion" value="HW11" />
<Info type="Account" value="DJohn" />
<Info type="Teststart" value="2015-11-11T11:11:11" />
<Info type="Description" value="desc" />
<Info type="Profiler" value="prof" />
<Info type="Target" value="trgt" />
The behaviour is the following:
Variant 1
XmlReader.Create(stream)
An unhandled exception of type 'System.Xml.XmlException' occurred in System.Xml.dll
Additional information: For security reasons DTD is prohibited in this XML document. To enable DTD processing set the DtdProcessing property on XmlReaderSettings to Parse and pass the settings into XmlReader.Create method.
Variant 2
XmlReader.Create(stream, settings)
An unhandled exception of type 'System.Xml.XmlException' occurred in System.Xml.dll
Additional information: Unexpected DTD declaration. Line 2, position 3.
Variant 3
new XmlTextReader(stringStream)
An unhandled exception of type 'System.Xml.XmlException' occurred in System.Xml.dll
Additional information: Unexpected end of file has occurred. The following elements are not closed: TestingFacility. Line 19, position 36.
Variants 1 and 2 throw right after the first line, at the DOCTYPE declaration.
Variant 3 outputs the whole file as expected and only complains (correctly!) when it reaches the end.
The software works because I currently use Variant 3, but the (now) recommended way is to create the reader through the XmlReader.Create factory.
If I fiddle with the settings, the behaviour gets even weirder.
How can I bring the code up to date and use XmlReader.Create?
The full project can be found here: https://drive.google.com/file/d/0B55cC50M31_8T0lub25oS2QxQ00/view
Answer 1:
I don't normally recommend parsing an XML file with a non-XML method, but when the XML is not well-formed, other methods can be a better choice. Since you have a huge XML file and only need one line of data from it, the code below may be the best option.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
using System.IO;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            // Sample input: note that the closing </TestingFacility> tag is missing.
            string xml =
                "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n" +
                "<!DOCTYPE SomeDocTypeIdidntPutThere>\n" +
                "<TestingFacility id=\"MACHINE2_1970-01-01T11_22_33\" version=\"1970-01-01\">\n" +
                "<Program>\n" +
                "<Title>Fancy Title</Title>\n" +
                "<Steps>136</Steps>\n" +
                "</Program>\n" +
                "<Info type=\"Start\" value=\"2070-01-01T11:22:33\" />\n" +
                "<Info type=\"LotMoreOfThem\" value=\"42\" />\n";
            MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(xml));
            StreamReader reader = new StreamReader(stream);
            string inputLine = "";
            string timeStr = "";
            // Scan the text line by line and stop at the first matching Info line.
            while ((inputLine = reader.ReadLine()) != null)
            {
                inputLine = inputLine.Trim();
                if (inputLine.StartsWith("<Info type=\"Start\""))
                {
                    // The named group 'time' captures the content of the value attribute.
                    string pattern = "value=\"(?'time'[^\"]+)";
                    timeStr = Regex.Match(inputLine, pattern).Groups["time"].Value;
                    break;
                }
            }
            DateTime time;
            if (timeStr.Length > 0)
            {
                time = DateTime.Parse(timeStr);
            }
        }
    }
}
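For comparison, here is a rough sketch of the same lookup done with XmlReader.Create instead of a regex. It relies on the reader being forward-only: it only complains about the missing closing tag when it actually reaches the end of the input, so returning at the first matching Info element avoids that exception. The class name StartTimeLookup and the helper FindStartValue are made up for this illustration, and I have not run it against the original project.

```csharp
using System;
using System.IO;
using System.Xml;

public static class StartTimeLookup
{
    // Hypothetical helper: returns the value attribute of the first
    // <Info type="Start" ... /> element, or null if the document ends without one.
    public static string FindStartValue(string xml)
    {
        XmlReaderSettings settings = new XmlReaderSettings
        {
            // Skip the DOCTYPE instead of throwing (the default is Prohibit).
            DtdProcessing = DtdProcessing.Ignore,
            XmlResolver = null
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element &&
                    reader.Name == "Info" &&
                    reader.GetAttribute("type") == "Start")
                {
                    // Stop here; the rest of the input is never parsed.
                    return reader.GetAttribute("value");
                }
            }
        }
        return null;
    }

    public static void Main()
    {
        // Truncated input, as in the answer above: no closing </TestingFacility>.
        string xml =
            "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n" +
            "<!DOCTYPE SomeDocTypeIdidntPutThere>\n" +
            "<TestingFacility id=\"MACHINE2_1970-01-01T11_22_33\" version=\"1970-01-01\">\n" +
            "<Info type=\"Start\" value=\"2070-01-01T11:22:33\" />\n" +
            "<Info type=\"LotMoreOfThem\" value=\"42\" />\n";

        Console.WriteLine(FindStartValue(xml));
    }
}
```

If the truncated input contains no matching Info element, the loop runs to the end of the input and the reader throws an XmlException about the unclosed element, so a caller would still want a try/catch around it.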
Answer 2:
Your XML is invalid. The closing tag is missing, and the DOCTYPE must match the root tag:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE TestingFacility>
<TestingFacility id="MACHINE2_1970-01-01T11_22_33" version="1970-01-01">
<Program>
<Title>Fancy Title</Title>
<Steps>136</Steps>
</Program>
<Info type="Start" value="2070-01-01T11:22:33" />
<Info type="LotMoreOfThem" value="42" />
</TestingFacility>
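Once the document is closed properly, XmlReader.Create should be able to read it as long as the settings allow a DOCTYPE. A minimal sketch, assuming the only changes to the question's Variant 2 are ConformanceLevel.Document (fragments do not permit a DOCTYPE, which matches the "Unexpected DTD declaration" message) and IgnoreWhitespace; the class name ReaderDemo and the helper ReadElementNames are invented for the example:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReaderDemo
{
    // Hypothetical helper: collects the names of all start elements in document order.
    public static List<string> ReadElementNames(string xml)
    {
        XmlReaderSettings settings = new XmlReaderSettings
        {
            // Document (the default) permits a DOCTYPE; Fragment rejects it.
            ConformanceLevel = ConformanceLevel.Document,
            // Skip the DTD rather than throwing (Prohibit) or processing it (Parse).
            DtdProcessing = DtdProcessing.Ignore,
            IgnoreWhitespace = true,
            XmlResolver = null
        };

        List<string> names = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                    names.Add(reader.Name);
            }
        }
        return names;
    }

    public static void Main()
    {
        // Shortened version of the corrected sample, with the root element closed.
        string xml =
            "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n" +
            "<!DOCTYPE TestingFacility>\n" +
            "<TestingFacility id=\"MACHINE2_1970-01-01T11_22_33\" version=\"1970-01-01\">\n" +
            "<Info type=\"Start\" value=\"2070-01-01T11:22:33\" />\n" +
            "</TestingFacility>\n";

        foreach (string name in ReadElementNames(xml))
            Console.WriteLine(name);
    }
}
```

With the same settings, a file that is still missing its closing tag should behave like Variant 3: the nodes are reported one by one and the XmlException only appears when the reader hits the end of the input.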
Source: https://stackoverflow.com/questions/34257204/different-behaviour-between-new-xmltextreader-and-xmlreader-create