Reading from a stream with mixed XML and plain text

Submitted by 吃可爱长大的小学妹 on 2020-01-13 19:06:16

Question


I have a text stream that contains segments of both arbitrary plain text and well-formed XML elements. How can I read it and extract only the XML elements? An XmlReader with ConformanceLevel set to Fragment still throws an exception when it encounters the plain text, which it treats as malformed XML.

Any ideas? Thanks

Here's my code so far:

XmlReaderSettings settings = new XmlReaderSettings();
settings.ConformanceLevel = ConformanceLevel.Fragment;

using (XmlReader reader = XmlReader.Create(stream, settings))
    while (!reader.EOF)
    {
        reader.MoveToContent();
        XmlDocument doc = new XmlDocument();
        doc.Load(reader.ReadSubtree());
        reader.ReadEndElement();
    }

Here is a sample of the stream content, which I have no control over:

Found two objects:
Object a
<object>
    <name>a</name>
    <description></description>
</object>
Object b
<object>
    <name>b</name>
    <description></description>
</object>

Answer 1:


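This is admittedly a hack, but if you wrap the mixed content in a "fake" XML root element, the plain text becomes ordinary text nodes and the whole thing parses as one well-formed document. You can then keep only the children of the root whose node type is Element and skip the text nodes. Note that this only works as long as the plain text itself contains no '<' or '&' characters: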

using System;
using System.Linq;
using System.Xml;

static class Program {

    static void Main(string[] args) {

        string mixed = @"
Found two objects:
Object a
<object>
    <name>a</name>
    <description></description>
</object>
Object b
<object>
    <name>b</name>
    <description></description>
</object>
";
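        // Wrap the mixed content in a fake root so it parses as one well-formed document.
        string xml = "<FOO>" + mixed + "</FOO>";
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        // Keep only the element children of the root; the plain text appears as text nodes.
        var xmlFragments = from XmlNode node in doc.DocumentElement.ChildNodes
                           where node.NodeType == XmlNodeType.Element
                           select node;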
        foreach (var fragment in xmlFragments) {
            Console.WriteLine(fragment.OuterXml);
        }

    }

}
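
Since the question starts from a stream rather than a string literal, the same wrapping trick can also be sketched with an XmlReader instead of an XmlDocument, so that no DOM is built for the whole input. This is only a minimal sketch under a few assumptions: the class name MixedXml and the method ExtractElements are made up for illustration, the whole input is still read into memory once, and the plain text is assumed to contain no '<' or '&' characters.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class MixedXml
{
    // Hypothetical helper (not part of the original answer): yields the outer XML
    // of every element that sits directly under the fake root.
    public static IEnumerable<string> ExtractElements(TextReader input)
    {
        // Same hack as above: wrap everything in a fake root element.
        string wrapped = "<FOO>" + input.ReadToEnd() + "</FOO>";

        using (XmlReader reader = XmlReader.Create(new StringReader(wrapped)))
        {
            reader.MoveToContent(); // positions the reader on <FOO> (depth 0)

            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Depth == 1)
                {
                    // ReadOuterXml returns the element's markup and moves the
                    // reader to the node after its end tag, so no extra Read().
                    yield return reader.ReadOuterXml();
                }
                else
                {
                    // Text nodes, the fake root itself and its end tag are skipped.
                    reader.Read();
                }
            }
        }
    }
}
```

With the stream from the question, this could be called as foreach (string fragment in MixedXml.ExtractElements(new StreamReader(stream))) { ... }, where StreamReader's default UTF-8 decoding is a further assumption about the input.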


Source: https://stackoverflow.com/questions/11555534/reading-from-a-stream-with-mixed-xml-and-plain-text
