Efficient way to read large XML into different node types in C#

Submitted by 浪尽此生 on 2021-02-15 07:53:27

Question


I am new to C#. I have a relatively large XML file (28MB) and am trying to parse its subtrees into several different types based on their content. Essentially, I have 6900+ Content nodes that all have to be interrogated to figure out what type they are.

<Collections>
    <Content>..</Content>
    <Content>..</Content>
    <Content>..</Content>
    ...
</Collections>

For each Content node, the variety of nodes below it can have 1 of 3 different patterns. I have to look into the node to decide which pattern/type of object I am looking at.

So imagine a Content node that has about 100 subnodes in it, and the 14th node (in one case) has a URL in it and indicates it is a "type 1" and should have fields 1, 2, 3,...17, 28, 47 and 58 written to the DB.

Another type has an indicative pair of elements (let's say element 3 and 58) and indicates it is a "type 2" and should have a different set of elements written to the DB.

And so on...

From there, I map the objects into a CMS/DB and connect various bits of data to fields in that other system and write data from the pertinent elements over to the DB.

Since the source file is large, I would love to efficiently pull subtrees out of the larger file, zip up and down them (to decide on their types) and then write the important data (map them) over to the DB.

Do I have to store the values along the way somehow and decide after I have stored them, what sort of object this is?

I am struggling with the forward only approach of XmlReader and the ease of using a DOM based approach.

Thanks for the advice.

===edit==== Thank you commenters. The structure inside of the Content nodes would have 1 of 3 patterns in it. There are about 100 nodes in each type, so I did not bother pasting them in for readability's sake. I did try and clarify above though.


Answer 1:


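With large files you should use XmlReader, since it streams the document instead of loading all of it into memory. I prefer combining XmlReader with LINQ to XML: the reader moves forward to each Content element, and each Content subtree is then loaded as a small XElement that you can navigate freely. Try the following: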

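using System;
using System.Xml;
using System.Xml.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
            // Dispose the reader (and the underlying file handle) when done.
            using (XmlReader reader = XmlReader.Create(FILENAME))
            {
                while (!reader.EOF)
                {
                    // Skip ahead unless the reader is already on a <Content> start tag.
                    if (reader.NodeType != XmlNodeType.Element || reader.Name != "Content")
                    {
                        reader.ReadToFollowing("Content");
                    }
                    if (!reader.EOF)
                    {
                        // Load only this <Content> subtree into memory; afterwards the
                        // reader is positioned just past its closing tag.
                        XElement content = (XElement)XNode.ReadFrom(reader);
                        // Inspect 'content' with LINQ to XML here to decide its type.
                    }
                }
            }
        }
    }
}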
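
Once each Content subtree is available as an XElement, deciding its type no longer needs any forward-only bookkeeping, because you can look at any child in any order. The following is only a rough sketch of that idea, not a definitive implementation. The element names Url, Field3 and Field58 and the ContentKind enum are hypothetical placeholders (the question does not show the real element names), so substitute whatever actually distinguishes your three patterns.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical names for the three patterns described in the question.
enum ContentKind { Type1, Type2, Type3 }

static class ContentClassifier
{
    // Streams the document and yields one small XElement per <Content> subtree,
    // so only a single subtree is held in memory at a time.
    public static IEnumerable<XElement> StreamContent(TextReader source)
    {
        using (XmlReader reader = XmlReader.Create(source))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Content")
                {
                    // ReadFrom consumes the whole subtree and leaves the reader
                    // on the node after </Content>, so no extra Read() is needed.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    // Decides the kind by inspecting children in any order.
    // "Url", "Field3" and "Field58" are assumed element names for illustration.
    public static ContentKind Classify(XElement content)
    {
        // Casting a missing element to string gives null rather than throwing.
        if (!string.IsNullOrWhiteSpace((string)content.Element("Url")))
        {
            return ContentKind.Type1;
        }
        if (content.Element("Field3") != null && content.Element("Field58") != null)
        {
            return ContentKind.Type2;
        }
        return ContentKind.Type3;
    }
}
```

A caller could loop with foreach (XElement c in ContentClassifier.StreamContent(File.OpenText(path))), switch on ContentClassifier.Classify(c), and write the relevant fields for that kind to the DB. Nothing has to be stored ahead of the decision, since the whole subtree is already in hand when you classify it.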


Source: https://stackoverflow.com/questions/40456446/efficient-way-to-read-large-xml-into-dfferent-node-types-in-c-sharp
