Parsing XML for deeply nested data

时光怂恿深爱的人放手 提交于 2020-01-17 08:33:32

问题


I have an XML file that is structured something like this:

<element1>
    <element2>
        <element3>
            <elementIAmInterestedIn attribute="data">
                <element4>
                    <element5>
                        <element6>
                            <otherElementIAmInterestedIn>
                                <data1>text1</data1>
                                <data2>text2</data2>
                                <data3>text3</data3>
                            </otherElementIAmInterestedIn>
                        </element6>
                    </element5>
                </element4>
            </elementIAmInterestedIn>
            <elementIAmInterestedIn attribute="data">
                <element4>
                    <element5>
                        <element6>
                            <otherElementIAmInterestedIn>
                                <data1>text1</data1>
                                <data2>text2</data2>
                                <data3>text3</data3>
                            </otherElementIAmInterestedIn>
                        </element6>
                    </element5>
                </element4>
            </elementIAmInterestedIn>
            <elementIAmInterestedIn attribute="data">
                <element4>
                    <element5>
                        <element6>
                            <otherElementIAmInterestedIn>
                                <data1>text1</data1>
                                <data2>text2</data2>
                                <data3>text3</data3>
                            </otherElementIAmInterestedIn>
                        </element6>
                    </element5>
                </element4>
            </elementIAmInterestedIn>
        </element3>
    </element2>
</element1>

As you can see, I am interested in two elements, the first of which is deeply nested within the root element, and the second of which is deeply nested within that first element. There are multiple (sibling) elementIAmInterestedIn and otherElementIAmInterestedIn elements in the document.

I want to parse this XML file with Java and put the data from all the elementIAmInterestedIn and otherElementIAmInterestedIn elements into either a data structure or Java objects - it doesn't matter much to me as long as it is organized and I can access it later.

I'm able to write a recursive DOM parser method that does a depth-first traversal of the XML so that it touches every element. I also wrote a Java class with JAXB annotations that represents elementIAmInterestedIn. Then, in the recursive method, I can check when I get to an elementIAmInterestedIn and unmarshal it into an instance of the JAXB class. This works fine except that such an object should also contain multiple otherElementIAmInterestedIn.

This is where I'm stuck. How can I get the data out of otherElementIAmInterestedIn and assign it to the JAXB object? I've seen the @XmlWrapper annotation, but this seems to only work for one layer of nesting. Also, I cannot use @XmlPath.

Maybe I should scratch that idea and use a whole new approach. I'm really just getting started with XML parsing so perhaps I'm overlooking a more obvious solution. How would you parse an XML document structured like this and store the data in an organized way?


回答1:


Maybe you should use SAX parser instead of DOM. When you use DOM you are loading all the document in memory and in your case you only want to read 2 fields. This is quite inefficient.

Using sax parser you'll be able to read only those nodes that you are interested in. Here is a pseudocode for your task using a SAX parsing model:

1) Keep reading nodes until you get <elementInterestedIn> node

2) Grab that field in your class

3) Keep on reading until you get <otherElementInterestedIn> node

4) Grab that field too and save the object.

Loop from 1 to 4 until it reachs the end of document.

If you try this aproach, i suggest you first reading this document to understand how SAX parser works, it's very different from DOM aproach: How to Use SAX



来源:https://stackoverflow.com/questions/17056412/parsing-xml-for-deeply-nested-data

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!