Why are there #text nodes in my xml file?

Submitted by 帅比萌擦擦* on 2019-12-10 03:39:35

Question


I'm making an Android application that does DOM parsing on an XML file. I have an XML file that looks like this:

<?xml version="1.0" encoding="utf-8"?>
<family>
    <grandparent>
        <parent1>
            <child1>Foo</child1>
            <child2>Bar</child2>
        </parent1>
        <parent2>
            <child1>Raz</child1>
            <child2>Mataz</child2>
        </parent2>
    </grandparent>  
</family>

If I run a DOM parser on it, like this:

try {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

    Document doc = builder.parse(input);
    doc.getDocumentElement().normalize();   //added in since the edit
    NodeList nodd = doc.getElementsByTagName("grandparent");
    for (int x = 0; x < nodd.getLength(); x++){
        Node node = nodd.item(x);
        NodeList nodes = node.getChildNodes();
        for(int y = 0; y < nodes.getLength(); y++){
            Node n = nodes.item(y);
            System.out.println(n.getNodeName());
        }
    }
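} catch (Exception e) {
    // builder creation and parse() throw checked exceptions, so the try needs a catch
    e.printStackTrace();
}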

My application prints out the following:

07-20 18:24:28.395: INFO/System.out(491): #text

07-20 18:24:28.395: INFO/System.out(491): parent1

07-20 18:24:28.395: INFO/System.out(491): #text

07-20 18:24:28.395: INFO/System.out(491): parent2

07-20 18:24:28.395: INFO/System.out(491): #text

My question is, what are those #text fields and more importantly, how do I get rid of them?

Edit: So now that I know what they are, I tried to normalize it. I have updated the code to reflect the changes, but same result.


Answer 1:


Those #text nodes are the whitespace (newlines, spaces, tabs) between your elements :)
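
As for the "how do I get rid of them" part: one common approach is to leave the whitespace nodes in the tree and skip everything that is not an element while iterating. The sketch below is only an illustration; the class name `ElementChildren` and the method `elementChildNames` are made up for this example and do not come from the original post.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this example only.
class ElementChildren {

    // Collects the names of the element children of every <tagName> element,
    // ignoring #text nodes (and any other non-element nodes such as comments).
    static List<String> elementChildNames(String xml, String tagName) {
        List<String> names = new ArrayList<>();
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList parents = doc.getElementsByTagName(tagName);
            for (int x = 0; x < parents.getLength(); x++) {
                NodeList children = parents.item(x).getChildNodes();
                for (int y = 0; y < children.getLength(); y++) {
                    Node n = children.item(y);
                    if (n.getNodeType() == Node.ELEMENT_NODE) {
                        names.add(n.getNodeName());
                    }
                }
            }
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers need no throws clause.
            throw new IllegalStateException("Could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<family>\n"
                + "    <grandparent>\n"
                + "        <parent1><child1>Foo</child1></parent1>\n"
                + "        <parent2><child1>Raz</child1></parent2>\n"
                + "    </grandparent>\n"
                + "</family>";
        System.out.println(elementChildNames(xml, "grandparent")); // prints [parent1, parent2]
    }
}
```

The node-type check is the only change compared with the loop in the question; the tree itself is left untouched.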




Answer 2:


This is what you get:

1) A node list containing all the grandparent elements

NodeList nodd = doc.getElementsByTagName("grandparent");

2) All the child nodes of grandparent x

NodeList nodes = node.getChildNodes();

which are the child nodes of

<grandparent>
    <parent1>
       ...
    </parent1>

    <parent2>
       ...
    </parent2>
</grandparent>

3) The child y

nodes.item(y);

There can be text between the elements, and that is the #text you are seeing. If you had:

<grandparent>
    yourTextHere1
    <parent1>
       ...
    </parent1>
    yourTextHere2
    <parent2>
       ...
    </parent2>
    yourTextHere3
</grandparent>

You would get:

yourTextHere1 parent1 yourTextHere2 parent2 yourTextHere3
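
A small sketch of this mixed-content case, under the assumption that the sequence above is meant to show the text content rather than the node names: getNodeName() returns "#text" for every text node, while the text itself is available through getNodeValue(). The class and method names (`MixedContent`, `describeChildren`) are hypothetical and exist only for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this example only.
class MixedContent {

    // Describes each child of the first <tagName> element:
    // text nodes by their trimmed value, elements by their name.
    static List<String> describeChildren(String xml, String tagName) {
        List<String> result = new ArrayList<>();
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getElementsByTagName(tagName).item(0).getChildNodes();
            for (int y = 0; y < children.getLength(); y++) {
                Node n = children.item(y);
                if (n.getNodeType() == Node.TEXT_NODE) {
                    // getNodeName() would be "#text" here; the content is the node value.
                    result.add(n.getNodeValue().trim());
                } else if (n.getNodeType() == Node.ELEMENT_NODE) {
                    result.add(n.getNodeName());
                }
            }
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers need no throws clause.
            throw new IllegalStateException("Could not parse XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<grandparent>\n"
                + "    yourTextHere1\n"
                + "    <parent1/>\n"
                + "    yourTextHere2\n"
                + "    <parent2/>\n"
                + "    yourTextHere3\n"
                + "</grandparent>";
        // prints [yourTextHere1, parent1, yourTextHere2, parent2, yourTextHere3]
        System.out.println(describeChildren(xml, "grandparent"));
    }
}
```

With the file from the question, the same text nodes are still there, but their trimmed values are empty because they hold nothing except indentation and line breaks.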

I hope this helps! Julien




Answer 3:


Do this when parsing the document:

Document doc = builder.parse(input); 
doc.getDocumentElement().normalize();

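This merges adjacent text nodes and removes empty ones. Note that it does not remove whitespace-only #text children, which is why the asker saw the same result after adding it; those nodes have to be skipped or removed explicitly.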
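
As the edit to the question reports, the whitespace-only nodes are still present after normalize(). If they really have to be removed from the tree, one option is to walk it and delete them by hand. This is a minimal sketch, assuming that whitespace-only text is never significant in the document; `WhitespaceStripper`, `removeWhitespaceTextNodes`, and `childCount` are names invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this example only.
class WhitespaceStripper {

    // Recursively removes text nodes that contain only whitespace.
    // Assumption: such nodes carry no meaning in this document.
    static void removeWhitespaceTextNodes(Node parent) {
        Node child = parent.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                parent.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                removeWhitespaceTextNodes(child);
            }
            child = next;
        }
    }

    // Parses the XML, calls normalize(), optionally strips whitespace nodes,
    // and returns how many child nodes the first <tagName> element has.
    static int childCount(String xml, String tagName, boolean strip) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            doc.getDocumentElement().normalize();
            if (strip) {
                removeWhitespaceTextNodes(doc.getDocumentElement());
            }
            return doc.getElementsByTagName(tagName).item(0).getChildNodes().getLength();
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers need no throws clause.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<family>\n"
                + "    <grandparent>\n"
                + "        <parent1/>\n"
                + "        <parent2/>\n"
                + "    </grandparent>\n"
                + "</family>";
        System.out.println(childCount(xml, "grandparent", false)); // prints 5
        System.out.println(childCount(xml, "grandparent", true));  // prints 2
    }
}
```

DocumentBuilderFactory.setIgnoringElementContentWhitespace(true) is sometimes suggested for the same purpose, but it relies on the parser being in validating mode with a DTD, which the file in the question does not have.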



Source: https://stackoverflow.com/questions/6766721/why-are-there-text-nodes-in-my-xml-file
