Question
<person>
    <firstname/>
    <lastname/>
    <salary/>
</person>
This is the XML I am parsing. When I try printing the node names of child elements of person, I get
#text firstname
#text lastname
#text salary
How do I eliminate #text being generated?
Update - Here is my code
try {
    NodeList nl = null;
    int l, i = 0;
    File fXmlFile = new File("file.xml");
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    dbFactory.setValidating(false);
    dbFactory.setIgnoringElementContentWhitespace(true);
    dbFactory.setNamespaceAware(true);
    dbFactory.setIgnoringComments(true);
    dbFactory.setCoalescing(true);
    InputStream in;
    in = new FileInputStream(fXmlFile);
    Document doc = dBuilder.parse(in);
    doc.getDocumentElement().normalize();
    Node n = doc.getDocumentElement();
    System.out.println(dbFactory.isIgnoringElementContentWhitespace());
    System.out.println(n);
    if (n != null && n.hasChildNodes()) {
        nl = n.getChildNodes();
        for (i = 0; i < nl.getLength(); i++) {
            System.out.println(nl.item(i).getNodeName());
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}
Answer 1:
setIgnoringElementContentWhitespace only works if you use setValidating(true), and then only if the XML file you are parsing references a DTD that the parser can use to work out which whitespace-only text nodes are actually ignorable. If your document doesn't have a DTD it errs on the safe side and assumes that no text nodes can be ignored, so you'll have to write your own code to ignore them as you traverse the child nodes.
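As a rough sketch of the "write your own code to ignore them" route, one option is to check each child's node type and keep only elements. The class name ElementChildren and the helper elementChildNames are made up for this illustration, and the XML is passed in as a string just to keep the example self-contained. Note also that factory options only apply to builders created after they are set, so the factory is configured before newDocumentBuilder() is called.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical class name, used only for this illustration.
class ElementChildren {

    // Hypothetical helper: returns the names of the root's element children,
    // skipping whitespace-only text nodes and any other non-element nodes.
    public static List<String> elementChildNames(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Configure the factory BEFORE creating the builder; settings changed
        // afterwards do not affect a builder that already exists.
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        List<String> names = new ArrayList<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Whitespace between tags shows up as TEXT_NODE ("#text"); keep elements only.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                names.add(child.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<person>\n<firstname/>\n<lastname/>\n<salary/>\n</person>";
        System.out.println(elementChildNames(xml)); // prints [firstname, lastname, salary]
    }
}
```

Filtering on getNodeType() rather than on the "#text" name also skips comments and processing instructions, which is usually what you want when you only care about child elements.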
Source: https://stackoverflow.com/questions/12817018/getnodename-operation-on-an-xml-node-returns-text