Question
I'm trying to get just the top-level text of a node and none of the child text. I have the following XML:
<job>
text1
<input> text2 </input>
</job>
and I would like to get only the parent's text (text1). So in this example I would call
node.getTextContent();
and get text1, not text1text2, which is what getTextContent is currently giving me. I've read the documentation, and I know it says that getTextContent returns the concatenated text of the node and all of its descendants. But I would like just the text from the parent. Another approach I considered was to isolate the parent from its children and call getTextContent on just the parent, but I don't know how feasible that is.
Any help would be appreciated.
Thanks, -Josh
Answer 1:
Iterate through all the children of the node and concatenate those that are text nodes. Either that or XPath.
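A minimal sketch of the iteration approach, assuming the standard JDK DOM parser (javax.xml.parsers and org.w3c.dom). The class name OwnText and the helpers ownText and parse are hypothetical names chosen for illustration, and trimming the result is an assumption about what the asker wants:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class OwnText {
    // Concatenates only the direct text (and CDATA) children of the node,
    // skipping child elements such as <input>.
    static String ownText(Node node) {
        StringBuilder sb = new StringBuilder();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                sb.append(child.getNodeValue());
            }
        }
        // Trimming is an assumption: it drops the newlines around text1.
        return sb.toString().trim();
    }

    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<job>\ntext1\n<input> text2 </input>\n</job>");
        System.out.println(ownText(doc.getDocumentElement())); // prints "text1"
    }
}
```

Checking the node type rather than just taking the first child also picks up text that appears after a child element.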
Answer 2:
Does getChildNodes() work? If so, you could loop over all the child nodes, call getTextContent() on each element child, and subtract that from your node.getTextContent(). What remains is the text that isn't part of a sub-node.
Best answer: don't mix text with sub-nodes. I had to double-check that the XML you provided is even legal; it is, but it scares me.
Answer 3:
I think you could probably use an XPath expression of job/text(); this might be easier than navigating the DOM model.
If you can, avoid mixed content; it's a bit of a pain to work with.
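A rough sketch of the XPath route, assuming the JDK's built-in javax.xml.xpath API. The class name XPathOwnText and its helper methods are hypothetical. The expression is evaluated as a NODESET because asking XPath 1.0 for a STRING result would return only the first matching text node:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathOwnText {
    // Evaluates an expression such as /job/text() and joins every
    // text node it selects.
    static String ownText(Document doc, String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList texts = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < texts.getLength(); i++) {
            sb.append(texts.item(i).getNodeValue());
        }
        // Trimming is an assumption about the desired result.
        return sb.toString().trim();
    }

    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<job>\ntext1\n<input> text2 </input>\n</job>");
        System.out.println(ownText(doc, "/job/text()")); // prints "text1"
    }
}
```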
Answer 4:
Instead of this
node.getTextContent();
use this:
if (node.getFirstChild() != null)
{
    // In the example above, the first child is the text node holding text1.
    String text = node.getFirstChild().getTextContent();
}
Answer 5:
node.firstChild.textContent.trim();
Answer 6:
If anyone is having problems with this, the best way I found was to get all the child nodes of the node and then check the node type of each child. For every child that is a text node, call getTextContent() on it, and there you go!
Source: https://stackoverflow.com/questions/4695167/how-to-get-only-the-top-level-nodes-text-content-with-gettextcontent