Node.getTextContent(): is there a way to get the text content of the current node only, not its descendants' text?

眼角桃花 2021-02-02 14:53

Node.getTextContent() returns the text content of the current node and its descendants.

Is there a way to get the text content of the current node only, not its descendants' text?

4 answers
  •  耶瑟儿~
    2021-02-02 15:37

    What you want is to filter the children of your node and keep only those whose node type is Node.TEXT_NODE.

    Here is an example of a method that returns the desired content:

    public static String getFirstLevelTextContent(Node node) {
        NodeList list = node.getChildNodes();
        StringBuilder textContent = new StringBuilder();
        for (int i = 0; i < list.getLength(); ++i) {
            Node child = list.item(i);
            if (child.getNodeType() == Node.TEXT_NODE)
                textContent.append(child.getTextContent());
        }
        return textContent.toString();
    }
    

    Applied to your example, it looks like this:

    String str = "<paragraph>" + //
            "<link>XML</link>" + //
            " is a " + //
            "<strong>browser based XML editor</strong>" + //
            "editor allows users to edit XML data in an intuitive word processor." + //
            "</paragraph>";
    Document domDoc = null;
    try {
        DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
        ByteArrayInputStream bis = new ByteArrayInputStream(str.getBytes());
        domDoc = docBuilder.parse(bis);
    } catch (Exception e) {
        e.printStackTrace();
    }
    DocumentTraversal traversal = (DocumentTraversal) domDoc;
    NodeIterator iterator = traversal.createNodeIterator(domDoc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, null, true);
    for (Node n = iterator.nextNode(); n != null; n = iterator.nextNode()) {
        String tagname = ((Element) n).getTagName();
        System.out.println(tagname + "=" + getFirstLevelTextContent(n));
    }
    

    Output:

    paragraph= is a editor allows users to edit XML data in an intuitive word processor.
    link=XML
    strong=browser based XML editor
    

    The method iterates over the direct children of a Node, keeps only the text nodes (thus excluding comments, child elements and so on), and concatenates their text content.

    There is no direct method in Node or Element that returns only the first-level text content.
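
    The filtering idea can also be tried as a self-contained program. The following is only a sketch under a few assumptions: the class name FirstLevelText, the helper parse, and the includeCData flag are hypothetical names introduced for illustration, and the sample XML is made up. The flag is there because a default (non-coalescing) parser reports CDATA sections as Node.CDATA_SECTION_NODE rather than Node.TEXT_NODE, so a TEXT_NODE-only filter skips them.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class FirstLevelText {

    // Concatenates the values of the direct text children of a node.
    // includeCData is a hypothetical flag for this sketch: when true,
    // direct CDATA section children are appended as well.
    public static String firstLevelText(Node node, boolean includeCData) {
        StringBuilder sb = new StringBuilder();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE
                    || (includeCData && type == Node.CDATA_SECTION_NODE)) {
                sb.append(child.getNodeValue());
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: parses an XML string with the default,
    // non-coalescing DocumentBuilderFactory settings.
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample: text, a child element, a CDATA section, a comment.
        Document doc = parse("<p>a<b>bold</b><![CDATA[<raw>]]>c<!-- note --></p>");
        Element root = doc.getDocumentElement();
        System.out.println(firstLevelText(root, false)); // direct text nodes only
        System.out.println(firstLevelText(root, true));  // direct text plus CDATA
        System.out.println(root.getTextContent());       // includes descendant text
    }
}
```

    Whether CDATA content should count as "text of the current node" depends on the use case. An alternative to the flag would be calling setCoalescing(true) on the DocumentBuilderFactory, which makes the parser merge CDATA sections into adjacent text nodes.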
