Jsoup - extracting text

日久生厌 2020-12-18 01:10

I need to extract text from a node like this:

Some text with tags might go here.

Also there are paragraphs

4 Answers
  •  独厮守ぢ
    2020-12-18 01:19

    Assuming you want the text only (no tags), my solution is below.
    Output is:
    Some text with tags might go here. Also there are paragraphs. More text can go without paragraphs

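    import java.util.List;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.nodes.Node;

    // The class name is arbitrary; only the two methods matter.
    public class StripTagsExample {

        public static void main(String[] args) {
            // Sample markup. Only the <div> is implied by select("div") below;
            // <b>, <p> and <br/> are assumed stand-ins for the tags in the question.
            String str =
                    "<div>"
                    + " Some text <b>with tags</b> might go here."
                    + "<p>Also there are paragraphs.</p>"
                    + " More text can go without paragraphs<br/>"
                    + "</div>";

            Document doc = Jsoup.parse(str);
            Element div = doc.select("div").first();
            StringBuilder builder = new StringBuilder();
            stripTags(builder, div.childNodes());
            System.out.println("Text without tags: " + builder.toString());
        }

        /**
         * Strip tags from a List of type Node
         * @param builder StringBuilder : input and output
         * @param nodesList List of type Node
         */
        public static void stripTags(StringBuilder builder, List<Node> nodesList) {
            for (Node node : nodesList) {
                String nodeName = node.nodeName();
                if (nodeName.equalsIgnoreCase("#text")) {
                    // Text node: append its content
                    builder.append(node.toString());
                } else {
                    // Any other node: recurse into its children
                    stripTags(builder, node.childNodes());
                }
            }
        }
    }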

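To see why the recursive walk yields the text in document order, here is a minimal sketch of the same traversal on a hand-built tree. It deliberately avoids Jsoup: `TextTreeSketch`, `SketchNode`, `leaf`, `element` and `sampleText` are hypothetical names invented for illustration, and the sample tree (including the `b` and `p` tag names) is an assumed shape for the node in the question, not parsed output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parsed HTML tree; this is NOT the Jsoup API.
class TextTreeSketch {

    /** A node is either a text leaf (text != null) or a tag with children. */
    static final class SketchNode {
        final String tag;   // null for text leaves
        final String text;  // null for tag nodes
        final List<SketchNode> children = new ArrayList<>();

        private SketchNode(String tag, String text) {
            this.tag = tag;
            this.text = text;
        }

        static SketchNode leaf(String text) {
            return new SketchNode(null, text);
        }

        static SketchNode element(String tag, SketchNode... kids) {
            SketchNode node = new SketchNode(tag, null);
            for (SketchNode kid : kids) {
                node.children.add(kid);
            }
            return node;
        }
    }

    /** Depth-first walk: append text leaves, recurse into every other node. */
    static void stripTags(StringBuilder builder, List<SketchNode> nodes) {
        for (SketchNode node : nodes) {
            if (node.text != null) {
                builder.append(node.text);
            } else {
                stripTags(builder, node.children);
            }
        }
    }

    /** Builds the assumed sample tree by hand and returns its concatenated text. */
    static String sampleText() {
        SketchNode div = SketchNode.element("div",
                SketchNode.leaf("Some text "),
                SketchNode.element("b", SketchNode.leaf("with tags")),
                SketchNode.leaf(" might go here. "),
                SketchNode.element("p", SketchNode.leaf("Also there are paragraphs.")),
                SketchNode.leaf(" More text can go without paragraphs"));
        StringBuilder builder = new StringBuilder();
        stripTags(builder, div.children);
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(sampleText());
    }
}
```

Because each text leaf is appended when the walk reaches it, nested tags contribute their text exactly where they sit in the markup. If the goal is only the combined text, Jsoup's own `Element.text()` returns the normalized text of an element and its descendants, which may make a hand-written walk unnecessary.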