How to get the text from this HTML tag using jsoup?

醉酒成梦 2020-12-19 09:32

I ran into a problem while using jsoup to extract data. The data looks like this:

This is a strong number 2013

        
3 answers
  • 2020-12-19 09:44
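    // Note: Html and Spanned come from android.text, so this approach only works on Android.
    Document doc = Jsoup.parse("This is a <strong>strong</strong> number <date>2013</date>");
    
    // Html.fromHtml converts the markup into styled text; toString() then yields the plain characters.
    Spanned htmlDoc = Html.fromHtml(doc.toString());
    String fromHtml = htmlDoc.toString();
    
    System.out.println(fromHtml);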
    
  • 2020-12-19 09:56

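    You can parse the HTML into a Document, select the body element, and get its text. ownText() returns only the text that sits directly inside the element, while text() also includes the text of its child elements.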

    Example:

    Document doc = Jsoup.parse("This is a <strong>strong</strong> number <date>2013</date>");
    
    String ownText = doc.body().ownText();
    String text = doc.body().text();
    
    System.out.println(ownText);
    System.out.println(text);
    

    Output:

    This is a number  
    This is a strong number 2013
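
If only the text of one particular tag is needed, a CSS selector can narrow things down before calling text(). The following is a minimal sketch rather than part of the answer above: the class name TagTextExample and the helper textOf are made up for illustration, and it assumes jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class TagTextExample {

    // Hypothetical helper: returns the text of the first element matching
    // the CSS query, or an empty string when nothing matches.
    public static String textOf(Document doc, String cssQuery) {
        Element first = doc.select(cssQuery).first(); // null if no match
        return first == null ? "" : first.text();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(
                "This is a <strong>strong</strong> number <date>2013</date>");

        System.out.println(textOf(doc, "strong")); // strong
        System.out.println(textOf(doc, "date"));   // 2013
        System.out.println(textOf(doc, "em"));     // empty line, no <em> present
    }
}
```

Returning an empty string for a missing tag is just one possible choice here; callers that need to tell "missing" apart from "empty" could check the result of select(...).first() for null instead.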
    
  • 2020-12-19 09:57

    This should answer your question:

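    public String escapeHtml(String source) {
        Document doc = Jsoup.parseBodyFragment(source);
        Elements elements = doc.select("b");
        for (Element element : elements) {
            // Swap each <b> element for a text node holding its outer HTML,
            // so the markup is escaped on output instead of being removed.
            element.replaceWith(new TextNode(element.toString(),""));
        }
        // Note: in newer jsoup releases (1.14.2 and later) Whitelist has been renamed to Safelist.
        return Jsoup.clean(doc.body().toString(), new Whitelist().addTags("a").addAttributes("a", "href", "name", "rel", "target"));
    }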
    

    Jsoup - Howto clean html by escaping not deleting the unwanted html?
