Removing text enclosed between HTML tags using JSoup

允我心安 submitted on 2019-11-30 15:31:08

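The Cleaner always drops tags that are not on the whitelist but preserves their text. If you need to drop whole elements (i.e. the tags together with their text and nested elements), you can pre-parse the HTML, remove the content using either remove() (which deletes the matched elements themselves) or empty() (which deletes only their children), then run the result through the cleaner.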

For example:

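import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Whitelist; // named Safelist in newer jsoup releases

String html = "Clean <div>Text dropped</div>";
Document doc = Jsoup.parse(html);
// Remove each <div> together with its contents before cleaning.
// If not removed, the cleaner will drop the <div> tag but leave the inner text.
doc.select("div").remove();
String clean = Jsoup.clean(doc.body().html(), Whitelist.basic());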
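
The difference between remove() and empty() mentioned above can be sketched as follows. This is only a minimal illustration, assuming jsoup is on the classpath; the class name RemoveVsEmpty and the helper methods withRemove and withEmpty are hypothetical names made up for this example, not part of jsoup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class RemoveVsEmpty {
    // Hypothetical helper: remove() detaches the matched elements themselves,
    // so both the <div> tags and everything inside them disappear.
    static String withRemove(String html) {
        Document doc = Jsoup.parse(html);
        doc.select("div").remove();
        return doc.body().html();
    }

    // Hypothetical helper: empty() keeps the matched elements in the tree
    // but deletes all of their child nodes (text and nested elements).
    static String withEmpty(String html) {
        Document doc = Jsoup.parse(html);
        doc.select("div").empty();
        return doc.body().html();
    }

    public static void main(String[] args) {
        String html = "Clean <div>Text dropped</div>";
        System.out.println(withRemove(html)); // no <div> left in the body
        System.out.println(withEmpty(html));  // an empty <div> remains
    }
}
```

Either variant gets rid of the inner text before cleaning. With empty(), the now-empty <div> is still in the markup, and the cleaner will then drop that tag if it is not on the whitelist.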
NomanJaved
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

String html = "<!DOCTYPE html><html><head><title></title></head><body><p>hello there</p></body></html>";
Document d = Jsoup.parse(html);
System.out.println(d); // document before removal
System.out.println("************************************************");
d.getElementsByTag("p").remove(); // remove every <p> element together with its text
System.out.println(d); // document after removal

If working with a separate Elements collection gives you trouble, you can perform the removal directly on the Document object d, as above. That works reliably.
