Question
I made a small test (with Jsoup 1.6.1):
String s = "" + Jsoup.parse("<td></td>").select("td").size();
System.out.println("Selected elements count : " + s);
It outputs:
Selected elements count : 0
But it should return 1, because I parsed HTML that contains a td element. What is wrong with my code, or is there a bug in Jsoup?
Answer 1:
Because Jsoup is an HTML5-compliant parser and you fed it invalid HTML. A <td> has to go inside at least a <table>. Outside a table, the parser ignores the stray <td> tag, so there is nothing left to select.
int size = Jsoup.parse("<table><td></td></table>").select("td").size();
System.out.println("Selected elements count : " + size);
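To see what the parser does with each input, here is a minimal sketch (the class name TdParseDemo and the helper count are made up for illustration, and it assumes Jsoup is on the classpath). On a version that follows the HTML5 tree-building rules, I would expect the bare <td> to give 0 and the table version to give 1, with the parser adding the implied <tbody> and <tr> around the cell.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class TdParseDemo {
    // Hypothetical helper: parse with the default HTML parser and
    // count the elements that match the given CSS selector.
    static int count(String html, String selector) {
        Document doc = Jsoup.parse(html);
        return doc.select(selector).size();
    }

    public static void main(String[] args) {
        // A <td> outside a table is dropped by the HTML tree builder.
        System.out.println("bare td: " + count("<td></td>", "td"));

        // Inside a <table> the cell survives.
        System.out.println("td in table: " + count("<table><td></td></table>", "td"));

        // The parser also inserts the implied <tbody> and <tr>,
        // so a strict child selector should match as well.
        System.out.println("table > tbody > tr > td: "
                + count("<table><td></td></table>", "table > tbody > tr > td"));

        // Print the normalized markup to inspect what was actually built.
        System.out.println(Jsoup.parse("<table><td></td></table>").body().html());
    }
}
```

Printing body().html() is a quick way to check how any fragment was normalized before writing selectors against it.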
Answer 2:
String url = "http://foobar.com";
Document doc = Jsoup.connect(url).get();
Elements td = doc.select("td");
Answer 3:
Jsoup 1.6.2 lets you choose a different parser, and a simple XML parser is provided. With the following code I could solve my problem. You can later parse your fragment with the HTML parser to get valid HTML.
// Jsoup 1.6.2
String s = "" + Jsoup.parse("<td></td>", "", Parser.xmlParser()).select("td").size();
System.out.println("Selected elements count : " + s);
Source: https://stackoverflow.com/questions/7985791/why-jsoup-cannot-select-td-element