I'm using jsoup to parse an RSS feed in Java. I'm having trouble getting a result when I select the first element in the document.
Jsoup includes an XML parser for this case. Pass Parser.xmlParser() to Jsoup.parse:
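import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class RssTitleAndLink {
    public static void main(String[] args) {
        // Well-formed feed: <link> and <title> are nested inside <rss><channel>.
        String xml = "<rss><channel><link>http://www.the.blog/category</link>"
                + "<title>The Blog Title</title></channel></rss>";
        // Parser.xmlParser() parses the input as XML instead of HTML.
        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
        Element title = doc.select("title").first();
        System.out.println(title.text()); // The Blog Title
        Element link = doc.select("link").first();
        System.out.println(link.text()); // http://www.the.blog/category
    }
}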
Your RSS feed is XML, not HTML. For this to work, you must tell jsoup to use its XML parser (Parser.xmlParser()). This will work:
// Needs Jsoup, Document, Element and Parser from the org.jsoup packages.
String rss = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<rss><channel>"
        + "<title>The Blog Title</title>"
        + "<link>http://www.the.blog/category</link>"
        + "</channel></rss>";
Document doc = Jsoup.parse(rss, "", Parser.xmlParser());
Element link = doc.select("rss channel link").first();
System.out.println(link.text()); // prints http://www.the.blog/category
Explanation:
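In HTML, <link> is a void element: it has no content and no closing tag. Jsoup's default HTML parser treats the <link> in your RSS as that HTML tag, so it closes the element immediately and the URL ends up outside it. That is why link.text() returns an empty string. The XML parser makes no such assumption and keeps the URL as the element's text.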
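To see the difference side by side, here is a minimal sketch that parses the same feed with both parsers. The class name LinkParserComparison and the helper linkText are made up for this illustration and are not part of jsoup; the only jsoup calls used are Jsoup.parse, Parser.xmlParser(), select, first and text.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

// Hypothetical demo class, not part of jsoup.
public class LinkParserComparison {

    // Returns the text of the first <link> element, or "" if none is found.
    static String linkText(String feed, boolean useXmlParser) {
        Document doc = useXmlParser
                ? Jsoup.parse(feed, "", Parser.xmlParser()) // XML parser
                : Jsoup.parse(feed);                        // default HTML parser
        Element link = doc.select("link").first();
        return link == null ? "" : link.text();
    }

    public static void main(String[] args) {
        String rss = "<rss><channel>"
                + "<title>The Blog Title</title>"
                + "<link>http://www.the.blog/category</link>"
                + "</channel></rss>";
        // Brackets make an empty result visible in the output.
        System.out.println("HTML parser: [" + linkText(rss, false) + "]");
        System.out.println("XML parser:  [" + linkText(rss, true) + "]");
    }
}
```

With the HTML parser the brackets should come out empty, while the XML parser should print the URL between them. The null check in linkText is there because first() returns null when the selector matches nothing, which is worth guarding against with real feeds.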