Jsoup .select returns empty value but element does contain text

拥有回忆 submitted on 2020-01-05 07:44:08

Question


I'm trying to get the text of the <link> element in this XML feed: http://www.istana.gov.sg/latestupdate/rss.xml

I have written the following code to get the first article.

        URL = getResources().getString(R.string.istana_home_page_rss_xml);
        // URL = "http://www.istana.gov.sg/latestupdate/rss.xml";

        try {
            doc = Jsoup.connect(URL).ignoreContentType(true).get();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        // retrieve the link of the article
        links = doc.select("link");

        // retrieve the publish date of the article
        dates = doc.select("pubDate");

        //retrieve the title of the article
        titles = doc.select("title");

        String[] article1 = new String[3];
        article1[0] = links.get(1).text();
        article1[1] = titles.get(1).text();
        article1[2] = dates.get(0).text();

The title and date of the article come out fine, but the link is returned as an empty string (in fact, every <link> element returns ""). Each <link> tag contains a URL as its text. Does anyone know why it returns ""?


Answer 1:


It looks like the default HTML parser does not treat <link> as a tag that can hold content. In HTML, <link> is a void element, so the parser closes it immediately as <link /> and the element ends up empty; the URL text that follows is left outside it.

To solve this, use the XML parser instead of the HTML parser. It applies no HTML-specific rules to tag names, so <link> keeps its text.

doc = Jsoup.connect(URL)
      .ignoreContentType(true)
      .parser(Parser.xmlParser()) // <-- add this
      .get();
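
The difference between the two parsers can be reproduced offline. The sketch below is only an illustration: the class name RssLinkDemo, the helper methods, and the sample feed with its example.com URLs are made up for the demo and are not taken from the real rss.xml. It parses the same string once with each parser and reads the second <link>, mirroring links.get(1) in the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

public class RssLinkDemo {
    // Hypothetical miniature feed (assumption): one channel link, one item link.
    static final String SAMPLE_RSS =
        "<rss version=\"2.0\"><channel>"
        + "<title>Sample feed</title>"
        + "<link>http://example.com/</link>"
        + "<item>"
        + "<title>First article</title>"
        + "<link>http://example.com/first</link>"
        + "<pubDate>Mon, 29 Dec 2014 00:00:00 GMT</pubDate>"
        + "</item>"
        + "</channel></rss>";

    // Parses the feed with the given parser and returns the text of the
    // second <link> (index 0 is the channel link, index 1 the item link).
    static String secondLinkText(Parser parser) {
        Document doc = Jsoup.parse(SAMPLE_RSS, "", parser);
        return doc.select("link").get(1).text();
    }

    // HTML parser: <link> is handled as a void element, so it has no text.
    static String withHtmlParser() {
        return secondLinkText(Parser.htmlParser());
    }

    // XML parser: <link> is an ordinary element and keeps its text.
    static String withXmlParser() {
        return secondLinkText(Parser.xmlParser());
    }

    public static void main(String[] args) {
        System.out.println("HTML parser: \"" + withHtmlParser() + "\"");
        System.out.println("XML parser:  \"" + withXmlParser() + "\"");
    }
}
```

With the HTML parser the link text should come back empty, while the XML parser should return http://example.com/first. The same Parser.xmlParser() argument is what the Jsoup.connect(...) call above passes through .parser(...).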


Source: https://stackoverflow.com/questions/27708009/jsoup-select-returns-empty-value-but-element-does-contains-text
