RSS Reader isn't getting some tags

佐手、 submitted on 2019-12-12 04:37:22

Question


This is a follow-up to my previous question: RSS Reader NullPointerException

In the list in my app, some items are missing the RSS title, and others are missing the description (and the image). The strangest part is that this doesn't happen with every feed. For example, if I parse the link from the original tutorial (http://www.mobilenations.com/rss/mb.xml), everything works fine. But when I use another link, I get the problem described above.

This is my DOMParser class:

package com.td.rssreader.parser;

import java.net.MalformedURLException;
import java.net.URL;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.jsoup.Jsoup;
import org.jsoup.select.Elements;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DOMParser {

    private RSSFeed _feed = new RSSFeed();

    public RSSFeed parseXml(String xml) {

        // _feed.clearList();

        URL url = null;
        try {
            url = new URL(xml);
        } catch (MalformedURLException e1) {
            e1.printStackTrace();
        }

        try {
            // Create required instances
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();

            // Parse the xml
            Document doc = db.parse(new InputSource(url.openStream()));
            doc.getDocumentElement().normalize();

            // Get all <item> tags.
            NodeList nl = doc.getElementsByTagName("item");
            int length = nl.getLength();

            for (int i = 0; i < length; i++) {
                Node currentNode = nl.item(i);
                RSSItem _item = new RSSItem();

                NodeList nchild = currentNode.getChildNodes();
                int clength = nchild.getLength();

                // Get the required elements from each Item
                for (int j = 1; j < clength; j = j + 2) {

                    Node thisNode = nchild.item(j);
                    String theString = null;

                      if (thisNode != null && thisNode.getFirstChild() != null) {
                            theString = thisNode.getFirstChild().getNodeValue();
                        }


                    if (theString != null) {

                        String nodeName = thisNode.getNodeName();

                        if ("title".equals(nodeName)) {
                            // Node name is equals to 'title' so set the Node
                            // value to the Title in the RSSItem.
                            _item.setTitle(theString);
                        }

                        else if ("description".equals(nodeName)) {
                            _item.setDescription(theString);

                            // Parse the html description to get the image url
                            String html = theString;
                            org.jsoup.nodes.Document docHtml = Jsoup
                                    .parse(html);
                            Elements imgEle = docHtml.select("img");
                            _item.setImage(imgEle.attr("src"));
                        }
//description
                        else if ("pubDate".equals(nodeName)) {

                            // We replace the plus and zero's in the date with
                            // empty string
                            String formatedDate = theString.replace(" +0000",
                                    "");
                            _item.setDate(formatedDate);
                        }


                        if ("link".equals(nodeName)) {
                            // Node name equals 'link', so set the Node
                            // value as the link in the RSSItem.
                            _item.setLink(theString);
                        }
                    }
                }

                // add item to the list
                _feed.addItem(_item);
            }

        } catch (Exception e) {
            e.printStackTrace();
        }

        // Return the final feed once all the Items are added to the RSSFeed
        // Object(_feed).
        return _feed;
    }

}

Answer 1:


While looping through the item nodes, you have a second loop that iterates over each item's child nodes (to set the title, description, etc.).

But your loop starts at index 1 and increases by 2:

// Get the required elements from each Item
            for (int j = 1; j < clength; j = j + 2) {

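This means it only checks positions 1, 3, 5, etc. Those odd positions line up with the child elements only when a whitespace text node sits between every pair of elements, which not every feed guarantees.

Looking at the XML posted, that explains why you get different data for each item. Start the loop at index 0 and increase it by just 1.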
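
As a rough illustration of the fix, the sketch below visits every child of the first <item> and skips anything that is not an element, so the result no longer depends on how the feed is indented. The class name ItemChildScan, the helper readFirstItem, and the two sample XML strings are made up for this example and are not part of the original DOMParser; it also reads the text with getTextContent() rather than getFirstChild().getNodeValue(), which is an assumption about what you want to extract.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original DOMParser.
class ItemChildScan {

    // Returns tag name -> text for every element child of the first <item>.
    static Map<String, String> readFirstItem(String xml) {
        Map<String, String> fields = new LinkedHashMap<>();
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(new InputSource(new StringReader(xml)));

            NodeList items = doc.getElementsByTagName("item");
            if (items.getLength() == 0) {
                return fields;
            }

            NodeList children = items.item(0).getChildNodes();
            // Start at 0 and step by 1 so no child position is skipped.
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                // Whitespace between tags shows up as text nodes; ignore those.
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue;
                }
                fields.put(child.getNodeName(), child.getTextContent());
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse feed", e);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Same item, once indented (elements at indexes 1 and 3)
        // and once compact (elements at indexes 0 and 1).
        String pretty = "<rss><channel><item>\n  <title>A</title>\n"
                + "  <link>http://example.com/a</link>\n</item></channel></rss>";
        String compact = "<rss><channel><item><title>A</title>"
                + "<link>http://example.com/a</link></item></channel></rss>";

        System.out.println(readFirstItem(pretty));
        System.out.println(readFirstItem(compact));
    }
}
```

Both calls should yield the same title and link. With the original j = 1, j = j + 2 loop, the compact form would only ever reach the link element at index 1 and miss the title at index 0.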



Source: https://stackoverflow.com/questions/15867104/rss-reader-isnt-getting-some-tags
