Is there any way to find out whether a given URL is an RSS feed or an Atom feed using Java?

Posted by  ̄綄美尐妖づ on 2019-12-24 23:01:37

Question


I am writing an RSS parser. Is there any way to find out whether a given URL points to an RSS feed or an Atom feed using Java?


Answer 1:


You could use ROME (I suggest trying that first) to parse both RSS and Atom feeds. Alternatively, you'll have to use a SAX parser or build a DOM tree and check the following:

For RSS:
In RSS, check that the root element is rss and that it has a channel child element. The channel may contain any number of item elements, including none.

Example:

<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
    <title>RSS Title</title>
    <description>This is an example of an RSS feed</description>
    <link>http://www.someexamplerssdomain.com/main.html</link>
    <lastBuildDate>Mon, 06 Sep 2010 00:01:00 +0000</lastBuildDate>
    <pubDate>Sun, 06 Sep 2009 16:45:00 +0000</pubDate>

    <item>
        <title>Example entry</title>
        <description>Here is some text containing an interesting description of the thing to be described.</description>
        <link>http://www.wikipedia.org/</link>
        <guid>unique string per item</guid>
        <pubDate>Sun, 06 Sep 2009 16:45:00 +0000</pubDate>
    </item>

</channel>
</rss>
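
The RSS check described above could be sketched with the JDK's built-in DOM parser roughly as follows. This is only an illustration, not a full validator: the class name RssStructureCheck and the method looksLikeRss are made up for this example, and the sketch assumes that unparseable input should simply count as "not RSS". It also rejects documents with a DOCTYPE declaration as a precaution against XXE, which means some very old feeds that carry a DOCTYPE would be reported as not RSS.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration; the names are not from any library.
final class RssStructureCheck {

    /** Returns true if the root element is rss and it has a channel child element. */
    static boolean looksLikeRss(InputSource source) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Feeds come from untrusted servers, so refuse DOCTYPE declarations.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

            Element root = factory.newDocumentBuilder().parse(source).getDocumentElement();
            if (!"rss".equals(root.getLocalName())) {
                return false;
            }
            // Look for a direct channel child; text and comment nodes are skipped.
            for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE && "channel".equals(n.getLocalName())) {
                    return true;
                }
            }
            return false;
        } catch (SAXException | IOException e) {
            // Assumption of this sketch: malformed or unreadable input is "not RSS".
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("XML parser could not be configured", e);
        }
    }

    /** Convenience overload for XML that is already in memory. */
    static boolean looksLikeRss(String xml) {
        return looksLikeRss(new InputSource(new StringReader(xml)));
    }
}
```

For a remote feed, something like looksLikeRss(new InputSource(url.openStream())) should work, where url is a java.net.URL you supply; passing the raw stream lets the parser pick up the encoding from the XML declaration.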

For Atom:
In Atom, check that the root element is feed (for Atom 1.0, in the http://www.w3.org/2005/Atom namespace). A feed may contain zero or more entry elements.

Example:

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

    <title>Example Feed</title>
    <subtitle>A subtitle.</subtitle>
    <link href="http://example.org/feed/" rel="self" />
    <link href="http://example.org/" />
    <id>urn:uuid:60a76c80-d399-11d9-b91C-0003939e0af6</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <author>
        <name>John Doe</name>
        <email>johndoe@example.com</email>
    </author>

    <entry>
        <title>Atom-Powered Robots Run Amok</title>
        <link href="http://example.org/2003/12/13/atom03" />
        <link rel="alternate" type="text/html" href="http://example.org/2003/12/13/atom03.html"/>
        <link rel="edit" href="http://example.org/2003/12/13/atom03/edit"/>
        <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
        <updated>2003-12-13T18:30:02Z</updated>
        <summary>Some text.</summary>
    </entry>

</feed>
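
Since both formats can be told apart by the root element alone, a streaming parser is enough to classify a document without building a whole tree. Below is a minimal sketch using StAX from the JDK (swapped in here as a pull-style alternative to SAX). The class FeedSniffer, the enum FeedType and the method sniff are names invented for this example, and treating anything unparseable as UNKNOWN is an assumption of the sketch. It only recognises an rss root and an Atom 1.0 feed root; RSS 1.0, whose root element is rdf:RDF, would come back as UNKNOWN.

```java
import java.io.InputStream;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; the names are not from any library.
final class FeedSniffer {

    enum FeedType { RSS, ATOM, UNKNOWN }

    private static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    private static XMLInputFactory newFactory() {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Do not process DTDs or external entities from untrusted feeds.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        return factory;
    }

    /** Reads events up to the first start tag (the root element) and classifies it. */
    private static FeedType classify(XMLStreamReader reader) throws XMLStreamException {
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("rss".equals(name)) {
                        return FeedType.RSS;
                    }
                    if ("feed".equals(name) && ATOM_NS.equals(reader.getNamespaceURI())) {
                        return FeedType.ATOM;
                    }
                    return FeedType.UNKNOWN;
                }
            }
            return FeedType.UNKNOWN;
        } finally {
            reader.close();
        }
    }

    /** Classifies a feed read from a stream, e.g. the response body of the URL. */
    static FeedType sniff(InputStream in) {
        try {
            return classify(newFactory().createXMLStreamReader(in));
        } catch (XMLStreamException e) {
            return FeedType.UNKNOWN; // assumption: malformed XML is "not a feed"
        }
    }

    /** Convenience overload for XML that is already in memory. */
    static FeedType sniff(String xml) {
        try {
            return classify(newFactory().createXMLStreamReader(new StringReader(xml)));
        } catch (XMLStreamException e) {
            return FeedType.UNKNOWN;
        }
    }
}
```

Because the reader stops at the root element, only the first few bytes of the response need to be parsed, which keeps the check cheap for large feeds. If you also need the channel check for RSS, combine this with a DOM or SAX pass over the rest of the document.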

PS: I don't know which RSS or Atom version you want to support, but follow the guidelines of the corresponding specification.

  • RSS
  • Atom
  • RSS 2.0 and Atom 1.0 compared


Source: https://stackoverflow.com/questions/3961182/is-there-any-way-to-find-if-the-given-urls-is-an-rss-feed-or-atom-using-java
