feedparser

Feedparser - retrieve old messages from Google Reader

前提是你 submitted on 2019-11-30 06:38:35
Question: I'm using the feedparser library in Python to retrieve news from a local newspaper (my intent is to do Natural Language Processing over this corpus), and I would like to be able to retrieve many past entries from the RSS feed. I'm not very familiar with the technical details of RSS, but I think this should be possible (I can see that, e.g., Google Reader and Feedly can do this "on demand" as I move the scrollbar). When I do the following: import feedparser url = 'http://feeds.folha.uol.com.br

Feedparser.parse() 'SSL: CERTIFICATE_VERIFY_FAILED'

瘦欲@ submitted on 2019-11-29 10:36:00
I'm having an SSL issue with feedparser when parsing an HTTPS RSS feed. I don't really know what to do, as I can't find any documentation on this error when it comes to feedparser: >>> import feedparser >>> feed = feedparser.parse(rss) >>> feed {'feed': {}, 'bozo': 1, 'bozo_exception': URLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)'),), 'entries': []} >>> feed["items"] [] Thank you, cmidi, for the answer, which was to "monkey patch" using ssl._create_default_https_context = ssl._create_unverified_context: import feedparser import ssl if hasattr(ssl
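The monkey patch quoted above is cut off in this excerpt, so here is a minimal sketch of the same idea rather than the original answer's exact code. The helper name `allow_unverified_https` is hypothetical. It assumes the installed feedparser fetches URLs through urllib, so that replacing the default HTTPS context affects it. Note that this disables certificate verification for the whole process, which is only reasonable for feeds you already trust.

```python
import ssl


def allow_unverified_https():
    """Make urllib-based HTTPS requests skip certificate verification.

    Hypothetical helper (not part of feedparser). Returns True if the
    patch was applied, False on Python builds that lack
    ssl._create_unverified_context.
    """
    # Both names are private members of the ssl module, so this is a
    # process-wide workaround rather than a supported API.
    if hasattr(ssl, "_create_unverified_context"):
        ssl._create_default_https_context = ssl._create_unverified_context
        return True
    return False


if __name__ == "__main__":
    patched = allow_unverified_https()
    print("verification disabled:", patched)
    # After patching, a call such as feedparser.parse(rss) should no longer
    # fail on certificate checks, assuming feedparser downloads via urllib.
```

Call the helper once, before the first `feedparser.parse(...)` on an HTTPS URL. A safer alternative, where possible, is to fix the certificate store on the machine instead of turning verification off.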

Feedparser - retrieve old messages from Google Reader

眉间皱痕 submitted on 2019-11-28 20:56:01
I'm using the feedparser library in Python to retrieve news from a local newspaper (my intent is to do Natural Language Processing over this corpus), and I would like to be able to retrieve many past entries from the RSS feed. I'm not very familiar with the technical details of RSS, but I think this should be possible (I can see that, e.g., Google Reader and Feedly can do this "on demand" as I move the scrollbar). When I do the following: import feedparser url = 'http://feeds.folha.uol.com.br/folha/emcimadahora/rss091.xml' feed = feedparser.parse(url) for post in feed.entries: title = post

IncompleteRead using httplib

↘锁芯ラ submitted on 2019-11-26 14:00:28
Question: I have been having a persistent problem getting an RSS feed from a particular website. I wound up writing a rather ugly procedure to perform this function, but I am curious why this happens and whether any higher-level interfaces handle this problem properly. This problem isn't really a showstopper, since I don't need to retrieve the feed very often. I have read a solution that traps the exception and returns the partial content, yet since the incomplete reads differ in the number of bytes
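The "trap the exception and return the partial content" approach mentioned in the question can be sketched roughly as follows. This is an illustration under assumptions, not the asker's procedure: it uses Python 3, where `httplib` is named `http.client`, and the helper name `read_tolerating_incomplete` is hypothetical. `IncompleteRead` carries the bytes received so far in its `partial` attribute.

```python
import http.client


def read_tolerating_incomplete(response):
    """Read a response body, keeping whatever arrived on IncompleteRead.

    Hypothetical helper. Returns a (body, complete) tuple, where complete
    is False when the server closed the connection early.
    """
    try:
        return response.read(), True
    except http.client.IncompleteRead as exc:
        # exc.partial holds the bytes that were received before the
        # connection ended; the caller decides whether they are usable.
        return exc.partial, False


if __name__ == "__main__":
    class TruncatedResponse:
        """Stand-in for a response whose read() is cut short."""

        def read(self):
            raise http.client.IncompleteRead(b"<rss><channel>", 1024)

    body, complete = read_tolerating_incomplete(TruncatedResponse())
    print(len(body), complete)
```

Because the returned body may be cut off mid-document, a caller would still need to check whether the partial XML parses before relying on it.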

Parse RSS with jQuery

妖精的绣舞 submitted on 2019-11-26 01:35:17
Question: I want to use jQuery to parse RSS feeds. Can this be done with the base jQuery library out of the box, or will I need to use a plugin? Answer 1: WARNING: The Google Feed API is officially deprecated and no longer works! No need for a whole plugin. This will return your RSS as a JSON object to a callback function: function parseRSS(url, callback) { $.ajax({ url: document.location.protocol + '//ajax.googleapis.com/ajax/services/feed/load?v=1.0&num=10&callback=?&q=' + encodeURIComponent(url),