Feedparser - retrieve old messages from Google Reader


梦如初夏 2020-12-14 14:06

I'm using the feedparser library in Python to retrieve news from a local newspaper (my intent is to do Natural Language Processing over this corpus) and would like to be ab

2 answers

遥遥无期 (original poster)

2020-12-14 14:16

To expand on Bartek's answer: You could also start storing all of the entries in the feed that you've already seen, and build up your own historical archive of the feed's content. This would delay your ability to start using it as a corpus (because you'd have to do this for a month to build up a collection of a month's worth of entries), but you wouldn't be dependent on anyone else for the data.

I may be mistaken, but I'm pretty sure that's how Google Reader can go back in time: They have each feed's past entries stored somewhere.
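
A minimal sketch of the "build your own archive" idea, assuming a local SQLite file is an acceptable store. The names open_archive, archive_entries, poll_feed and the table layout are made up for illustration; only feedparser.parse and its entries list come from the library. Each entry is keyed on its id (falling back to its link), so polling the same feed repeatedly only adds entries that have not been seen before.

```python
import sqlite3


def open_archive(path="feed_archive.db"):
    """Open (or create) the local archive. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries ("
        "entry_id TEXT PRIMARY KEY, title TEXT, link TEXT, "
        "published TEXT, summary TEXT)"
    )
    return conn


def archive_entries(conn, entries):
    """Store entries not seen before; return how many were newly added."""
    before = conn.total_changes
    for entry in entries:
        # Use the entry's id when the feed provides one, else its link.
        entry_id = entry.get("id") or entry.get("link")
        if not entry_id:
            continue  # nothing stable to deduplicate on
        conn.execute(
            "INSERT OR IGNORE INTO entries VALUES (?, ?, ?, ?, ?)",
            (
                entry_id,
                entry.get("title"),
                entry.get("link"),
                entry.get("published"),
                entry.get("summary"),
            ),
        )
    conn.commit()
    return conn.total_changes - before


def poll_feed(conn, url):
    """Fetch the feed once and archive whatever is new."""
    # Third-party dependency, imported here so the helpers above
    # can be used (and tested) without it installed.
    import feedparser

    return archive_entries(conn, feedparser.parse(url).entries)
```

Running poll_feed on a schedule (cron, for example) at an interval shorter than the time it takes entries to drop off the feed would gradually accumulate the corpus; the right interval depends on how often the newspaper publishes.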
