Feedparser - retrieve old messages from Google Reader

梦如初夏 2020-12-14 14:06

I'm using the feedparser library in Python to retrieve news from a local newspaper (my intent is to do Natural Language Processing over this corpus) and would like to be ab

2 Answers
  •  遥遥无期
    2020-12-14 14:16

    To expand on Bartek's answer: You could also start storing all of the entries in the feed that you've already seen, and build up your own historical archive of the feed's content. This would delay your ability to start using it as a corpus (because you'd have to do this for a month to build up a collection of a month's worth of entries), but you wouldn't be dependent on anyone else for the data.

    I may be mistaken, but I'm pretty sure that's how Google Reader can go back in time: They have each feed's past entries stored somewhere.
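
A minimal sketch of the "build your own archive" idea, assuming a local SQLite file is an acceptable store. The names `open_archive`, `archive_entries`, `poll`, the `feed_archive.db` path and the table layout are all hypothetical, chosen only for illustration; the only feedparser call used is `feedparser.parse`. Entries are keyed by their `id`, falling back to `link`, so polling the same feed repeatedly does not create duplicates.

```python
import json
import sqlite3


def open_archive(path="feed_archive.db"):
    """Open (or create) the SQLite archive of entries seen so far."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries ("
        "entry_id TEXT PRIMARY KEY, title TEXT, published TEXT, raw TEXT)"
    )
    return conn


def archive_entries(conn, entries):
    """Store entries that are not in the archive yet; return how many were new."""
    new_count = 0
    for entry in entries:
        # Assumption: the feed supplies an id or at least a link per entry.
        entry_id = entry.get("id") or entry.get("link")
        if not entry_id:
            continue
        seen = conn.execute(
            "SELECT 1 FROM entries WHERE entry_id = ?", (entry_id,)
        ).fetchone()
        if seen:
            continue
        conn.execute(
            "INSERT INTO entries VALUES (?, ?, ?, ?)",
            (
                entry_id,
                entry.get("title", ""),
                entry.get("published", ""),
                # Keep the whole entry so the corpus can be rebuilt later.
                json.dumps(entry, default=str),
            ),
        )
        new_count += 1
    conn.commit()
    return new_count


def poll(feed_url, conn):
    """Fetch the feed once and archive whatever is new."""
    import feedparser  # third-party: pip install feedparser

    feed = feedparser.parse(feed_url)
    return archive_entries(conn, feed.entries)
```

Running `poll(url, open_archive())` on a schedule (cron, for example) more often than the feed rotates its entries would gradually accumulate the history. It cannot recover anything published before the first run.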
