Python PRAW Reddit API: Reliably get posts as they are posted

最后都变了 - Submitted on 2021-02-08 04:07:37

Question


At the moment I have a script that queries some subreddits every 30 seconds and returns the newest submission:

import time

previous = None  # newest submission seen on the previous pass

while True:
    for post in reddit.subreddit(query_list).new(limit=1):
        if previous != post:
            pass  # Do something
        previous = post
    time.sleep(30)

The problem with this is that if more than one post is submitted in that time frame, it'll skip all but the newest. I know I can set a smaller wait time, or get more than one post at a time and sort through the results, but that doesn't really fix the problem; it just makes it less likely.

What I would much rather do is 'subscribe' to a feed by having a continuously open connection that receives posts as they are posted. Does this exist? And if not, is there another solution I haven't thought of?

(I realise what I'm talking about would put a large strain on the Reddit API servers, so it probably doesn't exist, but I thought it was worth asking just in case.)


Answer 1:


Yes, this exists in PRAW and it's called stream. Your entire code block can be replaced with the following:

for post in reddit.subreddit(query_list).stream.submissions():
    pass  # Do something

You can stream subreddit comments by replacing submissions with comments.

Other models can be streamed as well, such as Multireddit and Redditor.
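
As a rough sketch of how the stream might be wrapped in practice: the helper below takes any object that exposes stream.submissions() the way a PRAW Subreddit does and passes each submission to a callback. The names process_new_submissions, handle and max_items are made up for this illustration and are not part of PRAW. skip_existing=True is a PRAW stream option that skips the posts that already existed when the stream was opened. Note that the stream still polls Reddit behind the scenes rather than holding one connection open; it just keeps track of what it has already yielded.

```python
from typing import Any, Callable, Optional


def process_new_submissions(subreddit: Any,
                            handle: Callable[[Any], None],
                            max_items: Optional[int] = None) -> int:
    """Call handle() on each submission yielded by the subreddit's stream.

    `subreddit` is assumed to behave like a PRAW Subreddit, i.e. it has
    stream.submissions(). `max_items` is a hypothetical stop condition for
    this sketch; leave it as None to keep listening indefinitely.
    """
    processed = 0
    # skip_existing=True: do not replay posts that were already there
    # when the stream started.
    for submission in subreddit.stream.submissions(skip_existing=True):
        handle(submission)
        processed += 1
        if max_items is not None and processed >= max_items:
            break
    return processed


# Usage with a real client (credentials are placeholders):
#   import praw
#   reddit = praw.Reddit(client_id="...", client_secret="...", user_agent="...")
#   process_new_submissions(reddit.subreddit(query_list), print)
```

Passing the subreddit in as a parameter keeps the loop independent of how the Reddit client is configured.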




Answer 2:


The root of your problem is that you're limiting your results to only one post. In reality, what you want is every post since the last one you saw. Try something like this:

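import time

lastProcessed = None     # newest post handled on the previous pass
newLastProcessed = None  # newest post seen on the current pass

while True:
    # .new() returns posts newest first
    for post in reddit.subreddit(query_list).new():
        if newLastProcessed is None:
            newLastProcessed = post
        if post == lastProcessed:
            break  # everything from here on was already handled
        # Do something

    if newLastProcessed is not None:
        lastProcessed = newLastProcessed
    newLastProcessed = None
    time.sleep(30)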

Another alternative that's not quite as fragile with regard to ordering is to store the IDs of processed posts, for instance in an SQLite database, and then query that for every post you're considering processing.
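
A minimal sketch of that ID store, using Python's built-in sqlite3 module. The table name seen_posts and the helpers open_seen_store and mark_if_new are assumptions made for this example, not anything PRAW provides. The primary key lets a single INSERT OR IGNORE both check for and record an ID.

```python
import sqlite3


def open_seen_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store; pass a file path to persist across runs."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS seen_posts (id TEXT PRIMARY KEY)")
    conn.commit()
    return conn


def mark_if_new(conn: sqlite3.Connection, post_id: str) -> bool:
    """Record post_id and return True only the first time it is seen."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO seen_posts (id) VALUES (?)", (post_id,)
    )
    conn.commit()
    # rowcount is 1 when the row was inserted, 0 when it already existed.
    return cur.rowcount == 1


# Inside the polling loop, something like:
#   if mark_if_new(conn, post.id):
#       ...  # Do something
```

Because the check no longer depends on where a post sits in the listing, a deleted or reordered post cannot cause others to be skipped or handled twice.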



Source: https://stackoverflow.com/questions/49743797/python-praw-reddit-api-reliably-get-posts-as-they-are-posted
