Avoid Twitter API limitation with Tweepy

时光说笑 2020-11-30 21:18

I read in a Stack Exchange question that the rate limit can be a function of the number of requests per 15 minutes and also depends on the complexity of the algorithm.

5 Answers
  • 2020-11-30 22:05

    If you want to avoid errors and respect the rate limit you can use the following function which takes your api object as an argument. It retrieves the number of remaining requests of the same type as the last request and waits until the rate limit has been reset if desired.

    from datetime import datetime
    from time import sleep

    def test_rate_limit(api, wait=True, buffer=0.1):
        """
        Tests whether the rate limit of the last request has been reached.
        :param api: The `tweepy` api instance.
        :param wait: A flag indicating whether to wait for the rate limit reset
                     if the rate limit has been reached.
        :param buffer: A buffer time in seconds that is added on to the waiting
                       time as an extra safety margin.
        :return: True if it is ok to proceed with the next request. False otherwise.
        """
        # Rate-limit headers of the most recent response. Tweepy 3.x and later
        # expose a requests.Response here; older httplib-based releases used
        # api.last_response.getheader(...) instead.
        headers = api.last_response.headers
        # Get the number of remaining requests
        remaining = int(headers['x-rate-limit-remaining'])
        # Check if we have reached the limit
        if remaining == 0:
            limit = int(headers['x-rate-limit-limit'])
            # The reset time is a Unix timestamp; convert it to local time
            reset = datetime.fromtimestamp(int(headers['x-rate-limit-reset']))
            # Let the user know we have reached the rate limit
            print("0 of {} requests remaining until {}.".format(limit, reset))

            if wait:
                # Determine the delay (never negative) and sleep
                delay = max((reset - datetime.now()).total_seconds(), 0) + buffer
                print("Sleeping for {}s...".format(delay))
                sleep(delay)
                # We have waited for the rate limit reset. OK to proceed.
                return True
            else:
                # We have reached the rate limit. The user needs to handle it manually.
                return False

        # We have not reached the rate limit
        return True
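
The waiting-time arithmetic in the function above can be checked without calling Twitter at all. The following is a minimal sketch, not part of the original answer: `seconds_until_reset` is a hypothetical helper, the header values are made up, and it assumes `x-rate-limit-reset` holds a Unix timestamp in seconds.

```python
from datetime import datetime, timezone


def seconds_until_reset(headers, now=None, buffer=0.1):
    """Return how long to sleep before the next request, or 0.0 if none is needed.

    `headers` is any mapping that holds the x-rate-limit-* values as strings.
    """
    if int(headers["x-rate-limit-remaining"]) > 0:
        return 0.0
    # Work in UTC on both sides so the subtraction is timezone-safe
    reset = datetime.fromtimestamp(int(headers["x-rate-limit-reset"]), tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # Clamp at zero in case the reset time has already passed
    return max((reset - now).total_seconds(), 0.0) + buffer


# Made-up header values for illustration only
headers = {
    "x-rate-limit-limit": "180",
    "x-rate-limit-remaining": "0",
    "x-rate-limit-reset": "1606774800",
}
now = datetime.fromtimestamp(1606774500, tz=timezone.utc)  # 300 s before the reset
print(seconds_until_reset(headers, now=now))
```

Passing `now` explicitly keeps the calculation deterministic, which makes the helper easy to test.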
    
  • 2020-11-30 22:07
    import tweepy
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    # Tweepy prints a notice when the rate limit is hit and waits by itself,
    # so no manual sleep is needed.
    api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
    
  • 2020-11-30 22:18

    The problem is that your try: except: block is in the wrong place. Inserting data into the database will never raise a TweepError - it's iterating over Cursor.items() that will. I would suggest refactoring your code to call the next method of Cursor.items() in an infinite loop. That call should be placed in the try: except: block, as it can raise an error.

    Here's (roughly) what the code should look like:

    # above omitted for brevity
    c = tweepy.Cursor(api.search,
                      q=search,
                      include_entities=True).items()
    while True:
        try:
            tweet = next(c)
            # Insert into db
        except tweepy.TweepError:
            # Rate limit hit: wait out the 15-minute window, then retry
            time.sleep(60 * 15)
            continue
        except StopIteration:
            # No more results
            break
    

    This works because when Tweepy raises a TweepError, it hasn't updated any of the cursor data. The next time it makes the request, it will use the same parameters as the request which triggered the rate limit, effectively repeating it until it goes through.
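
The retry behaviour described above can be illustrated without network access. This is only a sketch under stated assumptions: `RateLimited` is a hypothetical stand-in for `tweepy.TweepError`, `FlakyItems` is a fake cursor that fails once without advancing, and `limit_handled` is an illustrative wrapper name, not something this answer defines.

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for tweepy.TweepError in this sketch."""


def limit_handled(items, wait_seconds=15 * 60, sleep=time.sleep):
    """Yield from `items`, pausing and retrying when a rate-limit error occurs."""
    while True:
        try:
            item = next(items)
        except RateLimited:
            # The iterator has not advanced, so the retry repeats the same request
            sleep(wait_seconds)
            continue
        except StopIteration:
            return
        yield item


class FlakyItems:
    """Fake cursor: raises once before the item at `fail_at`, without advancing."""

    def __init__(self, data, fail_at=1):
        self.data = list(data)
        self.pos = 0
        self.fail_at = fail_at
        self.failed = False

    def __iter__(self):
        return self

    def __next__(self):
        if self.pos == self.fail_at and not self.failed:
            self.failed = True
            raise RateLimited()
        if self.pos >= len(self.data):
            raise StopIteration
        value = self.data[self.pos]
        self.pos += 1
        return value


# Record the requested waits instead of actually sleeping
waits = []
tweets = list(limit_handled(FlakyItems(["a", "b", "c"]), sleep=waits.append))
print(tweets, waits)
```

Injecting `sleep` keeps the demonstration instant; with a real cursor you would leave the default `time.sleep` in place.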

  • 2020-11-30 22:19

    For anyone who stumbles upon this on Google, tweepy 3.2+ has additional parameters for the tweepy.API class, in particular:

    • wait_on_rate_limit – Whether or not to automatically wait for rate limits to replenish
    • wait_on_rate_limit_notify – Whether or not to print a notification when Tweepy is waiting for rate limits to replenish

    Setting these flags to True will delegate the waiting to the API instance, which is good enough for most simple use cases.
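
If the same script has to run against different Tweepy releases, the flags can be chosen by version. The sketch below is an assumption-laden illustration: `rate_limit_kwargs` is a hypothetical helper, and it assumes (beyond what this answer states) that `wait_on_rate_limit_notify` was removed in Tweepy 4.0, so please check the changelog for your installed version.

```python
def rate_limit_kwargs(tweepy_version):
    """Pick rate-limit keyword arguments for tweepy.API from a version string."""
    major, minor = (int(part) for part in tweepy_version.split(".")[:2])
    if (major, minor) < (3, 2):
        # Flags not available before 3.2; rate limits must be handled manually
        return {}
    kwargs = {"wait_on_rate_limit": True}
    if major < 4:
        # Assumption: the notify flag only exists in the 3.x series
        kwargs["wait_on_rate_limit_notify"] = True
    return kwargs


# Intended use (requires tweepy and an `auth` handler, so shown as a comment):
# api = tweepy.API(auth, **rate_limit_kwargs(tweepy.__version__))
print(rate_limit_kwargs("3.10.0"))
print(rate_limit_kwargs("4.14.0"))
```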

  • 2020-11-30 22:19

    Just replace

    api = tweepy.API(auth)
    

    with

    api = tweepy.API(auth, wait_on_rate_limit=True)
    