Question
You can only retrieve 100 user objects per request with the api.lookup_users() method. Is there an easy way to retrieve more than 100 using Tweepy and Python? I have read this post: User ID to Username tweepy but it does not help with the more-than-100 problem. I am fairly new to Python, so I cannot come up with a solution myself. What I have tried is this:
users = []
i = 0
num_pages = 2
while i < num_pages:
    try:
        # Look up a collection of ids
        users.append(api.lookup_users(user_ids=ids[100*i:100*(i+1)-1]))
    except tweepy.TweepError:
        # We get a tweep error
        print('Something went wrong, quitting...')
    i = i + 1
where ids is a list containing the ids, but I get IndexError: list index out of range when I try to get a user with index higher than 100. If it helps, I am only interested in getting the screen names from the user ids.
Answer 1:
You're right that you need to send the user IDs to the API in batches of 100, but you're ignoring the fact that you might not have an exact multiple of 100 IDs. Try the following:
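import tweepy

def lookup_user_list(user_id_list, api):
    full_users = []
    users_count = len(user_id_list)
    try:
        # Ceiling division gives the number of batches; plain "/" would yield a
        # float in Python 3, and "// 100 + 1" would add an empty final batch
        # whenever the count is an exact multiple of 100.
        for i in range((users_count + 99) // 100):
            full_users.extend(api.lookup_users(user_ids=user_id_list[i*100:min((i+1)*100, users_count)]))
        return full_users
    except tweepy.TweepError:
        # On failure the function prints a message and implicitly returns None.
        print('Something went wrong, quitting...')

results = lookup_user_list(ids, api)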
By taking the minimum of (i+1)*100 and users_count, we ensure the final loop only gets the users left over. results will be a list of the looked-up users.
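Since the question only asks for screen names, the batching can also be separated from the API call. The following is a minimal sketch, not part of the original answer: `chunked` and `lookup_screen_names` are hypothetical helper names, and `lookup` stands for any callable that takes a list of at most 100 IDs and returns objects with a `screen_name` attribute.

```python
def chunked(items, size=100):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        # Slicing past the end of a list is safe, so the last slice
        # simply contains whatever is left over.
        yield items[start:start + size]


def lookup_screen_names(user_id_list, lookup, size=100):
    """Collect screen names by calling `lookup` once per batch of IDs.

    `lookup` is an assumed callable: it receives a list of at most `size`
    IDs and returns an iterable of objects with a `screen_name` attribute.
    """
    names = []
    for batch in chunked(user_id_list, size):
        names.extend(user.screen_name for user in lookup(batch))
    return names
```

With the API object set up as in this answer, this could presumably be called as lookup_screen_names(ids, lambda batch: api.lookup_users(user_ids=batch)).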
You may also hit rate limits - when setting up your API, you should take care to let tweepy catch these gracefully and remove some of the hard work, like so:
consumer_key = 'X'
consumer_secret = 'X'
access_token = 'X'
access_token_secret = 'X'
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
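To illustrate what waiting on a rate limit amounts to, here is a rough sketch that does not use Tweepy at all. `RateLimited`, `call_with_wait`, and the `retry_after` attribute are hypothetical names invented for this example; Tweepy's own handling is internal to the library and may well differ.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised by a client when a rate limit is hit."""

    def __init__(self, retry_after):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds until the limit resets


def call_with_wait(func, *args, max_attempts=3, sleep=time.sleep, **kwargs):
    """Call `func`, sleeping and retrying whenever it raises RateLimited."""
    for attempt in range(max_attempts):
        try:
            return func(*args, **kwargs)
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller see the error
            sleep(exc.retry_after)
```

The `sleep` parameter is injectable only so that the waiting can be replaced in tests; in normal use the default `time.sleep` applies.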
Answer 2:
I haven't tested it since I don't have access to the API.
But if you have a collection of user ids in any range, this should fetch all of them.
It fetches any remainder first, meaning if you have a list of 250 ids, it will fetch 50 users with the last 50 ids in the list.
Then it will fetch the remaining 200 users in batches of 100.
from tweepy import TweepError

# `api` is assumed to be an authenticated tweepy.API instance.
users = []
user_ids = []  # collection of user ids
count_100 = len(user_ids) // 100  # number of full batches of 100 ids
remainder = len(user_ids) % 100   # ids left over after the full batches

for i in range(0, count_100 + 1):
    try:
        if i == 0:
            # Fetch the leftover ids (the last `remainder` ids) first, if any.
            if remainder > 0:
                users.extend(api.lookup_users(user_ids=user_ids[-remainder:]))
        else:
            end_at = i * 100
            start_at = end_at - 100
            users.extend(api.lookup_users(user_ids=user_ids[start_at:end_at]))
    except TweepError:
        print('Something went wrong, quitting...')
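The ordering described in this answer can be checked without the API by looking only at the slices. A small sketch, where `remainder_first_batches` is a hypothetical helper name and the 250 IDs are made-up sample data:

```python
def remainder_first_batches(user_ids, size=100):
    """Return the leftover IDs first, then the full batches of `size`."""
    batches = []
    remainder = len(user_ids) % size
    if remainder:
        # The last `remainder` ids; skipped when the count divides evenly.
        batches.append(user_ids[-remainder:])
    # Walk over the full batches that precede the leftover ids.
    for end_at in range(size, len(user_ids) - remainder + 1, size):
        batches.append(user_ids[end_at - size:end_at])
    return batches


sample_ids = list(range(250))
batch_sizes = [len(batch) for batch in remainder_first_batches(sample_ids)]
print(batch_sizes)  # [50, 100, 100]
```

Each inner list could then be passed to a single lookup call, so no request exceeds 100 IDs.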
Source: https://stackoverflow.com/questions/43782889/tweepy-how-can-i-look-up-more-than-100-user-screen-names