How can I get tweets older than a week (using tweepy or other python libraries)

[亡魂溺海] submitted on 2019-11-26 18:42:07

You cannot use the Twitter Search API to collect tweets from two years ago. Per the docs:

Also note that the search results at twitter.com may return historical results while the Search API usually only serves tweets from the past week. - Twitter documentation.

If you need a way to get old tweets, you can get them from individual users, because collecting tweets from a user's timeline is limited by number (roughly the most recent 3,200 tweets) rather than by time, so in many cases you can go back months or years. A third-party service that collects tweets, such as Topsy, may be useful in your case as well (Topsy is shut down as of July 2016, but other services exist).
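Paging through one user's timeline can be sketched roughly as below. It assumes `api` is an already authenticated tweepy.API object (or anything exposing a compatible `user_timeline` method); the helper name `collect_user_timeline` and its parameters are made up for illustration and are not part of tweepy.

```python
def collect_user_timeline(api, screen_name, max_tweets=3200, page_size=200):
    """Page backwards through one user's timeline using max_id.

    `api` is assumed to behave like an authenticated tweepy.API instance.
    """
    collected = []
    max_id = None
    while len(collected) < max_tweets:
        kwargs = {"screen_name": screen_name, "count": page_size}
        if max_id is not None:
            kwargs["max_id"] = max_id
        page = api.user_timeline(**kwargs)
        if not page:
            # No older tweets are available (or the timeline limit was reached).
            break
        collected.extend(page)
        # Ask the next request only for tweets strictly older than the oldest seen.
        max_id = page[-1].id - 1
    return collected[:max_tweets]
```

Calling `collect_user_timeline(api, "some_user")` would then return the tweets newest first; rate limits still apply, so a real script would probably want to sleep or retry between pages.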

As you have noticed, the Twitter API has some limitations. I have written code that gets around them by using the same strategy as the Twitter site running in a browser. Take a look; it lets you get the oldest tweets: https://github.com/Jefferson-Henrique/GetOldTweets-python

I found some code that helps retrieve older tweets: https://github.com/Jefferson-Henrique/GetOldTweets-python

To get old tweets, run the following command in the directory where the repository was extracted:

python Exporter.py --querysearch 'keyword' --since 2016-01-10 --until 2016-01-15 --maxtweets 1000

For me this produced a file 'output_got.csv' containing 1000 tweets from those dates that match the keyword.

You need to install the 'pyquery' module for this to work.

PS: You can modify the 'Exporter.py' file to export more tweet attributes, as your needs require.
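Once the export exists, it can be loaded back into Python for analysis. This is a minimal sketch that assumes the file is semicolon-delimited with a header row (which is how I recall Exporter.py writing it, so check your copy); the function name `load_exported_tweets` is made up for illustration.

```python
import csv

def load_exported_tweets(path, delimiter=";"):
    """Read an exported tweet file into a list of dicts keyed by the header row.

    The semicolon delimiter is an assumption about the export format.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))
```

For example, `rows = load_exported_tweets("output_got.csv")` gives one dict per tweet, and `len(rows)` tells you how many were actually collected.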

2018 update: Twitter has Premium search APIs that can return results from the beginning of time (2006):

https://developer.twitter.com/en/docs/tweets/search/overview/premium#ProductPackages

Search Tweets: 30-day endpoint → provides Tweets from the previous 30 days.

Search Tweets: Full-archive endpoint → provides complete and instant access to Tweets dating all the way back to the first Tweet in March 2006.

With an example Python client: https://github.com/twitterdev/search-tweets-python
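If you would rather see what a premium search request looks like without the client library, the sketch below only builds the endpoint URL and JSON body; it does not send anything. The URL layout and the `fromDate`/`toDate`/`maxResults` field names follow my reading of the premium search documentation, while the helper name and the `dev` environment label are placeholders you would replace with your own.

```python
from datetime import datetime

API_ROOT = "https://api.twitter.com/1.1/tweets/search"

def build_premium_request(query, since, until,
                          product="fullarchive", label="dev", max_results=100):
    """Return (url, payload) for a premium search request.

    `product` is "30day" or "fullarchive"; `label` is the name of your own
    dev environment ("dev" here is only a placeholder).
    """
    def stamp(day):
        # The premium endpoints expect timestamps shaped like YYYYMMDDhhmm (UTC).
        return datetime.strptime(day, "%Y-%m-%d").strftime("%Y%m%d%H%M")

    url = f"{API_ROOT}/{product}/{label}.json"
    payload = {
        "query": query,
        "fromDate": stamp(since),
        "toDate": stamp(until),
        "maxResults": max_results,
    }
    return url, payload
```

The returned payload would be POSTed as JSON with a bearer token for your app; the search-tweets-python client linked above handles that step, plus pagination, for you.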

Use the "since" and "until" arguments to adjust your timeframe. You are presently using since_id, which is meant to correspond to Twitter ID values (not dates):

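import tweepy

# Assumes `api` is an already authenticated tweepy.API instance.
for tweet in tweepy.Cursor(api.search,
                           q="test",
                           since="2014-01-01",
                           until="2014-02-01",
                           lang="en").items():
    print(tweet.created_at, tweet.text)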
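To make the difference between an ID and a date concrete: tweet IDs generated since late 2010 are "Snowflake" IDs whose upper bits hold a millisecond timestamp, so a date can be turned into an approximate ID bound and back. The sketch below relies on that layout (timestamp shifted left by 22 bits, offset by Twitter's epoch); the function names are made up for illustration, and older IDs do not follow this scheme.

```python
from datetime import datetime, timezone

# Millisecond offset used by Twitter's Snowflake IDs (early November 2010).
TWITTER_EPOCH_MS = 1288834974657

def date_to_min_tweet_id(day):
    """Smallest Snowflake-style ID that could have been created on `day` (UTC)."""
    moment = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    ms = int(moment.timestamp() * 1000)
    # Dates before the Snowflake epoch cannot be encoded, so clamp to 0.
    return max(ms - TWITTER_EPOCH_MS, 0) << 22

def tweet_id_to_datetime(tweet_id):
    """Approximate creation time encoded in a Snowflake-style tweet ID."""
    ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
```

A value from `date_to_min_tweet_id("2014-01-01")` could be passed as since_id, but the Search API's one-week window still applies regardless of which parameter you use.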

As others have noted, the Twitter API has the date limitation, but the advanced search implemented on twitter.com does not. So the solution is to use the Python bindings for Selenium or PhantomJS to iterate through the twitter.com search pages. Here's an implementation using Selenium that someone has posted on GitHub: https://github.com/bpb27/twitter_scraping/

You can use the REST API to get tweets older than a week. For more details, see the Twitter API reference: https://dev.twitter.com/rest/reference/get/statuses/user_timeline
