How do I save streaming tweets in json via tweepy?

醉话见心 2020-12-24 03:50

I've been learning Python for a couple of months through online courses and would like to further my learning through a real-world mini project.

For this project,

3 Answers
  •  时光取名叫无心
    2020-12-24 04:37

    I just insert the raw JSON into the database. It seems a bit ugly and hacky, but it does work. A notable problem is that the creation dates of the Tweets are stored as strings. The question "How do I compare dates from Twitter data stored in MongoDB via PyMongo?" provides a way to fix that (I inserted a comment in the code to indicate where one would perform that task).

    import pymongo
    import tweepy

    # ...

    client = pymongo.MongoClient()
    db = client.twitter_db
    twitter_collection = db.tweets

    # ...

    class CustomStreamListener(tweepy.StreamListener):
        # ...
        def on_status(self, status):
            try:
                # status._json is the raw tweet payload as a dict
                twitter_json = status._json
                # TODO: Transform created_at to Date objects before insertion
                # insert_one replaces Collection.insert, which was removed in PyMongo 4
                twitter_collection.insert_one(twitter_json)
            except Exception:
                # Ignore any error raised while handling a single tweet
                # to avoid breaking the application.
                pass
        # ...

    stream = tweepy.Stream(auth, CustomStreamListener(), timeout=None, compression=True)
    stream.sample()
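
For the TODO above, here is a minimal sketch of one way to turn `created_at` into a date object before insertion. It assumes the tweet uses the v1.1 `created_at` string layout (for example `Thu Dec 24 03:50:00 +0000 2020`) and an English/C locale for the day and month names. The helper name `with_parsed_date` and the sample tweet are hypothetical, not part of the original answer.

```python
from datetime import datetime, timezone

# Assumed layout of the `created_at` field, e.g. "Thu Dec 24 03:50:00 +0000 2020".
# %a and %b are locale-dependent; this sketch assumes English names.
TWITTER_DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def with_parsed_date(tweet_json):
    """Return a shallow copy of the tweet dict with created_at as a datetime."""
    doc = dict(tweet_json)  # copy so the original payload is left untouched
    doc["created_at"] = datetime.strptime(doc["created_at"], TWITTER_DATE_FORMAT)
    return doc


# Hypothetical sample payload, standing in for status._json
sample = {"id": 1, "text": "hello", "created_at": "Thu Dec 24 03:50:00 +0000 2020"}
doc = with_parsed_date(sample)
print(doc["created_at"].isoformat())
```

Inside `on_status`, the idea would be to pass `with_parsed_date(status._json)` to the insert call instead of the raw dict. PyMongo stores `datetime` values as BSON dates, so range queries on `created_at` then compare dates rather than strings.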
    
