Question
I have the following code, which lets me view a stream of 1% of the Twitter firehose from Python:
import sys
import tweepy

consumer_key=""
consumer_secret=""
access_key = ""
access_secret = ""

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)

class CustomStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        if '' in status.text.lower():
            print status.text
            print status.coordinates

    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        return True # Don't kill the stream

    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        return True # Don't kill the stream

sapi = tweepy.streaming.Stream(auth, CustomStreamListener())
sapi.filter(track=['example'])
I know that the syntax include_rts = False will remove retweets from the stream I am viewing, but I am not sure where to add it in the code above.
Can anyone assist?
Thanks
Answer 1:
Add the following condition to the on_status function in your listener:
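    def on_status(self, status):
        # status is a tweepy Status object rather than a dict, so test for the
        # attribute with hasattr; retweets carry a retweeted_status attribute.
        if '' in status.text.lower() and not hasattr(status, 'retweeted_status'):
            print status.text
            print status.coordinates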
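The retweet check can also be pulled out into a small helper and tried without a live stream. This is only a minimal sketch: it assumes a retweet shows up as a status object with a retweeted_status attribute, and the names is_retweet and keep_originals, as well as the SimpleNamespace stand-ins for real tweepy statuses, are made up for illustration.

```python
from types import SimpleNamespace


def is_retweet(status):
    """Return True if the status has a retweeted_status attribute."""
    # Assumption: retweets carry the original tweet in `retweeted_status`,
    # while ordinary tweets do not have that attribute at all.
    return hasattr(status, 'retweeted_status')


def keep_originals(statuses):
    """Return only the statuses that are not retweets."""
    return [s for s in statuses if not is_retweet(s)]


# Hypothetical stand-ins for the objects a stream listener would receive.
original = SimpleNamespace(text='hello example')
retweet = SimpleNamespace(text='RT @user: hello example',
                          retweeted_status=original)

kept = keep_originals([original, retweet])
```

Inside the listener, the same helper could guard the body of on_status, for example by returning early when is_retweet(status) is true.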
Source: https://stackoverflow.com/questions/27608059/where-to-exclude-retweets-in-this-tweepy-script