Streaming API with tweepy only returns the second-to-last tweet and NOT the immediately last tweet

元气小坏坏 submitted on 2019-12-04 13:45:35

I had this same problem. In my case the answer was not as easy as running Python unbuffered, and I presume that didn't solve the original poster's problem either. The problem is actually in the tweepy package itself, in the _read_loop() function in streaming.py, which I think needs to be updated to reflect changes in the format of the data Twitter sends from its streaming API.

The solution for me was to download the newest tweepy code from GitHub (https://github.com/tweepy/tweepy), specifically the streaming.py file. The commit history for this file shows the recent changes made to try to resolve this issue.

I looked into the details of tweepy, and there was an issue with the way streaming.py reads in the JSON tweet stream. I think it has to do with Twitter updating their streaming API to prefix each incoming status with its length in bytes. Long story short, here is the function I replaced in streaming.py to resolve the issue.

def _read_loop(self, resp):
    while self.running and not resp.isclosed():
        # Note: keep-alive newlines might be inserted before each length value.
        # Skip them and read until we get the first character of the length.
        c = '\n'
        while c == '\n' and self.running and not resp.isclosed():
            c = resp.read(1)
        delimited_string = c

        # Read the rest of the length prefix, up to and including the newline.
        d = ''
        while d != '\n' and self.running and not resp.isclosed():
            d = resp.read(1)
            delimited_string += d

        try:
            # The prefix gives the size of the next status to read.
            int_to_read = int(delimited_string)
            next_status_obj = resp.read(int_to_read)
            # print 'status_object = %s' % next_status_obj
            self._data(next_status_obj)
        except ValueError:
            # Not a valid length value; skip it and keep reading.
            pass

    if resp.isclosed():
        self.on_closed(resp)
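
To see the length-prefixed framing in isolation, here is a minimal standalone sketch in Python 3. It is not tweepy code: read_delimited is a hypothetical helper, the in-memory stream stands in for the HTTP response, and the framing (optional blank keep-alive lines, a decimal byte count on its own line, then that many bytes of payload) is my assumption based on the description above.

```python
import io


def read_delimited(stream):
    """Yield each payload from a length-prefixed byte stream.

    Hypothetical helper, assuming this framing: optional blank
    keep-alive lines, a decimal byte count on its own line, then
    exactly that many bytes of payload.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        line = line.strip()
        if not line:
            continue  # keep-alive newline, nothing to parse
        try:
            length = int(line)
        except ValueError:
            continue  # not a length prefix, skip it
        payload = stream.read(length)
        if len(payload) < length:
            return  # stream ended in the middle of a payload
        yield payload


# Two made-up statuses with a keep-alive newline between them.
first = b'{"text": "first"}\r\n'
second = b'{"text": "second"}\r\n'
raw = (
    str(len(first)).encode() + b"\r\n" + first
    + b"\r\n"
    + str(len(second)).encode() + b"\r\n" + second
)
statuses = list(read_delimited(io.BytesIO(raw)))
```

Because each payload is read by its declared length, the last status is available as soon as its bytes arrive, without waiting for the next one to show up.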

Replacing the function also requires downloading the source code for the tweepy package, modifying it, and then installing the modified library into Python. You do that by going into your top-level tweepy directory and typing something like sudo python setup.py install, depending on your system.

I've also commented to the maintainers of this package on GitHub to let them know what's up.

Burhan Khalid

This is a case of output buffering. Run python with -u (unbuffered) to prevent this from happening.

Or, you can force the buffer to be flushed by executing a sys.stdout.flush() after your print statement.
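
As a rough sketch of the flush approach in Python 3 (emit is a made-up helper name, not part of tweepy or the standard library), something like this writes each line and flushes it right away:

```python
import io
import sys


def emit(line, out=None):
    """Write one line and flush immediately so it is not held in a buffer."""
    if out is None:
        out = sys.stdout
    out.write(line + "\n")
    out.flush()


# Demonstrate with an in-memory stream; in a stream listener you
# would call emit(status_text) and let it default to sys.stdout.
buf = io.StringIO()
emit("latest tweet", buf)
```

In Python 3, print(line, flush=True) has the same effect without a helper.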

See this answer for more ideas.
