Background:
I have a Python module set up to grab JSON objects from a streaming API and store them (bulk insert of 25 at a time) in MongoDB using p
I got rid of the StringIO library. Since the WRITEFUNCTION callback (handle_data, in this case) gets invoked for every line, I just load the JSON directly. Sometimes, however, a single call delivers two JSON objects in the same buffer. I'm sorry, I can't post the curl command I use because it contains our credentials. But, as I said, this is a general issue that applies to any streaming API.
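    def handle_data(self, buf):
        # WRITEFUNCTION callback: buf is whatever the stream delivered in this call.
        try:
            self.tweet = json.loads(buf)
        except ValueError:
            # json.loads raises ValueError when buf is not exactly one JSON document,
            # e.g. when it holds several objects separated by '\r\n'.
            self.data_list = buf.split('\r\n')
            for data in self.data_list:
                if data.strip():  # skip the empty fragments left by the split
                    self.tweet_list.append(json.loads(data))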
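
A related case the snippet above does not cover is a JSON object that gets cut off in the middle of one callback and finished in the next. Here is a minimal sketch of one way to handle that, assuming the stream separates objects with '\r\n' and is UTF-8 encoded. The `StreamBuffer` class and its `feed` method are hypothetical names I made up for illustration; they are not part of any library.

```python
import json


class StreamBuffer(object):
    """Hypothetical helper: collects raw chunks and returns complete JSON objects."""

    def __init__(self, delimiter=b'\r\n'):
        self.delimiter = delimiter
        self.pending = b''  # bytes received so far that are not yet delimiter-terminated

    def feed(self, chunk):
        """Add one chunk; return the list of objects that this chunk completed."""
        self.pending += chunk
        parts = self.pending.split(self.delimiter)
        # The last piece has no delimiter after it yet, so it may be incomplete.
        self.pending = parts.pop()
        objects = []
        for line in parts:
            if line.strip():  # ignore empty lines between objects
                objects.append(json.loads(line.decode('utf-8')))
        return objects
```

With something like this, handle_data could shrink to a single call such as `self.tweet_list.extend(self.stream_buffer.feed(buf))`, where `self.stream_buffer` is an assumed attribute holding a `StreamBuffer` instance. The try/except fallback is then no longer needed, because only delimiter-terminated lines ever reach json.loads.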