I am processing millions of tweets. I write a new pickle file after about every 400k processed tweets. Each processed tweet contains a dictionary and another 4 fields of in