I have around 82 gzipped files (around 180MB each and 14GB total) where each file contains new line separated sentences. I am thinking of using PathLineSentences from gensim