I have made a generator to read a file word by word and it works nicely.
def word_reader(file):
    # yield each whitespace-separated word in the file, one at a time
    for line in open(file):
        for p in line.split():
            yield p
To get the first n values of a generator, you can use more_itertools.take.
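For example, a minimal sketch (the file name words.txt is a placeholder):

import more_itertools

reader = word_reader('words.txt')  # placeholder file name
first_ten = more_itertools.take(10, reader)  # list of the first 10 words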
If you plan to iterate over the words in chunks (e.g. 100 at a time), you can use more_itertools.chunked (https://more-itertools.readthedocs.io/en/latest/api.html):
import more_itertools

reader = word_reader('words.txt')  # placeholder file name
for words in more_itertools.chunked(reader, n=100):
    # process up to 100 words here; the final chunk may be shorter
    ...
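Each chunk comes back as a list, so you can index into it or pass it along without consuming anything more from the underlying generator.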