This is a preprocessing pipeline that prepares NLP data for word prediction.
def set_of_words_in(sequences):
    return {word for sequence in sequences for word in sequence}  # distinct words across all sequences
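
As a usage sketch, the set of words can be turned into a word-to-index lookup, which is a common next step when preparing data for word prediction. The example assumes `sequences` is a list of tokenized sentences (lists of strings). `build_word_index` and the sample data are hypothetical and not part of the original pipeline.

```python
def set_of_words_in(sequences):
    """Return the set of distinct words across all tokenized sequences."""
    return {word for sequence in sequences for word in sequence}


def build_word_index(sequences):
    """Hypothetical helper: map each distinct word to an integer id."""
    # Sorting makes the ids deterministic, since set iteration order is not.
    words = sorted(set_of_words_in(sequences))
    return {word: idx for idx, word in enumerate(words)}


# Assumed input format: one list of tokens per sentence.
sample_sequences = [["the", "cat", "sat"], ["the", "dog", "sat", "down"]]
vocabulary = set_of_words_in(sample_sequences)
word_index = build_word_index(sample_sequences)
```

Sorting before enumerating is a design choice made here so that repeated runs assign the same id to the same word. A real pipeline might instead order words by frequency.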