How do I return all the unique words from a text file using Python? For example:
I am not a robot
I am a human
Should return:
Using Regex and Set:
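import re
# text holds the file contents, e.g. read with open(...).read()
words = re.findall(r'\w+', text.lower())  # raw string; lowercase so "I" and "i" match
uniq_words = set(words)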
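The snippet above assumes `text` already holds the file contents. A minimal sketch of the full round trip might look like the following; the function names `unique_words` and `unique_words_in_file`, and the UTF-8 default, are assumptions for illustration and not part of the original answer.

```python
import re


def unique_words(text):
    """Return the set of distinct lowercase words found in a string."""
    # \w+ matches runs of letters, digits and underscores, so punctuation is dropped.
    return set(re.findall(r"\w+", text.lower()))


def unique_words_in_file(path, encoding="utf-8"):
    """Read a whole text file and return its distinct words (hypothetical helper)."""
    with open(path, encoding=encoding) as f:
        return unique_words(f.read())


sample = "I am not a robot\nI am a human"
print(sorted(unique_words(sample)))
# ['a', 'am', 'human', 'i', 'not', 'robot']
```

A set has no defined order, so sort the result (as above) if you need stable output.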
Another way is to create a dict and insert the words as keys:
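dict_word = {}
for line in doc:  # doc is a list of lines read from the file
    frase = line.split()  # split on any whitespace, dropping the trailing newline
    for palavra in frase:
        if palavra not in dict_word:
            dict_word[palavra] = 1
print(list(dict_word.keys()))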
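One thing the dict approach gives you that a set does not: on Python 3.7+ a dict remembers insertion order, so the words come back in the order they first appear. A short sketch of that idea, assuming whitespace-separated words and a hypothetical helper name `unique_words_ordered`:

```python
def unique_words_ordered(lines):
    """Return distinct words in first-seen order (case-sensitive)."""
    seen = {}
    for line in lines:
        # str.split() with no argument splits on any run of whitespace.
        for word in line.split():
            seen.setdefault(word, None)  # keeps the first occurrence only
    return list(seen)


print(unique_words_ordered(["I am not a robot", "I am a human"]))
# ['I', 'am', 'not', 'a', 'robot', 'human']
```

Since an open file object yields its lines when iterated, it can be passed directly as `lines`.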