I am using nltk and want to create my own custom texts, just like the default ones in nltk.book. However, I've only got as far as a method like
my_text = [\
As @PavelAnossov answered, the canonical approach is to use the word_tokenize function in nltk:
from nltk import word_tokenize
sent = "This is my text, this is a nice way to input text."
word_tokenize(sent)
If your sentence is truly simple enough, remove the punctuation using string.punctuation, then split on the whitespace delimiter:
import string
x = "This is my text, this is a nice way to input text."
y = "".join([i for i in x if i not in string.punctuation]).split(" ")
print(y)
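The same punctuation-stripping idea can also be sketched with str.translate, assuming Python 3. The helper name simple_tokenize is hypothetical and not part of nltk; calling split() with no argument also collapses repeated spaces, which split(" ") does not.

```python
import string

def simple_tokenize(text):
    """Strip ASCII punctuation, then split on runs of whitespace.

    Hypothetical helper for illustration; not an nltk function.
    """
    # Build a translation table that deletes every punctuation character.
    table = str.maketrans("", "", string.punctuation)
    return text.translate(table).split()

x = "This is my text, this is a nice way to input text."
tokens = simple_tokenize(x)
print(tokens)
```

This only handles the ASCII characters in string.punctuation and drops apostrophes inside words such as "don't", so for anything beyond very simple sentences word_tokenize is likely the safer choice.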