I think what I want to do is a fairly common task, but I've found no reference on the web. I have text with punctuation, and I want a list of the words.
\"H
In Python 3, you can use the method from PY4E - Python for Everybody:
We can solve both these problems by using the string methods
lower, punctuation, and translate. The translate is the most subtle of the methods. Here is the documentation for translate:
your_string.translate(your_string.maketrans(fromstr, tostr, deletestr))
Replace the characters in
fromstr with the character in the same position in tostr and delete all characters that are in deletestr. The fromstr and tostr can be empty strings and the deletestr parameter can be omitted.
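To make the three arguments concrete, here is a small sketch that uses all of them at once. The sample strings are made up for illustration; note that fromstr and tostr must have the same length.

```python
# Build a translation table: map 'a' -> '1', 'b' -> '2', and delete every 'x'.
# The input strings here are arbitrary examples, not from the question.
table = str.maketrans("ab", "12", "x")

result = "abxcab".translate(table)
print(result)  # 12c12
```

Passing two empty strings for fromstr and tostr leaves only the deletion step, which is what the punctuation-stripping approach relies on.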
You can see what string.punctuation contains:
In [10]: import string
In [11]: string.punctuation
Out[11]: '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
For your example:
In [12]: your_str = "Hey, you - what are you doing here!?"
In [13]: line = your_str.translate(your_str.maketrans('', '', string.punctuation))
In [14]: line = line.lower()
In [15]: words = line.split()
In [16]: print(words)
['hey', 'you', 'what', 'are', 'you', 'doing', 'here']
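If you need this in more than one place, the same steps can be wrapped in a small helper. This is only a sketch: the name split_words and the module-level table are my own choices, not something from PY4E.

```python
import string

# Build the deletion table once; it removes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def split_words(text):
    """Strip ASCII punctuation, lowercase the text, and split on whitespace."""
    return text.translate(_PUNCT_TABLE).lower().split()


words = split_words("Hey, you - what are you doing here!?")
print(words)  # ['hey', 'you', 'what', 'are', 'you', 'doing', 'here']
```

Keep in mind that string.punctuation covers ASCII only, so characters such as curly quotes are left in place, and apostrophes and hyphens inside words are deleted as well ("don't" becomes "dont"). Whether that is acceptable depends on your text.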
For more information, you can refer to: