How to ignore punctuation in-between words using word_tokenize in NLTK?
Question: I'm looking to ignore characters in between words when using NLTK's word_tokenize. If I have a sentence:

test = 'Should I trade on the S&P? This works with a phone number 333-445-6635 and email test@testing.com'

the word_tokenize method splits 'S&P?' into 'S', '&', 'P', '?'. Is there a way to have this library ignore punctuation between words or letters? Expected output: 'S&P', '?'.

Answer 1: Let me know how this works with your sentences. I added an additional test with a bunch of punctuation.
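A minimal sketch of the regex-tokenizer approach, using NLTK's RegexpTokenizer in place of word_tokenize. The specific pattern and the extra punctuation-heavy test string below are assumptions chosen to reproduce the asker's expected output, not necessarily the answerer's exact code:

```python
# Sketch: keep intra-word punctuation (&, -, @, .) inside tokens, while still
# emitting any other punctuation as its own token. The pattern is an assumed
# choice, tuned to the asker's examples.
from nltk.tokenize import RegexpTokenizer

# \w+(?:[.&@-]\w+)*  -> a word, optionally joined to more words by . & @ or -
# [^\w\s]            -> any leftover punctuation character as a separate token
tokenizer = RegexpTokenizer(r"\w+(?:[.&@-]\w+)*|[^\w\s]")

test = ('Should I trade on the S&P? This works with a phone number '
        '333-445-6635 and email test@testing.com')
print(tokenizer.tokenize(test))
# ['Should', 'I', 'trade', 'on', 'the', 'S&P', '?', 'This', 'works', 'with',
#  'a', 'phone', 'number', '333-445-6635', 'and', 'email', 'test@testing.com']

# A hypothetical second test with a run of punctuation: each stray mark comes
# out as its own token, while normal words are untouched.
test2 = '"""!!!%%%I have a lot of punctuation.'
print(tokenizer.tokenize(test2))
# ['"', '"', '"', '!', '!', '!', '%', '%', '%', 'I', 'have', 'a', 'lot',
#  'of', 'punctuation', '.']
```

Because RegexpTokenizer matches tokens directly (findall-style), any grouping in the pattern must be non-capturing, i.e. (?:...). If you would rather not involve NLTK at all, the same regex works with plain re.findall(pattern, test).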