I have a text file. I need to get a list of sentences.
How can this be implemented? There are a lot of subtleties, such as a dot being used in abbreviations.
I was working on a similar task and came across this question. After following a few links and working through a few NLTK exercises, the code below worked well for me.
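import nltk
from nltk.tokenize import sent_tokenize
nltk.download("punkt")  # one-time download of the Punkt sentence model; recent NLTK releases may ask for "punkt_tab" instead
text = "Hello everyone. Welcome to GeeksforGeeks. You are studying NLP article"
sent_tokenize(text)  # returns a list of sentence strings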
Output:
['Hello everyone.',
'Welcome to GeeksforGeeks.',
'You are studying NLP article']
Source: https://www.geeksforgeeks.org/nlp-how-tokenizing-text-sentence-words-works/
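To illustrate why the abbreviation problem from the question matters, here is a rough, dependency-free sketch of a naive splitter that skips dots following known abbreviations. This is not how NLTK works internally; the function name `split_sentences` and the `ABBREVIATIONS` list are made up for this illustration, and the list would need extending for real text.

```python
import re

# Hypothetical abbreviation list for illustration only; extend it for your own text.
ABBREVIATIONS = {"dr", "mr", "mrs", "ms", "prof", "e.g", "i.e", "etc", "vs"}

def split_sentences(text):
    """Split on ., ! or ? followed by whitespace, skipping known abbreviations."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?]+\s+", text):
        # Look at the word right before the punctuation.
        words = text[start:match.start()].split()
        last_word = words[-1].lower() if words else ""
        if text[match.start()] == "." and last_word in ABBREVIATIONS:
            continue  # the dot belongs to an abbreviation, keep scanning
        sentences.append(text[start:match.end()].strip())
        start = match.end()
    # Whatever is left after the last boundary is the final sentence.
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

print(split_sentences("Dr. Smith met Mr. Jones. They talked, e.g. about NLP. Was it useful? Yes"))
```

A fixed list like this will miss a sentence that really does end in an abbreviation ("... apples, pears, etc. Then we left."), which is one reason a trained tokenizer such as `sent_tokenize` is usually the better choice.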