How can I split a text into sentences?

傲寒 2020-11-22 06:33

I have a text file. I need to get a list of sentences.

How can this be implemented? There are a lot of subtleties, such as a dot being used in abbreviations.

13 answers
  •  刺人心 (original poster)
    2020-11-22 06:37

    I was working on a similar task and came across this question. After following a few links and working through a few NLTK exercises, the code below worked for me.

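    import nltk
    from nltk.tokenize import sent_tokenize

    nltk.download("punkt")  # one-time model download; newer NLTK releases may ask for "punkt_tab"

    text = "Hello everyone. Welcome to GeeksforGeeks. You are studying NLP article"
    sent_tokenize(text)
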

    Output:

    ['Hello everyone.',
     'Welcome to GeeksforGeeks.',
     'You are studying NLP article']
    

    Source: https://www.geeksforgeeks.org/nlp-how-tokenizing-text-sentence-words-works/
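
To illustrate the abbreviation subtlety raised in the question, here is a minimal standard-library sketch that does not rely on NLTK. It is not how `sent_tokenize` works internally; it is a naive regex splitter under stated assumptions. The function name `split_sentences` and the `ABBREVIATIONS` set are hypothetical names chosen for this example, and the abbreviation list is deliberately short and English-only.

```python
import re

# Hypothetical, deliberately short abbreviation list; extend it for your own texts.
ABBREVIATIONS = {"mr", "mrs", "ms", "dr", "prof", "e.g", "i.e", "etc", "vs"}


def split_sentences(text):
    """Split on ., ! or ? followed by whitespace, skipping known abbreviations."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?]+\s+", text):
        # Text of the current candidate sentence, without the closing punctuation
        candidate = text[start:match.start()]
        words = candidate.split()
        last_word = words[-1].lower() if words else ""
        if match.group().startswith(".") and last_word in ABBREVIATIONS:
            continue  # the dot belongs to an abbreviation, not a sentence end
        sentences.append(text[start:match.end()].strip())
        start = match.end()
    # Whatever remains after the last boundary is the final sentence
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


print(split_sentences("Dr. Smith met Mr. Jones. They talked, e.g. about NLP! Was it useful?"))
```

A sketch like this still misses cases such as a sentence that genuinely ends with an abbreviation ("... apples, pears, etc. Then we left."), which is one reason a trained tokenizer such as the NLTK one above is usually the safer choice.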
