Resource u'tokenizers/punkt/english.pickle' not found

粉色の甜心 2020-12-13 01:49

My Code:

import nltk.data
tokenizer = nltk.data.load('nltk:tokenizers/punkt/english.pickle')

ERROR Message:

[ec2-user@ip-
Resource u'tokenizers/punkt/english.pickle' not found

17 Answers
  •  别那么骄傲
    2020-12-13 02:38

    import nltk
    nltk.download('punkt')
    

    Open the Python prompt and run the above statements.

    The sent_tokenize function uses an instance of PunktSentenceTokenizer from the nltk.tokenize.punkt module. This instance has already been trained and works well for many European languages. So it knows what punctuation and characters mark the end of a sentence and the beginning of a new sentence.
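
    As a quick check, here is a minimal sketch (the sample sentence below is only an illustration, not from the question): once punkt has been downloaded, sent_tokenize can be called directly.

    from nltk.tokenize import sent_tokenize

    # illustrative sample text, assumed for this example
    text = "This is the first sentence. And here is another one!"

    # sent_tokenize relies on the pre-trained Punkt model downloaded above
    print(sent_tokenize(text))
    # expected output: ['This is the first sentence.', 'And here is another one!']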
