What is NLTK POS tagger asking me to download?

感动是毒 2020-12-03 02:51

I just started using a part-of-speech tagger, and I am facing many problems.

I started POS tagging with the following:

import nltk
text=nltk.word_to         


        
6 Answers
  • 2020-12-03 03:14

    When you run nltk.download() in Python, the NLTK Downloader interface opens.
    Click on Models and choose maxent_treebank_pos_. It is then installed automatically.

    import nltk
    text = nltk.word_tokenize("We are going out.Just you and me.")
    print(nltk.pos_tag(text))
    [('We', 'PRP'), ('are', 'VBP'), ('going', 'VBG'), ('out.Just', 'JJ'),
     ('you', 'PRP'), ('and', 'CC'), ('me', 'PRP'), ('.', '.')]
    
  • 2020-12-03 03:16
    nltk.download()
    

    In the downloader window that opens, click on Models and choose maxent_treebank_pos_. It is then installed automatically.

    import nltk
    text = nltk.word_tokenize("We are going out.Just you and me.")
    print(nltk.pos_tag(text))
    [('We', 'PRP'), ('are', 'VBP'), ('going', 'VBG'), ('out.Just', 'JJ'),
     ('you', 'PRP'), ('and', 'CC'), ('me', 'PRP'), ('.', '.')]
    
  • 2020-12-03 03:23

    From the shell/terminal, you can use:

    python -m nltk.downloader maxent_treebank_pos_tagger
    

    (You might need to run this with sudo on Linux.)

    It will install maxent_treebank_pos_tagger (i.e. the standard treebank POS tagger in NLTK) and fix your issue.
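If you would rather do this from Python and skip the download when the model is already present, the following is a minimal sketch. `ensure_tagger` is a hypothetical helper name, and its `find` and `download` parameters are hypothetical hooks (defaulting to `nltk.data.find` and `nltk.download`) that only exist so the logic can be exercised without network access. It assumes the package is installed under the `taggers/` resource directory.

```python
def ensure_tagger(package="maxent_treebank_pos_tagger", find=None, download=None):
    """Download `package` only if NLTK cannot already find it.

    `find` and `download` are hypothetical injection points; by default
    they resolve to nltk.data.find and nltk.download.
    """
    if find is None or download is None:
        import nltk  # imported lazily so the helper can be tested without NLTK
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        # nltk.data.find raises LookupError when the resource is missing.
        find("taggers/" + package)
        return True
    except LookupError:
        # Assumption: the download call returns a truthy value on success.
        return download(package)
```

Calling `ensure_tagger()` once before `nltk.pos_tag(...)` should be enough; pass a different package name if your NLTK version expects another tagger model.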

  • 2020-12-03 03:24

    If your NLTK version is 3.4.5, run the following:

    import nltk
    nltk.download('averaged_perceptron_tagger')
    

    To check your NLTK version, run the following:

    print(nltk.__version__)
    
  • 2020-12-03 03:29

    For NLTK versions higher than v3.2, please use:

    >>> import nltk
    >>> nltk.__version__
    '3.2.1'
    >>> nltk.download('averaged_perceptron_tagger')
    [nltk_data] Downloading package averaged_perceptron_tagger to
    [nltk_data]     /home/alvas/nltk_data...
    [nltk_data]   Package averaged_perceptron_tagger is already up-to-date!
    True
    

    For NLTK versions using the old MaxEnt model, i.e. v3.1 and below, please use:

    >>> import nltk
    >>> nltk.download('maxent_treebank_pos_tagger')
    [nltk_data] Downloading package maxent_treebank_pos_tagger to
    [nltk_data]     /home/alvas/nltk_data...
    [nltk_data]   Package maxent_treebank_pos_tagger is already up-to-date!
    True
    

    For more details on the change in the default pos_tag, please see https://github.com/nltk/nltk/pull/1143
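The version rule above can be sketched as a small helper that picks the package name from `nltk.__version__`. This is only an illustration: `tagger_package_for` and `download_default_tagger` are hypothetical names, and treating v3.2 itself as using the newer model is an assumption, since the answer only states "higher than v3.2" and "v3.1 and below". Package names may also differ in releases newer than those discussed here.

```python
import re

def tagger_package_for(version):
    """Return the tagger package name suggested above for an NLTK version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    major, minor = int(match.group(1)), int(match.group(2))
    # Assumption: 3.2 and later use the averaged perceptron model;
    # 3.1 and below use the old MaxEnt model.
    if (major, minor) >= (3, 2):
        return "averaged_perceptron_tagger"
    return "maxent_treebank_pos_tagger"

def download_default_tagger():
    """Download the tagger package matching the installed NLTK version."""
    import nltk  # imported lazily; only needed when actually downloading
    return nltk.download(tagger_package_for(nltk.__version__))
```

For example, `tagger_package_for("3.2.1")` gives `'averaged_perceptron_tagger'`, while `tagger_package_for("3.1")` gives `'maxent_treebank_pos_tagger'`.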

  • 2020-12-03 03:29
    import nltk

    text = "Obama delivers his first speech."

    # Split the text into sentences, then tokenize and tag each one.
    for s in nltk.sent_tokenize(text):
        tokens = nltk.word_tokenize(s)
        print(nltk.pos_tag(tokens))
    

    Result :

    akshayy@ubuntu:~/summ$ python nn1.py
    [('Obama', 'NNP'), ('delivers', 'NNS'), ('his', 'PRP$'), ('first', 'JJ'), ('speech', 'NN'), ('.', '.')]

    (I just asked another question where I used this code.)
