How to identify the subject of a sentence?

我在风中等你 2020-12-14 18:53

Can Python + NLTK be used to identify the subject of a sentence? From what I have learned so far, a sentence can be broken into a head and its dependents. For e.g.

6 answers
  •  自闭症患者
    2020-12-14 19:27

    rake_nltk (pip install rake-nltk) is a Python library that wraps NLTK and implements the RAKE (Rapid Automatic Keyword Extraction) algorithm, which ranks the key phrases in a text.

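    # Needs NLTK data such as the stopwords corpus: nltk.download('stopwords')
    from rake_nltk import Rake

    rake = Rake()

    # extract_keywords_from_text() returns None; the results are stored on the Rake object
    rake.extract_keywords_from_text("Can Python + NLTK be used to identify the subject of a sentence?")

    ranked_phrases = rake.get_ranked_phrases()

    # prints the keywords ordered by rank, e.g.
    # ['used', 'subject', 'sentence', 'python', 'nltk', 'identify']
    print(ranked_phrases)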
    
    

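    By default, NLTK's stopword list is used. You can supply your own stopwords (as a list of words, not a file name) and punctuation characters by passing them to the constructor:

    stopwords = open('mystopwords.txt').read().split()  # assumes whitespace-separated words in the file
    rake = Rake(stopwords=stopwords, punctuations=',;:!@#$%^*/\\')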
    

    By default string.punctuation is used for punctuation.

    The constructor also accepts a language keyword, which can be any language for which NLTK provides a stopword list.
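
    To see why the stopword and punctuation lists matter, here is a minimal pure-Python sketch of the idea behind RAKE. It is not rake_nltk's implementation: the names candidate_phrases and rake_rank and the tiny STOPWORDS set are made up for illustration. Stopwords and punctuation act as delimiters that cut the text into candidate phrases, and each phrase is scored by summing degree/frequency over its words.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword set (an assumption; rake_nltk uses NLTK's list).
STOPWORDS = {"can", "be", "to", "the", "of", "a", "is", "and", "in", "for"}


def candidate_phrases(text, stopwords=STOPWORDS):
    """Split text into runs of consecutive non-stopword words."""
    phrases, current = [], []
    # Each match is either a word (group 1) or a single punctuation mark.
    for match in re.finditer(r"(\w+)|[^\w\s]", text.lower()):
        word = match.group(1)
        if word is None or word in stopwords:
            # Punctuation or a stopword closes the current phrase.
            if current:
                phrases.append(tuple(current))
                current = []
        else:
            current.append(word)
    if current:
        phrases.append(tuple(current))
    return phrases


def rake_rank(text, stopwords=STOPWORDS):
    """Return (score, phrase) pairs, highest score first."""
    phrases = candidate_phrases(text, stopwords)
    freq = defaultdict(int)    # how often each word occurs
    degree = defaultdict(int)  # summed length of the phrases the word occurs in
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)

    def score(phrase):
        return sum(degree[w] / freq[w] for w in phrase)

    # Sort by descending score, breaking ties alphabetically.
    unique = sorted(set(phrases), key=lambda p: (-score(p), p))
    return [(score(p), " ".join(p)) for p in unique]


if __name__ == "__main__":
    text = "Keyword extraction is the task of finding compact keyword phrases in a document."
    for phrase_score, phrase in rake_rank(text):
        print(phrase_score, phrase)
```

    Longer phrases made of words that co-occur with many other words score highest, so here "finding compact keyword phrases" ranks first. In the question's sentence every candidate is a single word, so all phrases tie at the same score.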
