Determining whether a word is a noun or not

Submitted by 魔方 西西 on 2019-12-03 03:21:29

If you simply want to check whether or not a single word can be used as a noun, the quickest way might be to build a set of all nouns and then just check the word for membership of that set.

For a list of all nouns you could use the WordNet corpus (which can be accessed through NLTK for example):

>>> from nltk.corpus import wordnet as wn
>>> nouns = {x.name().split('.', 1)[0] for x in wn.all_synsets('n')}
>>> "cook" in nouns
True
>>> "and" in nouns
False
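The membership check above can be wrapped in two small helpers. This is only a sketch: `noun_lemmas` and `can_be_noun` are hypothetical names, not part of NLTK, and the code assumes synset names have the form `lemma.pos.nn` (as in `cook.n.01`) with multiword lemmas joined by underscores.

```python
def noun_lemmas(synset_names):
    """Return the lemma part of WordNet-style synset names such as 'cook.n.01'."""
    # Split from the right so a lemma that itself contains '.' stays intact.
    return {name.rsplit(".", 2)[0].lower() for name in synset_names}


def can_be_noun(word, nouns):
    """Check membership after normalising case and multiword spacing."""
    # Assumption: multiword entries are joined with underscores, e.g. 'hot_dog'.
    key = "_".join(word.lower().split())
    return key in nouns


# Tiny hand-written sample; with NLTK installed the names would come from
# (s.name() for s in wn.all_synsets('n')) instead.
sample = noun_lemmas(["cook.n.01", "hot_dog.n.01"])
print(can_be_noun("Cook", sample))
print(can_be_noun("hot dog", sample))
print(can_be_noun("and", sample))
```

Note that a synset name is derived from only the first lemma of the synset, so a word that appears only as a secondary lemma would be missed. Building the set from `wn.all_lemma_names(pos='n')` instead may give broader coverage.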
Gabor Angeli

I can't speak for the Python wrapper, but if you use the Stanford POS tagger rather than the parser, it should be much quicker. There are wrappers for Stanford CoreNLP, which includes the tagger: https://pypi.python.org/pypi/corenlp-python. It also looks like NLTK has a Stanford tagger module: http://www.nltk.org/_modules/nltk/tag/stanford.html

You may also get better results if you embed the single word in a toy sentence, something like "The X is a thing." Bear in mind that the choice of sentence can itself bias the tagger towards or away from tagging the word as a noun.
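A minimal sketch of the toy-sentence idea follows. The function name `is_noun_in_context`, its parameters, and the template are illustrative assumptions. It expects any tagger callable that maps a list of tokens to `(token, tag)` pairs with Penn Treebank tags (where noun tags begin with `NN`), and it assumes `word` is a single token.

```python
def is_noun_in_context(word, tag_tokens, template="The {} is a thing ."):
    """Tag `word` inside a toy sentence and report whether it gets a noun tag.

    `tag_tokens` is any callable taking a list of tokens and returning a
    list of (token, tag) pairs, one per input token.
    """
    slots = template.split()
    index = slots.index("{}")           # position of the word in the sentence
    tokens = [word if slot == "{}" else slot for slot in slots]
    tagged = tag_tokens(tokens)
    # Penn Treebank noun tags are NN, NNS, NNP and NNPS.
    return tagged[index][1].startswith("NN")
```

With NLTK and its tagger model installed, `nltk.pos_tag` could be passed as `tag_tokens`, for example `is_noun_in_context("cook", nltk.pos_tag)`. Trying a second template such as `"They {} it ."` is one way to see how strongly the frame biases the result.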

Josep Valls

I would second the use of WordNet if you are checking single words. I have also used the freely available TreeTagger: http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/ The binary runs really fast and supports multiple languages. If you need a pure-Python solution, check the NLTK implementation of the Brill tagger: http://www.nltk.org/_modules/nltk/tag/brill.html
