CFG using POS tags in NLTK [closed]

Posted by 无人久伴 on 2019-12-19 05:00:52

Question


I am trying to check if a given sentence is grammatical using NLTK.

Ex:

OK : The whale licks the sadness

NOT OK : The best I ever had

I know that I could do POS tagging, then use a CFG parser and check that way, but I have yet to find a CFG that uses POS tags instead of actual words as terminal branches.

Is there a CFG that anyone can recommend? I think that making my own is silly, because I am not a linguist and will probably leave out important structures.

Also, my application is such that the system would ideally reject many sentences and only approve sentences it is extremely sure of.

Thanks :D


Answer 1:


The terminal nodes of a CFG can be anything, including POS tags. As long as your phrasal rules take POS tags rather than words as their input, you can declare the grammar over POS tags without any problem:

import nltk

# Define the CFG. The terminals are POS tags rather than words.
# (NLTK 3 replaced nltk.parse_cfg with nltk.CFG.fromstring.)
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> 'DT' 'NN'
VP -> 'VB'
VP -> 'VB' 'NN'
""")


# Turn the POS-tag sequence into a list of tokens.
sentence = "DT NN VB NN".split()

# Load the grammar into a ChartParser.
cp = nltk.ChartParser(grammar)

# Print every parse the grammar licenses for the tag sequence.
# (NLTK 3 replaced nbest_parse with parse, which returns an iterator of trees.)
for tree in cp.parse(sentence):
    print(tree)
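
Since the question asks for an accept/reject decision rather than a tree, the parse can be wrapped in a small check. The following is only a sketch under several assumptions: it targets NLTK 3 (`nltk.CFG.fromstring`, `ChartParser.parse`), the function name `is_grammatical` and the `TAG_MAP` table are hypothetical, the rule `VP -> 'VB' NP` is added to the toy grammar purely for illustration, and the example tag sequences are hand-written Penn Treebank-style tags rather than real tagger output. A tag the grammar does not mention makes the parser raise `ValueError`, which is treated here as a rejection, in line with the goal of approving only sentences the system is sure of.

```python
import nltk

# Toy grammar over POS tags: the answer's rules plus one illustrative
# rule (VP -> 'VB' NP). It is nowhere near a complete grammar of English.
TAG_GRAMMAR = nltk.CFG.fromstring("""
S -> NP VP
NP -> 'DT' 'NN'
VP -> 'VB'
VP -> 'VB' 'NN'
VP -> 'VB' NP
""")
PARSER = nltk.ChartParser(TAG_GRAMMAR)

# Hypothetical table that collapses fine-grained tags onto the
# coarser tags the toy grammar knows about.
TAG_MAP = {"VBZ": "VB", "VBD": "VB", "VBP": "VB", "NNS": "NN"}


def is_grammatical(tags):
    """Return True only if the tag sequence has at least one parse."""
    tokens = [TAG_MAP.get(tag, tag) for tag in tags]
    try:
        # parse() returns an iterator of trees; one tree is enough to accept.
        return next(iter(PARSER.parse(tokens)), None) is not None
    except ValueError:
        # Raised when a token is not covered by the grammar: reject.
        return False


# Hand-written tags for "The whale licks the sadness".
print(is_grammatical(["DT", "NN", "VBZ", "DT", "NN"]))
# Hand-written tags for "The best I ever had" (JJS, PRP, RB are not in the grammar).
print(is_grammatical(["DT", "JJS", "PRP", "RB", "VBD"]))
```

With a grammar this small almost everything is rejected, which is the conservative behaviour the question asks for; broadening coverage means adding rules, not changing the check.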


Source: https://stackoverflow.com/questions/15003136/cfg-using-pos-tags-in-nltk
