Question
I want to generate sentences from a grammar retrieved from the Stanford parser, but NLTK is not able to interpret PRP$.
from nltk.parse.stanford import StanfordParser
from nltk.grammar import CFG
from nltk.parse.generate import generate
sp=StanfordParser(
    model_path='/home/aman/stanford_resource/stanford-parser-full-2014-06-16/stanford-parser-3.4-models/edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz',
    path_to_jar='/home/aman/stanford_resource/stanford-parser-full-2014-06-16/stanford-parser.jar',
    path_to_models_jar='/home/aman/stanford_resource/stanford-postagger-full-2014-08-27/stanford-postagger-3.4.1.jar')
sent1='He killed the tiger in his pants'
parse_result=sp.raw_parse(sent1)
grammar_list=[]
for p in parse_result:
    l=p.productions()
    grammar_string='\n'.join(map(str,l))
    grammar=CFG.fromstring(grammar_string)
    #grammar_list.append(grammar)
    #for s in generate(grammar,n=3):
    #    print s
ValueError: Unable to parse line 11: NP -> PRP$ NNS
Expected a nonterminal, found: $ NNS
How can I make this work? Should I specifically instruct NLTK to handle these grammar categories?
Answer 1:
ValueError: Unable to parse line 11: NP -> PRP$ NNS
Expected a nonterminal, found: $ NNS
I've no idea why you are trying to combine a hand-built CFG with the output of the Stanford parser, but here's a solution to this problem:
A quick inspection of nltk/grammar.py shows that $ is not a legal character for a non-terminal name. This can be easily corrected by patching the module like this:
import nltk
import re
nltk.grammar._STANDARD_NONTERM_RE = re.compile(r'( [\w/][\w$/^<>-]* ) \s*', re.VERBOSE)
In the above I just added $ to the regexp that you'll find in nltk/grammar.py. You can then create and use grammars that have $ in their productions:
grammar = nltk.grammar.CFG.fromstring("NP -> PRP$ NNS")
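To see why adding $ to the character class is enough, the two patterns can be compared using only the standard re module. This is a rough sketch, not NLTK's actual code: it assumes the unpatched pattern is the one above minus the $, as described in this answer, and match_nonterm is a hypothetical helper written only for this illustration.

```python
import re

# Assumed unpatched pattern: same as the patched one, but without `$`.
ORIGINAL_RE = re.compile(r'( [\w/][\w/^<>-]* ) \s*', re.VERBOSE)
# Patched pattern from the answer: `$` is allowed after the first character.
PATCHED_RE = re.compile(r'( [\w/][\w$/^<>-]* ) \s*', re.VERBOSE)

def match_nonterm(pattern, text, pos=0):
    """Hypothetical helper: return (matched name, unconsumed remainder)."""
    m = pattern.match(text, pos)
    return m.group(1), text[m.end():]

rhs = "PRP$ NNS"  # right-hand side of the failing production

original_result = match_nonterm(ORIGINAL_RE, rhs)
patched_result = match_nonterm(PATCHED_RE, rhs)

print(original_result)  # ('PRP', '$ NNS')
print(patched_result)   # ('PRP$', 'NNS')
```

With the unpatched pattern the match stops at PRP and leaves "$ NNS" unconsumed, which is exactly the text reported in the error message. With the patched pattern the whole PRP$ is read as one non-terminal name.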
Source: https://stackoverflow.com/questions/33207799/nltk-cant-interpret-grammar-category-prp-output-by-stanford-parser