Stanford NLP - VP vs NP

Submitted by 陌路散爱 on 2019-12-10 21:27:48

Question


I have one example where Stanford NLP outputs a weird parse tree for the sentence:

Clean my desk

(ROOT
  (NP
    (NP (JJ Clean))
    (NP (PRP$ my) (NN desk))))

As you can see, it tags the word Clean as an adjective dependent on the noun desk, with the whole phrase tagged as a Noun Phrase. My expectation is for Clean to be tagged as a verb, and the phrase as a Verb Phrase.

The JJ-PRP$-NN combination simply doesn't make sense in English to me. Has anyone run into something similar? I know that Stanford NLP results sometimes differ based on the sequence (?) of parsing tools run. How can I get it to tag this properly?


Answer 1:


CoreNLP is notoriously bad at these sorts of imperative statements. This error is likely from the POS tagger mis-tagging "clean" as an adjective, although it appears the parser is also making the same mistake.




Answer 2:


As it happens, if you feed the sentence "Clean my desk" directly to the parser (actually, the 'tokenize', 'ssplit' and 'parse' tools), it gives the following result:

(ROOT (NP (NP (NNP Clean)) (NP (PRP$ my) (NN desk))))

However, now "Clean" is a Proper Noun - very clever, Stanford. So, if we feed the sentence in with the first word in lowercase - "clean my desk" - we finally get what we are looking for:

(ROOT (S (VP (VB clean) (NP (PRP$ my) (NN desk)))))
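To compare the three outputs in this thread at a glance, the tag assigned to each word can be pulled out of the bracketed tree text. This is only a rough sketch: `leaf_tags` is a hypothetical helper written for this post, not part of CoreNLP, and it assumes each leaf appears as `(TAG word)` with a single space, as in the trees above.

```python
import re

# Matches innermost "(TAG word)" pairs; assumes one space between tag and word.
LEAF = re.compile(r"\(([^\s()]+) ([^\s()]+)\)")

def leaf_tags(tree: str) -> list[tuple[str, str]]:
    """Return (tag, word) pairs for every leaf in a bracketed parse tree."""
    return LEAF.findall(tree)

# The three trees quoted in this thread.
trees = {
    "full pipeline": "(ROOT (NP (NP (JJ Clean)) (NP (PRP$ my) (NN desk))))",
    "parser only": "(ROOT (NP (NP (NNP Clean)) (NP (PRP$ my) (NN desk))))",
    "lowercased": "(ROOT (S (VP (VB clean) (NP (PRP$ my) (NN desk)))))",
}

for label, tree in trees.items():
    # The first leaf is the word whose tag changes between runs.
    print(label, leaf_tags(tree)[0])
```

Only the tag on the first word differs between the runs; my and desk get PRP$ and NN every time.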

Be careful not to convert the full sentence to lowercase. While testing, I noticed that the word "I" turned into lowercase "i" is tagged as FW (Foreign Word), so only convert the first word to lowercase.
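A minimal sketch of that preprocessing step, to be applied before the text is handed to the parser. The function name `lowercase_first_word` is made up for illustration, and the guard that leaves a sentence-initial "I" (or a contraction such as "I'm") untouched is my own assumption based on the FW observation above. It will also lowercase a sentence-initial proper noun, which may not be what you want.

```python
def lowercase_first_word(sentence: str) -> str:
    """Lowercase only the first whitespace-delimited word of a sentence."""
    stripped = sentence.lstrip()
    if not stripped:
        return sentence  # nothing to change in an empty or blank string

    parts = stripped.split(maxsplit=1)
    first = parts[0]

    # Assumption: keep the pronoun "I" (and forms like "I'm") capitalized,
    # since lowercase "i" was observed to be tagged FW.
    if first == "I" or first.startswith("I'"):
        return stripped

    first = first.lower()
    return first if len(parts) == 1 else f"{first} {parts[1]}"


print(lowercase_first_word("Clean my desk"))
print(lowercase_first_word("I think I should clean my desk"))
```

The rest of the sentence is passed through unchanged, so any "I" later in the sentence keeps its capital letter.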



Source: https://stackoverflow.com/questions/35872324/stanford-nlp-vp-vs-np
