How to determine subject, object and other words?

别来无恙 提交于 2019-12-19 05:48:41

问题


I'm trying to implement application that can determine meaning of sentence, by dividing it to smaller pieces. So I need to know what words are subject, object etc. so that my program can know how to handle this sentence.


回答1:


This is an open research problem. You can get an overview on Wikipedia, http://en.wikipedia.org/wiki/Natural_language_processing. Consider phrases like "Time flies like an arrow, fruit flies like a banana" - unambiguously classifying words is not easy.




回答2:


You should look at the Natural Language Toolkit, which is for exactly this sort of thing.

See this section of the manual: Categorizing and Tagging Words - here's an extract:

>>> text = nltk.word_tokenize("And now for something completely different")
>>> nltk.pos_tag(text)
[('And', 'CC'), ('now', 'RB'), ('for', 'IN'), ('something', 'NN'),
('completely', 'RB'), ('different', 'JJ')]

"Here we see that and is CC, a coordinating conjunction; now and completely are RB, or adverbs; for is IN, a preposition; something is NN, a noun; and different is JJ, an adjective."




回答3:


I guess there is not "simple" way to do this. You have to build a linguistic analyzer (which is quite possible), however, a language as a lot of exceptional cases. And that is what makes implementing a linguistic analyzer that hard.




回答4:


The specific problem you mention, the identification of the subject and objects of a clause, is accomplished by syntactic parsing. You can get a good idea of how parsing works by using this demo of parsing software developed by Stanford University.

However, syntactic parsing does not determine the meanining of a sentence, only its structure. Determining meaning (semantics) is a very hard problem in general and there is no technology that can really 'understand' a sentence in the same way that a human would. Although there is no general solution, you may be able to do something in a very restricted subject domain. For example, is the data you want to analyse about a narrow topic with a limited set of 'things' that people talk about?




回答5:


StompChicken has given the right answer to this question, but I'd like to add that the concepts of subject and object are known as grammatical relations, and that Briscoe and Carroll's RASP is a parser that can go the extra step of deducing a list of relations from the parse.

Here's some example output from their demo page. It's an extract from the output for a sentence that begins "We describe a robust accurate domain-independent approach...":

(|ncsubj| |describe:2_VV0| |We:1_PPIS2| _)
(|dobj| |describe:2_VV0| |approach:7_NN1|)



来源:https://stackoverflow.com/questions/1318197/how-to-determine-subject-object-and-other-words

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!