Question
I want to identify the subject and objects in a set of sentences. My actual task is to identify cause and effect in a set of review data.
I am using the spaCy package to chunk and parse the data, but I am not reaching my goal. Is there a way to do this?
For example, given the input:
    I thought it was the complete set
the expected output is:
    subject    object
    I          complete set
Answer 1:
The simplest way is to parse the text with spaCy and read each token's dependency label from token.dep_:
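import spacy

# spaCy 3 removed the 'en' shortcut; this assumes the en_core_web_sm model is installed
nlp = spacy.load("en_core_web_sm")
parsed_text = nlp("I thought it was the complete set")

# Collect tokens by dependency label. Lists are used because a sentence can have
# several subjects (here "I" and "it") or no object at all, which would otherwise
# leave a variable undefined and raise a NameError at the print calls.
subjects, direct_objects, indirect_objects = [], [], []
for token in parsed_text:
    if token.dep_ == "nsubj":                # nominal subject
        subjects.append(token.text)
    elif token.dep_ in ("iobj", "dative"):   # indirect object; English models usually use "dative"
        indirect_objects.append(token.text)
    elif token.dep_ == "dobj":               # direct object
        direct_objects.append(token.text)

print(subjects)
print(direct_objects)
print(indirect_objects)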
Answer 2:
You can use noun chunks.
Code:
import spacy
nlp = spacy.load("en_core_web_sm")  # assumes this model is installed
doc = nlp("I thought it was the complete set")
for nc in doc.noun_chunks:
    print(nc.text)
Result:
I
it
the complete set
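To pick only "I" instead of both "I" and "it", first test each candidate and keep only the nsubj to the left of ROOT, that is, the subject whose head is the ROOT verb ("thought") rather than the verb of the embedded clause ("was").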
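A minimal sketch of that test, combined with the dependency labels from Answer 1. The helper names main_subject and object_phrases and the OBJECT_DEPS set are hypothetical, not part of spaCy. Treating "attr" as an object-like label is an assumption made so that "the complete set" (the complement of "was") is returned. To keep the example runnable without downloading a model, the parse is written out by hand; with a loaded pipeline you would pass nlp(text) instead, and the labels may differ from the ones assumed here.

```python
import spacy
from spacy.tokens import Doc

# Assumption: labels treated as "object-like" for this use case.
# "attr" is included because copular complements ("it was the complete set")
# are not labelled dobj.
OBJECT_DEPS = {"dobj", "dative", "iobj", "attr"}


def main_subject(doc):
    """Return the text of the subject attached directly to ROOT, or None."""
    for token in doc:
        if token.dep_ in ("nsubj", "nsubjpass") and token.head.dep_ == "ROOT":
            return token.text
    return None


def object_phrases(doc):
    """Return each object-like token together with its left modifiers."""
    return [
        doc[token.left_edge.i : token.i + 1].text
        for token in doc
        if token.dep_ in OBJECT_DEPS
    ]


# Hand-annotated parse (an assumption standing in for a trained parser).
# heads are absolute token indices; the ROOT token points at itself.
words = ["I", "thought", "it", "was", "the", "complete", "set"]
heads = [1, 1, 3, 1, 6, 6, 3]
deps = ["nsubj", "ROOT", "nsubj", "ccomp", "det", "amod", "attr"]

nlp = spacy.blank("en")  # blank pipeline: tokenizer and vocab only, no model download
doc = Doc(nlp.vocab, words=words, heads=heads, deps=deps)

print(main_subject(doc))
print(object_phrases(doc))
```

With this hand-annotated parse the subject is "I" and the only object phrase is "the complete set"; "it" is skipped because its head is "was", not the ROOT. Dropping the determiner to get "complete set" would be a further filtering step on token.dep_ == "det".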
Source: https://stackoverflow.com/questions/37297399/subject-object-identification-in-python