Better way to use SpaCy to parse sentences?

Submitted by 依然范特西╮ on 2019-12-13 03:31:24

Question


I'm using spaCy to find sentences containing 'is' or 'was' whose subject is a pronoun, and to return the object of each such sentence. My code works, but I feel like there must be a much better way to do this.

import spacy
nlp = spacy.load('en_core_web_sm')

ex_phrase = nlp("He was a genius. I really liked working with him. He is a dog owner. She is very kind to animals.")


#create an empty list to hold any instance of this particular construction
list_of_responses = []

#split into sentences
for sent in ex_phrase.sents:
    for token in sent:
        #check to see if the word 'was' or 'is' is in each sentence, if so, make a list of the verb's constituents
        if token.text == 'was' or token.text == 'is':
            dependency = [child for child in token.children]
            #if the first constituent is a pronoun, make sent_object equal to the item at index 1 in the list of constituents
            if dependency[0].pos_ == 'PRON':
                sent_object = dependency[1]

    #create a string of the entire object of the verb. For instance, if sent_object = 'genius', this would create a string 'a genius'
    for token in sent:
        if token == sent_object:
            whole_constituent = [t.text for t in token.subtree]
            whole_constituent = " ".join(whole_constituent)

    #check to see what the pronoun was, and depending on if it was 'he' or 'she', construct a coherent followup sentence
    if dependency[0].text.lower() == 'he':
        returning_phrase = f"Why do you think him being {whole_constituent} helped the two of you get along?"
    elif dependency[0].text.lower() == 'she':
        returning_phrase = f"Why do you think her being {whole_constituent} helped the two of you get along?"

    #add each followup sentence to the list. For some reason it creates a lot of duplicates, so I have to use set
    list_of_responses.append(returning_phrase)
    list_of_responses = list(set(list_of_responses))

Answer 1:


Your code seems to do something more complicated than what you describe in the question, so I have tried to reproduce what the code appears to be aiming for. Getting the object or attribute of the verb "is" or "was" is only one part of that.

import spacy
from pprint import pprint

nlp = spacy.load('en_core_web_sm')  # the 'en' shortcut was removed in spaCy v3

text = "He was a genius. I really liked working with him. He is a dog owner. She is very kind to animals."

def get_pro_nsubj(token):
    # get the (lowercased) subject of the verb, or None if it has no nsubj child
    return next((child.lower_ for child in token.children if child.dep_ == 'nsubj'), None)

list_of_responses = []

# a mapping of subject to object pronouns
subj_obj_pro_map = {'he': 'him',
                    'she': 'her'
                    }

for token in nlp(text):
    if token.pos_ in ['NOUN', 'ADJ']:
        if token.dep_ in ['attr', 'acomp'] and token.head.lower_ in ['is', 'was']:
            # to test for lemma 'be' use token.head.lemma_ == 'be'
            nsubj = get_pro_nsubj(token.head)
            if nsubj in ['he', 'she']:
                # get the text of each token in the constituent and join it all together
                whole_constituent = ' '.join([t.text for t in token.subtree])
                obj_pro = subj_obj_pro_map[nsubj] # convert subject to object pronoun
                returning_phrase = 'Why do you think {} being {} helped the two of you get along?'.format(obj_pro, whole_constituent)
                list_of_responses.append(returning_phrase)

pprint(list_of_responses)

Which outputs this:

['Why do you think him being a genius helped the two of you get along?',
 'Why do you think him being a dog owner helped the two of you get along?',
 'Why do you think her being very kind to animals helped the two of you get '
 'along?']
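
As for the duplicates mentioned in the question, a likely cause is that the append runs once per sentence, whether or not that sentence contained a match, so values left over from the previous sentence get appended again. Here is a minimal sketch of that effect in plain Python. It does not use spaCy: the `matches` list is a hypothetical stand-in for the per-sentence parse results, and both function names are made up for illustration.

```python
# Hypothetical stand-in for the parse: one entry per sentence,
# holding the matched constituent, or None when the sentence has no match.
matches = ["a genius", None, "a dog owner", None]


def collect_stale(matches):
    """Mimics the structure of the question's loop: append on every sentence."""
    responses = []
    phrase = None
    for match in matches:
        if match is not None:
            phrase = f"him being {match}"
        # This runs even when nothing matched, so the old phrase is reused.
        responses.append(phrase)
    return responses


def collect_guarded(matches):
    """Appends only when the current sentence actually produced a match."""
    responses = []
    for match in matches:
        if match is None:
            continue
        responses.append(f"him being {match}")
    return responses


print(collect_stale(matches))
print(collect_guarded(matches))
```

The guarded version has the same shape as the loop above, where the append sits inside the condition that found the match, so no `set` call is needed to clean up afterwards.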


Source: https://stackoverflow.com/questions/56977820/better-way-to-use-spacy-to-parse-sentences
