How to find the shortest dependency path between two words in Python?

无人共我 2021-02-07 07:25

I am trying to find the dependency path between two words in Python, given a dependency tree.

For the sentence

Robots in popular culture are there to remind u

3 Answers
  •  忘掉有多难
    2021-02-07 07:51

    HugoMailhot's answer is great. Here is something similar for spaCy users who want to find the shortest dependency path between two words (HugoMailhot's answer relies on practNLPTools).

    The sentence:

    Robots in popular culture are there to remind us of the awesomeness of unbound human agency.

    has the following dependency tree (image not preserved):

    Here is the code to find the shortest dependency path between two words:

    import networkx as nx
    import spacy

    # Written for spaCy 1.x. Newer spaCy releases load a named model such as
    # 'en_core_web_sm' and call nlp(text) without the parse=True argument.
    nlp = spacy.load('en')
    
    # https://spacy.io/docs/usage/processing-text
    document = nlp(u'Robots in popular culture are there to remind us of the awesomeness of unbound human agency.', parse=True)
    
    print('document: {0}'.format(document))
    
    # Load spaCy's dependency tree into a networkx graph.
    # Each node is named '<lowercased word>-<token index>' so repeated words stay distinct.
    edges = []
    for token in document:
        # FYI https://spacy.io/docs/api/token
        for child in token.children:
            edges.append(('{0}-{1}'.format(token.lower_, token.i),
                          '{0}-{1}'.format(child.lower_, child.i)))
    
    # Undirected graph, so a path may go up to a common ancestor and back down
    graph = nx.Graph(edges)
    
    # https://networkx.github.io/documentation/networkx-1.10/reference/algorithms.shortest_paths.html
    print(nx.shortest_path_length(graph, source='robots-0', target='awesomeness-11'))
    print(nx.shortest_path(graph, source='robots-0', target='awesomeness-11'))
    print(nx.shortest_path(graph, source='robots-0', target='agency-15'))
    

    Output:

    4
    ['robots-0', 'are-4', 'remind-7', 'of-9', 'awesomeness-11']
    ['robots-0', 'are-4', 'remind-7', 'of-9', 'awesomeness-11', 'of-12', 'agency-15']
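
For readers without spaCy or networkx installed, here is a minimal sketch of the same idea in plain Python: treat the dependency tree as an undirected graph and run a breadth-first search. The `edges` list is hand-written for illustration. It contains only the head/child pairs implied by the output above plus two hypothetical extra edges, so it is not spaCy's full parse. The `shortest_path` helper is likewise an illustrative stand-in, not the networkx implementation.

```python
from collections import deque

# Hand-written (head, child) pairs using the same '<word>-<index>' naming.
# The first six pairs are taken from the output above; the last two are
# hypothetical and only there to give the search some branches to ignore.
edges = [
    ('are-4', 'robots-0'),
    ('are-4', 'remind-7'),
    ('remind-7', 'of-9'),
    ('of-9', 'awesomeness-11'),
    ('awesomeness-11', 'of-12'),
    ('of-12', 'agency-15'),
    ('robots-0', 'in-1'),    # hypothetical edge
    ('remind-7', 'us-8'),    # hypothetical edge
]


def shortest_path(edges, source, target):
    """Return the shortest undirected path from source to target, or None."""
    # Build an undirected adjacency map from the edge list.
    adjacency = {}
    for head, child in edges:
        adjacency.setdefault(head, set()).add(child)
        adjacency.setdefault(child, set()).add(head)
    if source not in adjacency or target not in adjacency:
        return None

    # Breadth-first search, remembering how each node was reached.
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back from the target to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adjacency[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


path = shortest_path(edges, 'robots-0', 'awesomeness-11')
print(len(path) - 1)  # number of edges on the path
print(path)
```

Because a dependency tree has exactly one path between any two tokens, breadth-first search is enough here; no edge weights are needed.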
    

    To install spacy and networkx:

    sudo pip install networkx 
    sudo pip install spacy
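    sudo python -m spacy.en.download parser # spaCy 1.x command; will take 0.5 GB
    # On newer spaCy releases, download a named model instead, e.g.: python -m spacy download en_core_web_sm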
    

    Some benchmarks regarding spacy's dependency parsing: https://spacy.io/docs/api/
