I try to find the dependency path between two words in Python given dependency tree.
For sentence
Robots in popular culture are there to remind u
HugoMailhot's answer is great. I'll write something similar for spacy users who want to find the shortest dependency path between two words (whereas HugoMailhot's answer relies on practNLPTools).
The sentence:
Robots in popular culture are there to remind us of the awesomeness of unbound human agency.
has the following dependency tree:
Here is the code to find the shortest dependency path between two words:
import networkx as nx
import spacy
nlp = spacy.load('en')
# https://spacy.io/docs/usage/processing-text
document = nlp(u'Robots in popular culture are there to remind us of the awesomeness of unbound human agency.', parse=True)
print('document: {0}'.format(document))
# Load spacy's dependency tree into a networkx graph
edges = []
for token in document:
# FYI https://spacy.io/docs/api/token
for child in token.children:
edges.append(('{0}-{1}'.format(token.lower_,token.i),
'{0}-{1}'.format(child.lower_,child.i)))
graph = nx.Graph(edges)
# https://networkx.github.io/documentation/networkx-1.10/reference/algorithms.shortest_paths.html
print(nx.shortest_path_length(graph, source='robots-0', target='awesomeness-11'))
print(nx.shortest_path(graph, source='robots-0', target='awesomeness-11'))
print(nx.shortest_path(graph, source='robots-0', target='agency-15'))
Output:
4
['robots-0', 'are-4', 'remind-7', 'of-9', 'awesomeness-11']
['robots-0', 'are-4', 'remind-7', 'of-9', 'awesomeness-11', 'of-12', 'agency-15']
To install spacy and networkx:
sudo pip install networkx
sudo pip install spacy
sudo python -m spacy.en.download parser # will take 0.5 GB
Some benchmarks regarding spacy's dependency parsing: https://spacy.io/docs/api/