Question
I am using NLTK's Tree data structure. Below is a sample nltk.Tree:
(S
  (S
    (ADVP (RB recently))
    (NP (NN someone))
    (VP
      (VBD mentioned)
      (NP (DT the) (NN word) (NN malaria))
      (PP (TO to) (NP (PRP me)))))
  (, ,)
  (CC and)
  (IN so)
  (S
    (NP
      (NP (CD one) (JJ whole) (NN flood))
      (PP (IN of) (NP (NNS memories))))
    (VP (VBD came) (S (VP (VBG pouring) (ADVP (RB back))))))
  (. .))
I am not familiar with the nltk.Tree data structure. I want to extract the labels of the parent and the grandparent ("super parent") node for every leaf node. For example, for 'recently' I want (ADVP, RB), and for 'someone' I want (NP, NN). An earlier answer used the eval() function to do this, which I want to avoid. This is the final outcome I want:
[('ADVP', 'RB'), ('NP', 'NN'), ('VP', 'VBD'), ('NP', 'DT'), ('NP', 'NN'), ('NP', 'NN'), ('PP', 'TO'), ('NP', 'PRP'), ('S', 'CC'), ('S', 'IN'), ('NP', 'CD'), ('NP', 'JJ'), ('NP', 'NN'), ('PP', 'IN'), ('NP', 'NNS'), ('VP', 'VBD'), ('VP', 'VBG'), ('ADVP', 'RB')]
Answer 1:
Python code that does this without eval(), using the nltk.Tree data structure:
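import nltk

def tails(items, path=()):
    # Walk the tree depth-first, carrying the labels of all ancestors in `path`.
    for child in items:
        if isinstance(child, nltk.Tree):
            if child.label() in {".", ","}:  # ignore punctuation
                continue
            yield from tails(child, path + (child.label(),))
        else:
            # `child` is a leaf (a plain string): emit its two nearest ancestor labels.
            yield path[-2:]

# Parse the bracketed string into an nltk.Tree instead of calling eval().
tree = nltk.Tree.fromstring("""
(S
  (S
    (ADVP (RB recently))
    (NP (NN someone))
    (VP
      (VBD mentioned)
      (NP (DT the) (NN word) (NN malaria))
      (PP (TO to) (NP (PRP me)))))
  (, ,)
  (CC and)
  (IN so)
  (S
    (NP
      (NP (CD one) (JJ whole) (NN flood))
      (PP (IN of) (NP (NNS memories))))
    (VP (VBD came) (S (VP (VBG pouring) (ADVP (RB back))))))
  (. .))
""")

# Wrap the tree in a list so the root label "S" becomes part of the path;
# otherwise top-level nodes such as (CC and) would yield only ('CC',).
print(list(tails([tree])))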
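If the parse is only available as a bracketed string, the same (grandparent, parent) pairs can also be collected without NLTK and without eval() by scanning the tokens and keeping a stack of open labels. This is only a rough sketch, not part of the original answer: the function name parent_pairs and its skip parameter are made up for illustration, and it assumes well-formed input in which labels and leaves contain no whitespace or parentheses.

```python
import re

def parent_pairs(bracketed, skip=(".", ",")):
    """Yield (grandparent, parent) label pairs for each leaf of a bracketed parse."""
    # A token is "(", ")" or a run of characters with no whitespace or parentheses.
    tokens = re.findall(r"\(|\)|[^\s()]+", bracketed)
    stack = []            # labels of the constituents that are currently open
    expect_label = False  # True right after "(": the next token is a node label
    for tok in tokens:
        if tok == "(":
            expect_label = True
        elif tok == ")":
            stack.pop()   # the innermost constituent is now closed
        elif expect_label:
            stack.append(tok)
            expect_label = False
        elif stack[-1] not in skip:
            # A leaf whose parent is not punctuation: report its two nearest ancestors.
            yield tuple(stack[-2:])

sample = "(S (S (ADVP (RB recently)) (NP (NN someone))) (, ,) (CC and) (. .))"
print(list(parent_pairs(sample)))
# [('ADVP', 'RB'), ('NP', 'NN'), ('S', 'CC')]
```

Because the root label stays on the stack, top-level words such as 'and' get the root as their grandparent, which matches the ('S', 'CC') entries in the desired output.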
Source: https://stackoverflow.com/questions/29397460/extract-parent-and-child-node-from-python-tree