How to navigate a nltk.tree.Tree?

Anonymous (unverified), posted 2019-12-03 01:59:02

Question:

I've chunked a sentence using:

    grammar = '''
        NP: {<DT>*(<NN.*>|<JJ.*>)*<NN.*>}
        NVN: {<NP><VB.*><NP>}
        '''
    chunker = nltk.chunk.RegexpParser(grammar)
    tree = chunker.parse(tagged)
    print(tree)

The result looks like:

    (S
      (NVN
        (NP The_Pigs/NNS)
        are/VBP
        (NP a/DT Bristol-based/JJ punk/NN rock/NN band/NN))
      that/WDT
      formed/VBN
      in/IN
      1977/CD
      ./.)

But now I'm stuck trying to figure out how to navigate that. I want to be able to find the NVN subtree, and access the left-side noun phrase ("The_Pigs"), the verb ("are") and the right-side noun phrase ("a Bristol-based punk rock band"). How do I do that?

Answer 1:

Try:

    import nltk

    ROOT = 'ROOT'
    tree = ...  # your parsed nltk.Tree

    def getNodes(parent):
        # Walk the tree depth-first, printing every subtree and leaf
        for node in parent:
            if isinstance(node, nltk.Tree):
                if node.label() == ROOT:
                    print("======== Sentence =========")
                    print("Sentence:", " ".join(node.leaves()))
                else:
                    print("Label:", node.label())
                    print("Leaves:", node.leaves())

                getNodes(node)
            else:
                print("Word:", node)

    getNodes(tree)


Answer 2:

You could, of course, write your own depth-first search, but there is an easier (better) way. If you want every subtree rooted at NVN, use Tree's subtrees method with the filter parameter defined.

    >>> print(t)
    (S
        (NVN
            (NP The_Pigs/NNS)
            are/VBP
            (NP a/DT Bristol-based/JJ punk/NN rock/NN band/NN))
        that/WDT
        formed/VBN
        in/IN
        1977/CD
        ./.)
    >>> for i in t.subtrees(filter=lambda x: x.label() == 'NVN'):
    ...     print(i)
    ...
    (NVN
        (NP The_Pigs/NNS)
        are/VBP
        (NP a/DT Bristol-based/JJ punk/NN rock/NN band/NN))
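To get from the NVN subtree to the three pieces the question asks for, one option is to unpack its children directly. This is only a minimal sketch, assuming every NVN chunk has exactly three children (NP, verb, NP), as the grammar in the question implies. The tree is rebuilt with Tree.fromstring purely so the snippet runs on its own, and words() is a hypothetical helper name, not part of NLTK.

```python
from nltk.tree import Tree
from nltk.tag import str2tuple

# Rebuild the chunked tree from its printed form; read_leaf turns
# "are/VBP" back into the ('are', 'VBP') tuples the chunker produces.
t = Tree.fromstring(
    "(S (NVN (NP The_Pigs/NNS) are/VBP "
    "(NP a/DT Bristol-based/JJ punk/NN rock/NN band/NN)) "
    "that/WDT formed/VBN in/IN 1977/CD ./.)",
    read_leaf=str2tuple,
)

def words(node):
    """Hypothetical helper: plain text of a subtree or of a (word, tag) leaf."""
    if isinstance(node, Tree):
        return " ".join(word for word, tag in node.leaves())
    return node[0]

results = []
for nvn in t.subtrees(filter=lambda x: x.label() == 'NVN'):
    # Assumption: the children are, in order, NP, verb, NP
    left_np, verb, right_np = nvn
    results.append((words(left_np), words(verb), words(right_np)))

print(results)
```

Since a Tree is a list subclass, nvn[0], nvn[1] and nvn[2] would work just as well as the tuple unpacking.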


Answer 3:

Try this:

    for a in tree:
        if isinstance(a, nltk.Tree):
            if a.label() == 'NVN':  # This climbs into your NVN tree
                for b in a:
                    if isinstance(b, nltk.Tree) and b.label() == 'NP':
                        print(b.leaves())  # This outputs your "NP"
                    else:
                        print(b)  # This outputs your "VB.*"

It outputs this:

    [('The_Pigs', 'NNS')]
    ('are', 'VBP')
    [('a', 'DT'), ('Bristol-based', 'JJ'), ('punk', 'NN'), ('rock', 'NN'), ('band', 'NN')]



Answer 4:

Here's a code sample for generating all the subtrees with the label 'NP':

    def filt(x):
        return x.label() == 'NP'

    for subtree in t.subtrees(filter=filt):  # Generate all subtrees labeled 'NP'
        print(subtree)

For siblings, you might want to take a look at the methods ParentedTree.left_sibling() and ParentedTree.right_sibling().
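As a rough illustration of sibling navigation, the sketch below converts the tree to a ParentedTree and asks each NP for its neighbours. It assumes the same chunked tree as in the question, rebuilt here with Tree.fromstring so the snippet is self-contained. The variable names are only illustrative.

```python
from nltk.tree import Tree, ParentedTree
from nltk.tag import str2tuple

# Same tree as in the question, rebuilt from its printed form
t = Tree.fromstring(
    "(S (NVN (NP The_Pigs/NNS) are/VBP "
    "(NP a/DT Bristol-based/JJ punk/NN rock/NN band/NN)) "
    "that/WDT formed/VBN in/IN 1977/CD ./.)",
    read_leaf=str2tuple,
)

# A plain Tree has no parent pointers; convert it first
pt = ParentedTree.convert(t)

siblings = []
for np in pt.subtrees(filter=lambda x: x.label() == 'NP'):
    # left_sibling()/right_sibling() return None at the edge of the parent
    siblings.append((np.left_sibling(), np.right_sibling()))
    print(np.leaves(), "| parent:", np.parent().label())
    print("  left sibling: ", np.left_sibling())
    print("  right sibling:", np.right_sibling())
```

Note that a sibling can be either another subtree or a plain (word, tag) leaf, so check with isinstance before calling Tree methods on it.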

For more details, here are some useful links:

http://www.nltk.org/howto/tree.html # some basic usage and examples
http://nbviewer.ipython.org/github/gmonce/nltk_parsing/blob/master/1.%20NLTK%20Syntax%20Trees.ipynb # a notebook that plays with these methods
http://www.nltk.org/_modules/nltk/tree.html # the full API, with source


