How to print out tags in Python

Submitted by 社会主义新天地 on 2020-01-05 09:33:37

Question


If I have a string such as this:

text = "They refuse to permit us."

txt = nltk.word_tokenize(text)

If I then print the POS tags with nltk.pos_tag(txt), I get:

[('They', 'PRP'), ('refuse', 'VBP'), ('to', 'TO'), ('permit', 'VB'), ('us', 'PRP'), ('.', '.')]

How can I print out only the tags, like this:

['PRP', 'VBP', 'TO', 'VB', 'PRP', '.']


Answer 1:


You get a list of tuples, so iterate through it and take only the second element of each tuple.

>>> tagged = nltk.pos_tag(txt)
>>> tags = [e[1] for e in tagged]
>>> tags
['PRP', 'VBP', 'TO', 'VB', 'PRP', '.']
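
As a minimal sketch of the same idea that runs without NLTK installed, the example below uses a hard-coded list in place of the result of nltk.pos_tag(txt). The sample data and the names tags and tags_alt are assumptions made for illustration, not part of the original answer.

```python
from operator import itemgetter

# Hard-coded sample standing in for the result of nltk.pos_tag(txt).
tagged = [("They", "PRP"), ("refuse", "VBP"), ("to", "TO"),
          ("permit", "VB"), ("us", "PRP"), (".", ".")]

# Tuple unpacking names both fields; the word is simply ignored.
tags = [tag for _word, tag in tagged]

# Equivalent alternative: pick index 1 from every pair.
tags_alt = list(map(itemgetter(1), tagged))

print(tags)
```

Unpacking each pair in the comprehension does the same work as e[1] but makes it clearer which field is being kept.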



Answer 2:


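Take a look at the related question "Unpacking a list / tuple of pairs into two lists / tuples". Calling zip(*pairs) splits a list of (token, tag) pairs into one tuple of tokens and one tuple of tags: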

>>> from nltk import pos_tag, word_tokenize
>>> text = "They refuse to permit us."
>>> tagged_text = pos_tag(word_tokenize(text))
>>> tokens, pos = zip(*tagged_text)
>>> pos
('PRP', 'VBP', 'TO', 'VB', 'PRP', '.')
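
Two details of the zip(*...) idiom may be worth spelling out: it returns tuples rather than lists, and unpacking its result into two names fails on an empty input. The sketch below wraps the idiom in a hypothetical helper, split_pairs, which is not part of NLTK or of the original answer, and uses hard-coded sample pairs.

```python
def split_pairs(pairs):
    """Hypothetical helper: split (token, tag) pairs into two lists."""
    if not pairs:
        # zip(*[]) yields nothing, so "tokens, pos = zip(*[])"
        # would raise ValueError; return two empty lists instead.
        return [], []
    tokens, pos = zip(*pairs)
    # zip produces tuples; convert them if lists are wanted.
    return list(tokens), list(pos)


# Sample pairs standing in for the result of pos_tag(word_tokenize(text)).
sample = [("They", "PRP"), ("refuse", "VBP"), ("to", "TO"),
          ("permit", "VB"), ("us", "PRP"), (".", ".")]
tokens, pos = split_pairs(sample)
print(pos)
```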

At some point you may find that the POS tagger is slow. In that case, create the tagger once and reuse it (see the related question "Slow performance of POS tagging. Can I do some kind of pre-warming?"):

>>> from nltk import word_tokenize
>>> from nltk.tag import PerceptronTagger
>>> tagger = PerceptronTagger()
>>> text = "They refuse to permit us."
>>> tagged_text = tagger.tag(word_tokenize(text))
>>> tokens, pos = zip(*tagged_text)
>>> pos
('PRP', 'VBP', 'TO', 'VB', 'PRP', '.')
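
The point of holding on to a single tagger object is that it can be reused across many sentences. A rough sketch of that pattern follows. The function name pos_sequences and its parameters are assumptions for illustration; the only thing it expects is an object with a tag(tokens) method, such as the PerceptronTagger instance above, plus a tokenizer function.

```python
def pos_sequences(sentences, tagger, tokenize):
    """Return one list of POS tags per sentence, reusing a single tagger."""
    result = []
    for sentence in sentences:
        # tagger.tag returns (token, tag) pairs for the tokenized sentence.
        tagged = tagger.tag(tokenize(sentence))
        result.append([tag for _token, tag in tagged])
    return result


# With NLTK and its data installed, a call might look like this:
#   from nltk import word_tokenize
#   from nltk.tag import PerceptronTagger
#   tagger = PerceptronTagger()  # created once, outside the loop
#   pos_sequences(["They refuse to permit us."], tagger, word_tokenize)
```

Passing the tagger in as an argument keeps its construction outside the loop, so it happens once however many sentences are processed.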



Answer 3:


You can iterate over the tagged pairs directly and print the tags:

print([x[1] for x in nltk.pos_tag(txt)])


Source: https://stackoverflow.com/questions/34609285/how-to-print-out-tags-in-python
