I have POS tagged some words with nltk.pos_tag(), so they are given treebank tags. I would like to lemmatize these words using the known POS tags, but I am not sure how. I w
As @engineercoding pointed out in the comments to @rmalouf's answer, Treebank has considerably more tags than WordNet; see here for details.
The following mapping covers as many bases as possible and explicitly marks the Treebank tags that have no match in WordNet (they map to None):
# Create a map between Treebank and WordNet
from nltk.corpus import wordnet as wn
# WordNet POS tags are: NOUN = 'n', ADJ = 'a', VERB = 'v', ADV = 'r', ADJ_SAT = 's'
# Descriptions (c) https://web.stanford.edu/~jurafsky/slp3/10.pdf
tag_map = {
'CC':None, # coordin. conjunction (and, but, or)
'CD':wn.NOUN, # cardinal number (one, two)
'DT':None, # determiner (a, the)
'EX':wn.ADV, # existential ‘there’ (there)
'FW':None, # foreign word (mea culpa)
'IN':wn.ADV, # preposition/sub-conj (of, in, by)
'JJ':[wn.ADJ, wn.ADJ_SAT], # adjective (yellow)
'JJR':[wn.ADJ, wn.ADJ_SAT], # adj., comparative (bigger)
'JJS':[wn.ADJ, wn.ADJ_SAT], # adj., superlative (wildest)
'LS':None, # list item marker (1, 2, One)
'MD':None, # modal (can, should)
'NN':wn.NOUN, # noun, sing. or mass (llama)
'NNS':wn.NOUN, # noun, plural (llamas)
'NNP':wn.NOUN, # proper noun, sing. (IBM)
'NNPS':wn.NOUN, # proper noun, plural (Carolinas)
'PDT':[wn.ADJ, wn.ADJ_SAT], # predeterminer (all, both)
'POS':None, # possessive ending (’s )
'PRP':None, # personal pronoun (I, you, he)
'PRP$':None, # possessive pronoun (your, one’s)
'RB':wn.ADV, # adverb (quickly, never)
'RBR':wn.ADV, # adverb, comparative (faster)
'RBS':wn.ADV, # adverb, superlative (fastest)
'RP':[wn.ADJ, wn.ADJ_SAT], # particle (up, off)
'SYM':None, # symbol (+,%, &)
'TO':None, # “to” (to)
'UH':None, # interjection (ah, oops)
'VB':wn.VERB, # verb base form (eat)
'VBD':wn.VERB, # verb past tense (ate)
'VBG':wn.VERB, # verb gerund (eating)
'VBN':wn.VERB, # verb past participle (eaten)
'VBP':wn.VERB, # verb non-3sg pres (eat)
'VBZ':wn.VERB, # verb 3sg pres (eats)
'WDT':None, # wh-determiner (which, that)
'WP':None, # wh-pronoun (what, who)
'WP$':None, # possessive (wh- whose)
'WRB':None, # wh-adverb (how, where)
'$':None, # dollar sign ($)
'#':None, # pound sign (#)
'``':None, # left quote (‘ or “); the Treebank tag is two backticks
"''":None, # right quote (’ or ”); the Treebank tag is two apostrophes
'(':None, # left parenthesis ([, (, {, <)
')':None, # right parenthesis (], ), }, >)
',':None, # comma (,)
'.':None, # sentence-final punc (. ! ?)
':':None # mid-sentence punc (: ; ... – -)
}
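To show how a map like this might be applied, here is a minimal sketch. The helper name `lemmatize_tagged`, the small `demo_map`, and the stand-in `toy_lemmatize` are all hypothetical and exist only so the example runs without the WordNet data installed. For entries that hold a list of candidate POS values, the sketch assumes a "first candidate that changes the word wins" rule, which is one possible policy rather than anything the mapping itself prescribes.

```python
def lemmatize_tagged(tagged, tag_map, lemmatize):
    """Lemmatize (word, treebank_tag) pairs using a Treebank -> WordNet map.

    `lemmatize` is any callable taking (word, pos) and returning a string,
    e.g. the bound method WordNetLemmatizer().lemmatize from nltk.stem.
    """
    lemmas = []
    for word, tag in tagged:
        spec = tag_map.get(tag)  # None for unmapped or unknown tags
        if spec is None:
            lemmas.append(word)  # no WordNet POS: keep the word as-is
            continue
        # An entry is either a single POS string or a list of candidates.
        candidates = [spec] if isinstance(spec, str) else list(spec)
        lemma = word
        for pos in candidates:
            result = lemmatize(word, pos)
            if result != word:  # assumed policy: first change wins
                lemma = result
                break
        lemmas.append(lemma)
    return lemmas


# Stand-in lemmatizer (hypothetical) so the sketch runs without WordNet data.
def toy_lemmatize(word, pos):
    table = {("llamas", "n"): "llama", ("ate", "v"): "eat", ("bigger", "a"): "big"}
    return table.get((word, pos), word)


# A small subset of the full map, written with the literal WordNet POS values.
demo_map = {"DT": None, "NNS": "n", "VBD": "v", "JJR": ["a", "s"]}
tagged = [("the", "DT"), ("bigger", "JJR"), ("llamas", "NNS"), ("ate", "VBD")]

print(lemmatize_tagged(tagged, demo_map, toy_lemmatize))
# ['the', 'big', 'llama', 'eat']
```

With NLTK and its WordNet data available, the same helper should work with the full `tag_map` above by passing `WordNetLemmatizer().lemmatize` as the `lemmatize` argument and the output of `nltk.pos_tag()` as `tagged`.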