Identifying verb tenses in python

给你一囗甜甜゛ 提交于 2019-12-03 08:27:28

It won't be too hard to do this yourself. This table should help you identify the different verb tenses and handling them will just be a matter of parsing the result of nltk.pos_tag(string)

Im not sure if you want to get into all of the irregular verb tenses like 'could have been' etc... but if you only want present/past/future this is a very easy parsing task.

I dont know of any library that will do this on its own, and I've always thought of training some model to decide this for me but never got around to it.

There will be some degree of error but it wont be large. I recommend parsing all verbs in order to decide how you want to handle the tense, because in sentences like: I am happy he will see her. The tense is present, but there is a future tense clause ( [that] he will see her ) So this gets into the linguistics of your problem which you didn't elaborate but you get the idea.

POS tagging - which gives you tags that let you look at the tense of the verb - already takes into account sentence context, so it addresses your issues re. accuracy through context. In fact, POS tagging actually doesn't work properly with words by themselves! Look at this example from Ch. 5 of the NLTK book which, given the context in the sentence, lets NLTK differentiate between nouns and verb given homonyms (i.e. given a word like permit that can have different meanings as a verb and a noun):

Let's look at another example, this time including some homonyms:

  >>> text = nltk.word_tokenize("They refuse to permit us to obtain the refuse permit")
  >>> nltk.pos_tag(text)
  [('They', 'PRP'), ('refuse', 'VBP'), ('to', 'TO'), ('permit', 'VB'), ('us', 'PRP'),
  ('to', 'TO'), ('obtain', 'VB'), ('the', 'DT'), ('refuse', 'NN'), ('permit', 'NN')]

Notice that refuse and permit both appear as a present tense verb (VBP) and a noun (NN). E.g. refUSE is a verb meaning "deny," while REFuse is a noun meaning "trash" (i.e. they are not homophones). Thus, we need to know which word is being used in order to pronounce the text correctly. (For this reason, text-to-speech systems usually perform POS-tagging.)

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!