POS-Tagger is incredibly slow

忘掉有多难 2020-12-10 16:21

I am using NLTK to generate n-grams from sentences, after first removing a given set of stop words. However, nltk.pos_tag() is extremely slow, taking up to 0.6 s
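To make the setup concrete, here is a minimal sketch of the stop-word removal and n-gram steps, plus a small timing helper for finding out which stage is slow. It uses only the standard library; the names `remove_stop_words`, `ngrams`, and `time_call` are illustrative assumptions, not code from the question.

```python
import time
from typing import Any, Callable, Iterable, List, Sequence, Tuple


def remove_stop_words(tokens: Sequence[str], stop_words: Iterable[str]) -> List[str]:
    """Drop tokens that appear in stop_words (case-insensitive)."""
    stop = {w.lower() for w in stop_words}
    return [t for t in tokens if t.lower() not in stop]


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams; empty if there are fewer than n tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def time_call(fn: Callable[..., Any], *args: Any) -> Tuple[Any, float]:
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    tokens = "the quick brown fox jumps over the lazy dog".split()
    filtered = remove_stop_words(tokens, ["the", "over"])
    bigrams, elapsed = time_call(ngrams, filtered, 2)
    print(bigrams, f"{elapsed:.6f} s")
    # A tagger such as nltk.pos_tag could be timed the same way:
    #   tagged, elapsed = time_call(nltk.pos_tag, filtered)
```

Timing each stage separately like this shows whether the 0.6 s really comes from the tagging call rather than from tokenization or n-gram generation.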

3 answers
  •  予麋鹿 (original poster)
    2020-12-10 17:05

    If you are looking for another fast POS tagger in Python, you might want to try RDRPOSTagger. For example, on English POS tagging, a single-threaded Python implementation reaches about 8K words/second on a 2.4 GHz Core 2 Duo machine. You can tag faster still by using the multi-threaded mode. RDRPOSTagger achieves accuracy that is very competitive with state-of-the-art taggers, and it now supports pre-trained models for 40 languages. See the experimental results in this paper.
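    As a rough illustration of how tagging throughput (words/second) can be measured and how a batch of sentences can be spread over worker threads, here is a sketch using only the standard library. It does not use RDRPOSTagger's actual API: `dummy_tagger` is a hypothetical stand-in, and `tag_batch` and `words_per_second` are assumed helper names. Note that for a CPU-bound tagger written in pure Python, threads are limited by the interpreter lock, so a process pool may be the better fit.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

Tagged = List[Tuple[str, str]]
Tagger = Callable[[Sequence[str]], Tagged]


def dummy_tagger(tokens: Sequence[str]) -> Tagged:
    """Hypothetical placeholder tagger: labels every token with 'X'."""
    return [(t, "X") for t in tokens]


def tag_batch(sentences: Sequence[Sequence[str]], tagger: Tagger,
              workers: int = 4) -> List[Tagged]:
    """Tag each sentence on a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(tagger, sentences))


def words_per_second(sentences: Sequence[Sequence[str]], tagger: Tagger,
                     workers: int = 4) -> float:
    """Total words divided by wall-clock time for tagging the whole batch."""
    total_words = sum(len(s) for s in sentences)
    start = time.perf_counter()
    tag_batch(sentences, tagger, workers)
    elapsed = time.perf_counter() - start
    return total_words / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    batch = [["quick", "brown", "fox"], ["lazy", "dog"]] * 1000
    print(f"{words_per_second(batch, dummy_tagger):.0f} words/second")
```

    Swapping a real tagger in for `dummy_tagger` allows a like-for-like comparison against a per-call cost such as 0.6 s.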
