NLTK: Tagging Spanish words using a corpus

無奈伤痛 2020-11-27 17:13

I am trying to learn how to tag Spanish words using NLTK.

From the NLTK book, it is quite easy to tag English words using their example. Because I am new to nltk an

4 answers
  •  甜味超标
    2020-11-27 17:59

    I ended up here while searching for POS taggers for languages other than English. Another option for your problem is the spaCy library, which offers POS tagging for multiple languages such as Dutch, German, French, Portuguese, Spanish, Norwegian, Italian, Greek and Lithuanian.

    From the spaCy documentation:

    # Requires the Spanish model: python -m spacy download es_core_news_sm
    import es_core_news_sm
    nlp = es_core_news_sm.load()
    doc = nlp("El copal se usa principalmente para sahumar en distintas ocasiones como lo son las fiestas religiosas.")
    print([(w.text, w.pos_) for w in doc])
    

    leads to:

    [('El', 'DET'), ('copal', 'NOUN'), ('se', 'PRON'), ('usa', 'VERB'), ('principalmente', 'ADV'), ('para', 'ADP'), ('sahumar', 'VERB'), ('en', 'ADP'), ('distintas', 'DET'), ('ocasiones', 'NOUN'), ('como', 'SCONJ'), ('lo', 'PRON'), ('son', 'AUX'), ('las', 'DET'), ('fiestas', 'NOUN'), ('religiosas', 'ADJ'), ('.', 'PUNCT')]
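
    Once you have the (word, tag) pairs, plain Python is enough to work with them. As a minimal sketch, the snippet below groups the words from the output above by their POS tag. It does not need spaCy installed because it just reuses the printed list; `group_by_pos` is a hypothetical helper name I made up for illustration, not part of spaCy or NLTK.

```python
from collections import defaultdict

# (word, tag) pairs copied from the spaCy output shown above.
tagged = [('El', 'DET'), ('copal', 'NOUN'), ('se', 'PRON'), ('usa', 'VERB'),
          ('principalmente', 'ADV'), ('para', 'ADP'), ('sahumar', 'VERB'),
          ('en', 'ADP'), ('distintas', 'DET'), ('ocasiones', 'NOUN'),
          ('como', 'SCONJ'), ('lo', 'PRON'), ('son', 'AUX'), ('las', 'DET'),
          ('fiestas', 'NOUN'), ('religiosas', 'ADJ'), ('.', 'PUNCT')]


def group_by_pos(pairs):
    """Hypothetical helper: map each POS tag to its words, in sentence order."""
    groups = defaultdict(list)
    for word, pos in pairs:
        groups[pos].append(word)
    return dict(groups)


groups = group_by_pos(tagged)
print(groups['NOUN'])  # ['copal', 'ocasiones', 'fiestas']
print(groups['VERB'])  # ['usa', 'sahumar']
```

    The same function should work on the list of pairs returned by an NLTK tagger, since it only assumes an iterable of (word, tag) tuples.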

    and to visualize in a notebook:

    from spacy import displacy
    displacy.render(doc, style='dep', jupyter=True, options={'distance': 120})
    
