Unicode Tagging in Python NLTK
Question

I am working on a POS-tagging program with Python NLTK. My input file is Hindi text spanning several lines. When I tokenize the text and call pos_tag, every token comes back tagged NN, whereas an English sentence given as input is tagged properly. Kindly help.

Version: Python 3.4.1, following the NLTK 3.0 documentation.

Here is what I tried:

    word_to_be_tagged = u"ताजो स्वास आनी चकचकीत दांत तुमचें व्यक्तीमत्व परजळायतात."
    from nltk.corpus import indian
    train_data = indian.tagged_sents('hindi.pos')[