Can sorting Japanese kanji words be done programmatically?

春和景丽 2020-12-14 09:19

I've recently discovered, to my astonishment (having never really thought about it before), that machine-sorting Japanese proper nouns is apparently not possible.

I work

4 answers
  •  悲哀的现实
    2020-12-14 09:34

    Nice to hear people are working with Japanese.

    I think you're spot on with your assessment of the problem difficulty. I just asked one of the Japanese guys in my lab, and the way to do it seems to be as you describe:

    1. Take a list of kanji names.
    2. Infer (guess) the yomigana, i.e. the kana reading, of each one.
    3. Sort by yomigana in gojuon order (the standard kana ordering).
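
    To make the three steps concrete, here is a rough Python sketch. It is only an illustration under some assumptions: the READINGS table is a hypothetical hand-maintained dictionary standing in for step 2 (the part that cannot be done reliably by machine), and the helper names are mine. The sort key approximates gojuon order by converting katakana to hiragana and ignoring dakuten/handakuten for the primary comparison; small kana are not treated specially.

```python
import unicodedata

# Hypothetical, hand-maintained readings (yomigana). This stands in for step 2;
# in practice the reading has to be supplied by a person or guessed by a tool.
READINGS = {
    "高橋": "たかはし",
    "佐藤": "さとう",
    "鈴木": "すずき",
    "山田": "やまだ",
}

def to_hiragana(text):
    # Katakana U+30A1..U+30F6 sit exactly 0x60 above their hiragana counterparts.
    return "".join(
        chr(ord(c) - 0x60) if "\u30a1" <= c <= "\u30f6" else c for c in text
    )

def gojuon_key(reading):
    # Primary key: kana with voicing marks removed (ず is compared as す).
    # Secondary key: the full hiragana reading, as a tie-breaker.
    hira = to_hiragana(reading)
    decomposed = unicodedata.normalize("NFD", hira)
    base = "".join(c for c in decomposed if c not in "\u3099\u309a")
    return (base, hira)

def sort_by_reading(names, readings=READINGS):
    # Step 3: sort the kanji names by their readings, not by the kanji themselves.
    # A name with no known reading raises KeyError rather than being guessed.
    return sorted(names, key=lambda name: gojuon_key(readings[name]))

names = ["高橋", "山田", "鈴木", "佐藤"]
print(sort_by_reading(names))  # ordered by reading
print(sorted(names))           # ordered by kanji code point, for comparison
```

    Sorting the kanji strings directly just orders them by code point, which has nothing to do with how the names are read, so the two results differ.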

    The hard part is obviously step two. I have two guys in my lab: 高橋 and 高谷. Both names start with the same kanji, yet when reports etc. are sorted by name (that is, by reading) they appear nowhere near each other.

    EDIT

    If you're fluent in Japanese, have a look here: http://mecab.sourceforge.net/

    It's a pretty popular tool, so you should be able to find English documentation too (the man page for mecab has English info).
