How to convert Chinese characters to Pinyin

渐次进展 2020-12-23 15:10

To sort Chinese-language text, I want to convert Chinese characters to Pinyin, properly separating the syllable for each character while grouping successive characters together.

8 answers
  •  执笔经年
    2020-12-23 15:35

    Short answer: you don't.

    Long answer: There is no one-to-one mapping from 汉字 (Chinese characters) to 汉语拼音 (Hanyu Pinyin). Just some quick examples:

    • 把 can be "ba" in the third tone or fourth tone.
    • 了 can be "le" toneless or "liao" third tone.
    • 乐 can be "le" or "yue", both in the fourth tone.
    • 落 can be "luo", "la" or "lao", all in the fourth tone.

    And so on. I have a beginners' book on this topic that has 207 examples. I stress that this is a beginners' book and is by no means complete. Each one has a page or two of examples of use and conditions under which you choose the appropriate pronunciation. It is not something that could be easily programmed (if at all).
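
    To make the one-to-many problem concrete, here is a minimal sketch that stores only the four characters listed above and counts how many Pinyin strings a context-free lookup would allow. The table name, the function names, and the tone-number notation (with 5 for the neutral tone) are my own assumptions for illustration, not part of any library, and the table is nowhere near complete.

```python
# Hypothetical lookup table: only the four examples from this answer.
# Tone numbers follow the syllable; 5 marks the neutral tone (an assumed convention).
READINGS = {
    "把": ["ba3", "ba4"],
    "了": ["le5", "liao3"],
    "乐": ["le4", "yue4"],
    "落": ["luo4", "la4", "lao4"],
}


def candidate_readings(text):
    """Return every known reading per character; unknown characters map to themselves."""
    return [READINGS.get(ch, [ch]) for ch in text]


def count_combinations(text):
    """Count the distinct Pinyin strings possible when context is ignored."""
    total = 1
    for options in candidate_readings(text):
        total *= len(options)
    return total


if __name__ == "__main__":
    # Every character with more than one reading multiplies the ambiguity.
    print(candidate_readings("落"))
    print(count_combinations("乐了"))
```

    The lookup itself is trivial. Picking the right entry from each list is the part that needs context, and that is exactly what a table cannot give you.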

    And this doesn't even address the other slippery thing you want to deal with: the separation of characters into grouped words. The very notion of a word is slippery in Chinese. There are two terms that correspond roughly to "word": 字 and 词. The first is the single character; the second is a group of characters that together express one concept. (I frequently get asked by Chinese speakers how many "words" I can read when they really mean "characters".) In some cases the distinction is clear: the 词 "乌鸦", for example, is "crow". The two 字 must be together to express the idea properly, and it would be incorrect to translate it as "black crow". In other cases it is not so clear. What does "你好" translate to? Is it one word meaning, idiomatically, "hello"? Or is it two words translating literally to "you good"? Each of the characters involved can stand alone or appear in groups with other characters, but together they mean something entirely different from their individual meanings. Given this, how, precisely, do you plan to group the 汉语拼音 transliterations (which are difficult to impossible to get right in the first place!) into "words"?
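
    As a rough illustration of why the grouping is not well defined, here is a sketch of greedy longest-match segmentation, one common naive approach and not a recommendation. The function name and the tiny word lists are hypothetical. The same input splits differently depending on whether the word list treats "你好" as one 词.

```python
def segment(text, lexicon):
    """Greedy forward longest-match segmentation.

    At each position, take the longest piece found in `lexicon`;
    if nothing matches, fall back to a single character.
    """
    max_len = max((len(word) for word in lexicon), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


if __name__ == "__main__":
    # Hypothetical word lists: the second one also counts "你好" as a word.
    print(segment("你好乌鸦", {"乌鸦"}))
    print(segment("你好乌鸦", {"你好", "乌鸦"}))
```

    Neither output is "the" correct one. The algorithm only reflects whatever decision the word list already made.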
