To sort Chinese text, I want to convert Chinese characters to Pinyin, converting each character separately while keeping the characters of each word grouped together.
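You can use lazy_pinyin from the pypinyin library, which returns a list with one toneless Pinyin syllable per character. Joining that list gives one string per word: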
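    # Only needed on Python 2; harmless on Python 3.
    from __future__ import unicode_literals

    from pypinyin import lazy_pinyin

    hanzi_list = ['如何', '将', '汉字', '转为', '拼音']

    # lazy_pinyin returns one syllable per character; join them per word.
    pinyin_list = [''.join(lazy_pinyin(word)) for word in hanzi_list]
    print(pinyin_list)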
Output:
['ruhe', 'jiang', 'hanzi', 'zhuanwei', 'pinyin']
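Since the goal is sorting, here is a minimal sketch of how those Pinyin strings could serve as sort keys. To keep it runnable without pypinyin installed, the keys are copied from the output above; in practice you would compute them with lazy_pinyin as shown. The name sorted_hanzi is just an illustrative choice.

```python
hanzi_list = ['如何', '将', '汉字', '转为', '拼音']

# Keys copied from the output above (assumption: in real code these would
# come from ''.join(lazy_pinyin(word)) for each word).
pinyin_list = ['ruhe', 'jiang', 'hanzi', 'zhuanwei', 'pinyin']

# Pair each key with its word, sort on the key only, then drop the key.
pairs = sorted(zip(pinyin_list, hanzi_list), key=lambda pair: pair[0])
sorted_hanzi = [word for _, word in pairs]

print(sorted_hanzi)
```

Note that joining syllables with an empty string can make different words collide on the same key, for example '西安' (xi + an) and '先' (xian) both become 'xian'. If that matters for your data, joining with a separator such as a space keeps the syllable boundaries visible in the key.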