How to use string.translate in Python to transliterate Cyrillic?

星月不相逢 2020-12-29 07:44

I'm getting a "UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-51: ordinal not in range(128)" exception when trying to use string.maket

3 Answers
  •  醉酒成梦
    2020-12-29 08:12

    translate behaves differently when called on unicode strings. Instead of a table built with string.maketrans, it takes a dictionary that maps ord(search) to ord(replace):

    symbols = (u"абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ",
               u"abvgdeejzijklmnoprstufhzcss_y_euaABVGDEEJZIJKLMNOPRSTUFHZCSS_Y_EUA")
    
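    # Map each Cyrillic code point to the code point of its Latin counterpart.
    tr = {ord(a): ord(b) for a, b in zip(*symbols)}
    
    # On Python 2.6 and earlier (no dict comprehensions):
    # tr = dict([(ord(a), ord(b)) for (a, b) in zip(*symbols)])
    
    text = u'Добрый Ден'
    print(text.translate(tr))  # prints: Dobryj Den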
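
On Python 3, where every str is unicode, the same kind of table can probably be built more directly with str.maketrans, which accepts two equal-length strings and returns the ordinal-to-ordinal dictionary. The following is only a minimal sketch under that assumption: the names cyrillic, latin, table and transliterate are made up for illustration, and the digraph entries (zh, ch, sh) are a hypothetical refinement that is not part of the answer above.

```python
# Python 3 sketch: str.maketrans builds the {ordinal: ordinal} table
# directly from two equal-length strings.
cyrillic = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
latin = "abvgdeejzijklmnoprstufhzcss_y_eua"

# Cover both cases; upper() maps each of these letters one-to-one,
# so both arguments keep the same length.
table = str.maketrans(cyrillic + cyrillic.upper(), latin + latin.upper())

# Hypothetical refinement: a value may also be a string, so a single
# lowercase letter can expand to a digraph.
table.update({ord("ж"): "zh", ord("ч"): "ch", ord("ш"): "sh"})


def transliterate(text):
    """Return text with Cyrillic letters replaced according to table."""
    return text.translate(table)


print(transliterate("Добрый Ден"))  # prints: Dobryj Den
```

Characters that are missing from the table are passed through unchanged, so mixed Latin and Cyrillic input is handled without extra checks.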
    

    That said, I'd second the suggestion not to reinvent the wheel and to use an established library: http://pypi.python.org/pypi/Unidecode
