How do I reverse Unicode decomposition using Python?

隐瞒了意图╮ 2020-12-16 05:58

Using Python 2.5, I have some text stored in a unicode object:

Dinis e Isabel, uma difı´cil relac¸a˜o conjugal e polı´tica

3 Answers
  •  既然无缘
    2020-12-16 06:21

    I can't give you a definitive answer because I have never tried this, but the standard library has a unicodedata module. Its decomposition() and normalize() functions might help you here.
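
As a minimal sketch of what those two functions do, assuming the text really is decomposed (base letters followed by combining marks), something like the following might work. The sample string is a hand-built illustration, not the asker's actual data, and the `u` prefix is kept so the literals are unicode objects on Python 2 as well as Python 3.

```python
import unicodedata

# Decomposed input: base letters followed by combining marks
# (U+0301 COMBINING ACUTE ACCENT, U+0327 COMBINING CEDILLA,
# U+0303 COMBINING TILDE).
decomposed = u"difi\u0301cil relac\u0327a\u0303o"

# NFC composes each base letter with the combining marks that follow it.
composed = unicodedata.normalize("NFC", decomposed)

# decomposition() goes the other way for a single character: it returns
# the code points a precomposed character breaks down into, as a string.
mapping = unicodedata.decomposition(u"\u00e7")  # U+00E7 is "ç"
```

Here `composed` is shorter than `decomposed` because each letter-plus-mark pair collapses into one precomposed character.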

    Edit: Make sure that it really is decomposed Unicode. Sometimes characters that can't be expressed directly in an encoding are written in odd ways, such as "a, which a human or a specialized program is meant to read as ä.
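
One way to check this is to look for combining marks in the text. The sketch below assumes the accents turn out to be standalone spacing characters (U+00B4, U+00B8, U+02DC) placed after a base letter, with a dotless ı (U+0131) standing in for i; whether the question's text actually uses these code points is an assumption. The helper names and the mapping table are hypothetical and cover only those four characters.

```python
import unicodedata

def has_combining_marks(text):
    """Return True if any character has a non-zero combining class."""
    return any(unicodedata.combining(ch) for ch in text)

# Hypothetical table: spacing characters mapped to the combining marks
# (or plain letters) they appear to stand in for.
SPACING_TO_COMBINING = {
    u"\u00b4": u"\u0301",  # ACUTE ACCENT -> COMBINING ACUTE ACCENT
    u"\u00b8": u"\u0327",  # CEDILLA -> COMBINING CEDILLA
    u"\u02dc": u"\u0303",  # SMALL TILDE -> COMBINING TILDE
    u"\u0131": u"i",       # LATIN SMALL LETTER DOTLESS I -> i
}

def compose_spacing_marks(text):
    """Swap spacing marks for combining ones, then compose with NFC."""
    replaced = u"".join(SPACING_TO_COMBINING.get(ch, ch) for ch in text)
    return unicodedata.normalize("NFC", replaced)

# Illustrative input using spacing marks rather than combining marks.
sample = u"dif\u0131\u00b4cil relac\u00b8a\u02dco"
fixed = compose_spacing_marks(sample)
```

If `has_combining_marks(sample)` is false, `normalize("NFC", sample)` alone will leave the text unchanged, which is why the substitution step comes first. This only handles marks written after the base letter; a notation that puts the mark before the letter, like the "a example, would need its own handling.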
