Python not sorting unicode properly. Strcoll doesn't help

执笔经年 2020-11-30 04:05

I've got a problem with sorting lists using Unicode collation in Python 2.5.1 and 2.6.5 on OS X, as well as on Linux.

import locale   
locale.setlocale(loca         


        
6 Answers
  •  难免孤独
    2020-11-30 04:46

    @gnibbler, using PyICU with the sorted() function does work in a Python 3 environment. After a little digging through the ICU API documentation and some experimentation, I came across the collator's getSortKey() method, which can be passed directly as the key argument:

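    import icu  # PyICU; older releases were imported as "import PyICU"
    # Build a collator for the German locale and sort by its binary sort keys.
    collator = icu.Collator.createInstance(icu.Locale('de_DE.UTF-8'))
    sorted(['a','b','c','ä'], key=collator.getSortKey)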
    

    which produces the desired collation:

    ['a', 'ä', 'b', 'c']
    

    instead of the undesired code point ordering that a plain sorted() call gives:

    sorted(['a','b','c','ä'])
    ['a', 'b', 'c', 'ä']
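
For setups where PyICU may not be installed, the same idea can be wrapped in a small helper. This is only a sketch under stated assumptions: `make_sort_key` is a hypothetical name, the `icu` module name assumes a current PyICU release, and the fallback (dropping combining marks after NFKD normalization) is a rough approximation for simple accented Latin text, not real locale-aware collation.

```python
import unicodedata


def make_sort_key(locale_name="de_DE"):
    """Return a key function for sorted(); hypothetical helper, not part of PyICU."""
    try:
        import icu  # assumption: PyICU is installed under its current module name
    except ImportError:
        # Fallback (approximation only): compare on the base letters with
        # combining marks removed, then on the original string as a tie-breaker.
        def fallback_key(text):
            decomposed = unicodedata.normalize("NFKD", text)
            base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
            return (base.casefold(), text)

        return fallback_key

    # ICU path: the collator's binary sort keys follow the locale's rules.
    collator = icu.Collator.createInstance(icu.Locale(locale_name))
    return collator.getSortKey


words = ["a", "b", "c", "ä"]
print(sorted(words, key=make_sort_key()))
```

Building the key function once and reusing it avoids creating a new collator for every sort. When real collation matters (for example, languages whose rules reorder letters rather than just ignoring accents), only the ICU path should be trusted.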
    
