How to implement Unicode string matching by folding in python

后端未结

关注

 5  960

灰色年华 2020-12-13 11:31

I have an application implementing incremental search. I have a catalog of unicode strings to be matched and match them to a given \"key\" string; a catalog string is a \"hi

5条回答

心在旅途 (楼主)

2020-12-13 11:59

For my application, I already addressed this in a different comment: I want to have a unicode result and leave unhandled characters untouched.

In that case, the correct way to do this is to create a UCA collator object with its strength set to compare at primary strength only, which thereby completely disregards diacritics.

I show how to do this using Perl in this answer. The first collator object is at the strength you need, while the second one considers accents for tie-breaking.

You will note that no strings have been harmed in the making of these comparisons: the original data is untouched.

0 讨论(0)

查看其它5个回答
发布评论:

提交评论
- 加载中...