What is the equivalent of stringByFoldingWithOptions:locale: in Java?

Submitted by 久未见 on 2019-12-17 20:54:49

Question


I am looking for a way to normalize a list of titles. Each title is normalized so that it can be stored in a database as a sort and lookup key. "Normalize" here covers several steps, such as converting to lowercase, removing accents from Latin characters, and removing a leading "the", "a", or "an".

On iOS or Mac, the NSString class has the stringByFoldingWithOptions:locale: method, which returns a folded version of a string.

NSString Class Reference - stringByFoldingWithOptions:locale:

In Java, the java.text.Collator class seems useful for comparing strings, but there seems to be no way to convert a string for this purpose.


Answer 1:


You can use java.text.Normalizer, which comes close to what you need for normalizing strings in Java. Regular expressions are also a powerful way to handle the remaining string manipulation.

Example of accent removal:

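import java.text.Normalizer;

String accented = "árvíztűrő tükörfúrógép";
// NFD decomposes each accented letter into a base letter plus combining marks.
String normalized = Normalizer.normalize(accented, Normalizer.Form.NFD);
// Remove every non-ASCII character, which drops the combining marks.
normalized = normalized.replaceAll("[^\\p{ASCII}]", "");

System.out.println(normalized);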

Output:

arvizturo tukorfurogep

More explanation here: http://docs.oracle.com/javase/tutorial/i18n/text/normalizerapi.html
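
To cover the other steps the question lists (lowercasing and dropping a leading "the", "a" or "an"), the same idea can be extended roughly as follows. This is only a sketch: TitleFolding and fold are hypothetical names, not JDK APIs, and the article list assumes English titles. It strips combining marks (\p{M}) rather than all non-ASCII characters, so letters from other scripts are kept.

```java
import java.text.Normalizer;
import java.util.Locale;

class TitleFolding {
    // Hypothetical helper (not part of the JDK): builds a sort/lookup key from a title.
    static String fold(String title) {
        // Decompose accented letters into a base letter plus combining marks.
        String s = Normalizer.normalize(title, Normalizer.Form.NFD);
        // Remove only the combining marks, keeping all other characters.
        s = s.replaceAll("\\p{M}+", "");
        // Lowercase with a fixed locale so the key does not depend on the default locale.
        s = s.toLowerCase(Locale.ROOT).trim();
        // Drop one leading English article, but only when whitespace follows it.
        s = s.replaceFirst("^(the|an|a)\\s+", "");
        return s;
    }

    public static void main(String[] args) {
        System.out.println(fold("The Árvíztűrő Tükörfúrógép")); // arvizturo tukorfurogep
        System.out.println(fold("A Clockwork Orange"));         // clockwork orange
        System.out.println(fold("Anthem"));                     // anthem
    }
}
```

Note that letters which have no canonical decomposition, such as "ø" or "ł", pass through unchanged with this approach and would need an explicit mapping if they matter for your titles.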



Source: https://stackoverflow.com/questions/21489289/what-is-the-equivalent-of-stringbyfoldingwithoptionslocale-in-java
