Accent in Regular Expression in Java

て烟熏妆下的殇ゞ 提交于 2019-12-02 19:17:46

The Java regex documentation has a section on Unicode categories (search for "Classes for Unicode blocks and categories"). If you're just looking for letters, I think \p{L} is the category you want.

I had more luck with:

\p{InCombiningDiacriticalMarks}+

In java I use the following method:

import java.text.Normalizer;
import java.text.Normalizer.Form;

public static String removeAccents(String text) {
    return text == null ? null :
        Normalizer.normalize(text, Form.NFD)
            .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
}
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!