Cyrillic alphabet validation

Submitted by 孤者浪人 on 2019-12-05 00:42:26

You have to use a Unicode-aware regex, for example \p{L}+ for any Unicode letter. For more, see the Javadoc for java.util.regex.Pattern, which has a section called "Unicode support". Also, there are details here: link
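As a rough illustration of the Unicode classes mentioned above, here is a minimal sketch. The class and method names (UnicodeLetterCheck, isAllLetters, isAllCyrillic) are hypothetical and not part of the original answer; \p{IsCyrillic} is the script-based class available since Java 7, added here as an assumption about what a Cyrillic-only check might look like.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; names are not from the original answer.
class UnicodeLetterCheck {
    // \p{L} matches any Unicode letter, regardless of script.
    private static final Pattern ANY_LETTERS = Pattern.compile("\\p{L}+");
    // \p{IsCyrillic} restricts the match to characters of the Cyrillic script.
    private static final Pattern CYRILLIC_ONLY = Pattern.compile("\\p{IsCyrillic}+");

    // True if the whole string consists of one or more Unicode letters.
    static boolean isAllLetters(String s) {
        return ANY_LETTERS.matcher(s).matches();
    }

    // True if the whole string consists of one or more Cyrillic characters.
    static boolean isAllCyrillic(String s) {
        return CYRILLIC_ONLY.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllLetters("Иванов"));   // letters in any script pass
        System.out.println(isAllLetters("Ivanov"));   // Latin letters pass as well
        System.out.println(isAllCyrillic("Ivanov"));  // Latin letters are rejected here
        System.out.println(isAllCyrillic("Ёлкин"));   // Ё belongs to the Cyrillic script
    }
}
```

Note that \p{L}+ alone accepts Latin, Greek and other scripts too, so it is likely too permissive when the input must be specifically Russian.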

In my case I have to check whether it's a name written in Russian.

I've ended up with this:

private static final String ruNameRegEx = "[А-ЯЁ][-А-яЁё]+";
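A minimal sketch of how the single-name expression might be applied. The wrapper class RuNameCheck and the method isRuName are hypothetical names chosen for illustration; the regex itself is the one shown above. The range А-я covers U+0410 to U+044F and does not include Ё or ё, which is why they are listed explicitly. The source file is assumed to be saved as UTF-8.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper for illustration; only the regex comes from the answer.
class RuNameCheck {
    // One uppercase Russian letter, then one or more Russian letters or hyphens.
    private static final String ruNameRegEx = "[А-ЯЁ][-А-яЁё]+";
    private static final Pattern RU_NAME = Pattern.compile(ruNameRegEx);

    // True if the whole string is a single capitalized Russian name.
    static boolean isRuName(String s) {
        return RU_NAME.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isRuName("Василий"));     // capitalized name
        System.out.println(isRuName("Анна-Мария"));  // hyphenated name
        System.out.println(isRuName("василий"));     // lowercase first letter
        System.out.println(isRuName("Vasily"));      // Latin letters
    }
}
```

Because matches() requires the entire input to match, surrounding whitespace or a second word makes the check fail; the trailing + also means a one-letter string is rejected.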

and for the full name:

private static final String ruNamePart = "[А-яЁё][-А-яЁё]+";
private static final String ruFullNameRegEx = "\\s*[А-ЯЁ][-А-яЁё]+\\s+(" + ruNamePart + "\\s+){1,5}" + ruNamePart + "\\s*";

The last one covers some complex cases:

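// JUnit 4 is assumed here; the original snippet did not show its imports.
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.junit.Test;

// The class must not be named Test, or it would clash with the @Test annotation.
public class RuFullNameTest {
    // Same expressions as above, repeated so the class compiles on its own.
    private static final String ruNamePart = "[А-яЁё][-А-яЁё]+";
    private static final String ruFullNameRegEx = "\\s*[А-ЯЁ][-А-яЁё]+\\s+(" + ruNamePart + "\\s+){1,5}" + ruNamePart + "\\s*";

    private final Pattern ruFullNamePattern = Pattern.compile(ruFullNameRegEx);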

    @Test
    public void test1() {
        assertTrue(isRuFullName("Иванов Василий Иванович"));
    }

    @Test
    public void test2() {
        assertTrue(isRuFullName(" Иванов Василий Акимович "));
    }

    @Test
    public void test3() {
        assertTrue(isRuFullName("Ёлкин Василий Иванович"));
    }

    @Test
    public void test4() {
        assertTrue(isRuFullName("Иванов Василий Аксёнович"));
    }

    @Test
    public void test5() {
        assertFalse(isRuFullName("иванов василий акимович"));
    }

    @Test
    public void test6() {
        assertFalse(isRuFullName("Иванов С.В."));
    }

    @Test
    public void test7() {
        assertTrue(isRuFullName("Мамин-Сибиряк Анна-Мария Иоановна"));
    }

    @Test
    public void test8() {
        assertTrue(isRuFullName("Хаджа Насредин Махмуд-Азгы-Бек"));
    }

    @Test
    public void test9() {
        assertTrue(isRuFullName("Хаджа Насредин ибн Махмуд"));
    }

    private boolean isRuFullName(String testString) {
        Matcher m = ruFullNamePattern.matcher(testString);
        return m.matches();
    }
}