Detect Chinese characters in Java

佛祖请我去吃肉 2020-12-13 15:40

Using Java, how can I detect whether a String contains Chinese characters?

    String chineseStr = "已下架";

    if (isChineseString(chineseStr)) {
        System.out.println(
        
3 Answers
  • 2020-12-13 16:12

    You can try the Google API or the Language Detection API.

    The Language Detection API includes a simple demo, so you can try that first.

  • 2020-12-13 16:14

    A more direct approach:

    if ("粽子".matches("[\\u4E00-\\u9FA5]+")) {
        System.out.println("is Chinese");
    }
    

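    Note that matches() tests the whole string, so this is true only when every character falls in the range U+4E00 to U+9FA5. If you also need to catch rarely used and exotic characters, you will have to add all the other ranges; see the question "What's the complete range for Chinese characters in Unicode?"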
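
    As an alternative to listing ranges by hand, Java's regex engine (Java 7 and later) accepts the script class \p{IsHan}. The following is only a sketch: the class name HanRegex and the method names containsHan and isAllHan are made up for illustration, and the sample string is the one from the question written with Unicode escapes. It uses find() for a "contains" check and matches() for an "entirely Chinese" check.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class HanRegex {
    // \p{IsHan} matches any code point whose Unicode script is Han.
    private static final Pattern HAN = Pattern.compile("\\p{IsHan}");

    // True if at least one Han character occurs anywhere in s.
    static boolean containsHan(String s) {
        return HAN.matcher(s).find();
    }

    // True only if s is non-empty and consists entirely of Han characters.
    static boolean isAllHan(String s) {
        return s.matches("\\p{IsHan}+");
    }

    public static void main(String[] args) {
        // The question's sample text surrounded by Latin letters.
        String mixed = "xxx\u5DF2\u4E0B\u67B6xxx";
        System.out.println(containsHan(mixed)); // true
        System.out.println(isAllHan(mixed));    // false
    }
}
```

    Because the class is defined by script rather than by a fixed range, it also covers Han characters outside U+4E00 to U+9FA5, including those beyond the Basic Multilingual Plane.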

  • 2020-12-13 16:18

    Since Java 7, Character.isIdeographic(int codepoint) tells whether the code point is a CJKV (Chinese, Japanese, Korean and Vietnamese) ideograph.

    A closer match for Chinese characters is the script check Character.UnicodeScript.HAN.

    So:

    System.out.println(containsHanScript("xxx已下架xxx"));
    
    public static boolean containsHanScript(String s) {
        for (int i = 0; i < s.length(); ) {
            int codepoint = s.codePointAt(i);
            i += Character.charCount(codepoint);
            if (Character.UnicodeScript.of(codepoint) == Character.UnicodeScript.HAN) {
                return true;
            }
        }
        return false;
    }
    

    Or in Java 8:

    public static boolean containsHanScript(String s) {
        return s.codePoints().anyMatch(
                codepoint ->
                Character.UnicodeScript.of(codepoint) == Character.UnicodeScript.HAN);
    }
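
    For comparison, here is a sketch of the Character.isIdeographic variant next to the Han script check. The class name IdeographCheck and the method name containsIdeograph are hypothetical, and the sample strings are written with Unicode escapes. The two checks agree on ordinary Chinese text, but the exact set of code points each one accepts depends on the Unicode version your JDK ships with.

```java
// Hypothetical helper class; the names are illustrative only.
final class IdeographCheck {
    // True if any code point has the Unicode Ideographic property.
    static boolean containsIdeograph(String s) {
        return s.codePoints().anyMatch(Character::isIdeographic);
    }

    // True if any code point belongs to the Han script.
    static boolean containsHanScript(String s) {
        return s.codePoints().anyMatch(
                cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }

    public static void main(String[] args) {
        String chinese = "xxx\u5DF2\u4E0B\u67B6xxx"; // the question's sample text
        String hiragana = "\u3042";                  // a Japanese kana, not Han

        System.out.println(containsIdeograph(chinese));  // true
        System.out.println(containsHanScript(chinese));  // true
        System.out.println(containsIdeograph(hiragana)); // false
        System.out.println(containsHanScript(hiragana)); // false
    }
}
```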
    