How to determine whether a string is English or Arabic?

我寻月下人不归 2021-01-31 17:47

Is there a way to determine whether a string is English or Arabic?

8 answers
  •  轮回少年
    2021-01-31 18:00

    English characters tend to be in these 4 Unicode blocks:

    • BASIC_LATIN
    • LATIN_1_SUPPLEMENT
    • LATIN_EXTENDED_A
    • GENERAL_PUNCTUATION

      public static boolean isEnglish(String text) {

          // An empty string contains no English text.
          if (text.isEmpty()) {
              return false;
          }

          for (char character : text.toCharArray()) {

              Character.UnicodeBlock block = Character.UnicodeBlock.of(character);

              // Reject as soon as one character falls outside the four blocks;
              // otherwise only the last character would decide the result.
              if (block != Character.UnicodeBlock.BASIC_LATIN
                      && block != Character.UnicodeBlock.LATIN_1_SUPPLEMENT
                      && block != Character.UnicodeBlock.LATIN_EXTENDED_A
                      && block != Character.UnicodeBlock.GENERAL_PUNCTUATION) {

                  return false;
              }
          }

          return true;
      }
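
For the Arabic half of the question, a similar check could be sketched with `Character.UnicodeScript`, which classifies code points by script rather than by block. This is only a minimal sketch under stated assumptions: the class and method names (`ScriptCheck`, `isArabic`) are hypothetical, and treating digits, spaces, and punctuation as neutral is a design assumption, not something the answer above specifies.

```java
final class ScriptCheck {

    // Hypothetical helper: returns true when the text has at least one letter
    // and every letter belongs to the Arabic script.
    // Assumption: non-letters (digits, spaces, punctuation) are ignored.
    static boolean isArabic(String text) {
        boolean sawLetter = false;
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            i += Character.charCount(codePoint);

            if (!Character.isLetter(codePoint)) {
                continue; // skip spaces, digits, punctuation
            }
            if (Character.UnicodeScript.of(codePoint) != Character.UnicodeScript.ARABIC) {
                return false; // a letter from another script
            }
            sawLetter = true;
        }
        return sawLetter;
    }
}
```

Under these assumptions, `ScriptCheck.isArabic("مرحبا بالعالم")` accepts the string because the space is skipped, while a mixed string such as `"مرحبا world"` is rejected at the first Latin letter. Like the block-based check, this identifies the script, not the language: Persian or Urdu text would also pass.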
      
