How to correctly compute the length of a String in Java?

佛祖请我去吃肉 2020-12-08 02:54

I know there is String#length and the various methods in Character which more or less work on code units/code points.
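
For illustration, here is a minimal sketch of how the two counts can differ. The class and method names are made up for this example; only String#length and String#codePointCount are real JDK methods.

```java
class LengthDemo {
    // Number of UTF-16 code units, which is what String#length reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String plain = "abc";
        // U+1D11E MUSICAL SYMBOL G CLEF lies outside the BMP,
        // so it is stored as a surrogate pair (two code units).
        String clef = "\uD834\uDD1E";

        System.out.println(codeUnits(plain) + " " + codePoints(plain)); // 3 3
        System.out.println(codeUnits(clef) + " " + codePoints(clef));   // 2 1
    }
}
```

Neither number is necessarily what a reader would perceive as the "length", which is the reason for the question.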

What is the suggested

5 answers
  •  予麋鹿 (original poster)
    2020-12-08 03:48

    If you mean counting the length of a string according to the grammatical rules of a language, then the answer is no: there is no such algorithm in Java, nor anywhere else.

    Not unless the algorithm also does a full semantic analysis of the text.

    In Hungarian, for example, sz and zs can count as one letter or two, depending on the composition of the word they appear in. (E.g.: ország is 5 letters, whereas torzság is 7.)

    Update: If all you want is the Unicode standard character count (which, as I pointed out, isn't accurate), normalizing your string to NFKC form with java.text.Normalizer and then counting its code points could be a solution.
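
    A rough sketch of that idea, assuming "character count" means the number of code points after NFKC normalization. The class and method names are hypothetical; Normalizer.normalize and String#codePointCount are the actual JDK calls.

```java
import java.text.Normalizer;

class NormalizedLength {
    // Normalize to NFKC, then count code points instead of UTF-16 code units.
    static int normalizedCodePoints(String s) {
        String nfkc = Normalizer.normalize(s, Normalizer.Form.NFKC);
        return nfkc.codePointCount(0, nfkc.length());
    }

    public static void main(String[] args) {
        // "e" followed by U+0301 COMBINING ACUTE ACCENT: two code units
        // that NFKC composes into the single code point U+00E9.
        String decomposed = "e\u0301";
        System.out.println(decomposed.length());              // 2
        System.out.println(normalizedCodePoints(decomposed)); // 1

        // U+FB01 LATIN SMALL LIGATURE FI: one code unit that NFKC
        // expands into the two letters "f" and "i".
        String ligature = "\uFB01";
        System.out.println(ligature.length());                // 1
        System.out.println(normalizedCodePoints(ligature));   // 2
    }
}
```

    Note that NFKC can also make the count larger, as the ligature case shows, and combining sequences that have no precomposed form still count as several code points, so this remains an approximation.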
