Calculating length in UTF-8 of Java String without actually encoding it

你的背包 2020-12-03 04:45

Does anyone know if the standard Java library (any version) provides a means of calculating the length of the binary encoding of a string (specifically UTF-8 in this case) without actually encoding it?

4 Answers
  •  不知归路
    2020-12-03 05:00

    You can loop through the String and add up the bytes each char needs:

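    /**
     * Counts the UTF-8 bytes needed for str.
     * Deprecated: only handles chars below U+0800 (1- and 2-byte sequences);
     * it throws for everything else, including surrogate characters.
     */
    @Deprecated
    public int countUTF8Length(String str)
    {
        int count = 0;
        for (int i = 0; i < str.length(); ++i)
        {
            char c = str.charAt(i);
            if (c < 0x80)
            {
                // U+0000..U+007F: one byte
                count++;
            } else if (c < 0x800)
            {
                // U+0080..U+07FF: two bytes
                count += 2;
            } else
            {
                // 3-byte chars and surrogate pairs are not handled
                throw new UnsupportedOperationException("not implemented yet");
            }
        }
        return count;
    }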
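
The loop above gives up at U+0800. Here is a minimal sketch of how it might be extended to the rest of the BMP (3 bytes) and to surrogate pairs (4 bytes). The class name `Utf8Length` and the method name `utf8Length` are hypothetical. Counting an unpaired surrogate as one byte is an assumption, chosen to match `String.getBytes(StandardCharsets.UTF_8)`, which substitutes `?` for malformed input.

```java
final class Utf8Length {
    private Utf8Length() {}

    /** Returns the number of bytes UTF-8 needs for s, without building the byte array. */
    static int utf8Length(CharSequence s) {
        int count = 0;
        for (int i = 0, len = s.length(); i < len; i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                count += 1;                       // U+0000..U+007F
            } else if (c < 0x800) {
                count += 2;                       // U+0080..U+07FF
            } else if (Character.isHighSurrogate(c)
                    && i + 1 < len
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                count += 4;                       // valid pair: one supplementary code point
                i++;                              // skip the low surrogate
            } else if (Character.isSurrogate(c)) {
                count += 1;                       // assumption: unpaired surrogate becomes '?'
            } else {
                count += 3;                       // remaining BMP chars, U+0800..U+FFFF
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "a\u00e9\u20ac\uD83D\uDE00"; // 1 + 2 + 3 + 4 bytes
        int counted = utf8Length(sample);
        int encoded = sample.getBytes(java.nio.charset.StandardCharsets.UTF_8).length;
        System.out.println(counted + " " + encoded);
    }
}
```

Comparing the result against `getBytes(StandardCharsets.UTF_8).length` on representative inputs, as `main` does, is a cheap way to check the counting logic before relying on it.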
    
