Determine if characters in a string are all of a specific character set

小鲜肉 2020-12-05 19:12

I need to be able to take a string in Java and determine whether or not all of the characters contained within it are in a specified character set (e.g. ISO-8859-1). I've

2 answers
  • 2020-12-05 19:36

    I think the easiest way is to build a table of the Unicode characters that can be represented in the target character set encoding and then test each character in the string against it. For the ISO-8859 family, the table can usually be represented by one or a few ranges of Unicode characters, which makes the test relatively easy. It's a lot of hand work, but it needs to be done only once.

    EDIT: or use Aubin's answer if the charset is supported in your Java implementation. :)
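As a rough illustration of the range-table idea in this answer, here is a minimal sketch for ISO-8859-1 only. The class and method names (Latin1RangeCheck, inRanges, fitsLatin1) are made up for the example, and the single range U+0000..U+00FF assumes the IANA/Java definition of ISO-8859-1, which includes the C0 and C1 control characters. Other members of the ISO-8859 family would need their own hand-built tables.

```java
// Sketch only: Latin1RangeCheck, inRanges and fitsLatin1 are hypothetical names.
final class Latin1RangeCheck {

    // Assumption: ISO-8859-1 (as registered with IANA and implemented by Java)
    // maps one-to-one onto the code points U+0000..U+00FF.
    private static final int[][] LATIN1_RANGES = { {0x0000, 0x00FF} };

    // Returns true if codePoint lies inside any inclusive [low, high] range.
    static boolean inRanges(int codePoint, int[][] ranges) {
        for (int[] range : ranges) {
            if (codePoint >= range[0] && codePoint <= range[1]) {
                return true;
            }
        }
        return false;
    }

    // Returns true if every code point of s falls inside the Latin-1 table.
    static boolean fitsLatin1(String s) {
        // Step by code point so a surrogate pair is tested as one character.
        for (int i = 0; i < s.length(); ) {
            int codePoint = s.codePointAt(i);
            if (!inRanges(codePoint, LATIN1_RANGES)) {
                return false;
            }
            i += Character.charCount(codePoint);
        }
        return true;
    }
}
```

With this sketch, fitsLatin1("caf\u00e9") should be accepted and fitsLatin1("\u20ac") (the euro sign) rejected, since U+20AC lies outside the table.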

  • 2020-12-05 19:40

    The CharsetEncoder class in package java.nio.charset offers a canEncode method that tests whether a given character or character sequence can be encoded in the charset.

    Michael basically did something like this:

    Charset.forName( CharEncoding.ISO_8859_1 ).newEncoder().canEncode("string")

    Note that CharEncoding.ISO_8859_1 relies on Apache Commons Lang and may be replaced by the string "ISO-8859-1".
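For reference, the same check can be written without Apache Commons. The sketch below assumes Java 7 or later, where java.nio.charset.StandardCharsets is available. The CharsetCheck class and its canEncode helper are illustrative names, not part of any library.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Sketch only: CharsetCheck and its static canEncode helper are hypothetical names.
final class CharsetCheck {

    // Returns true if every character of s can be encoded in the given charset.
    static boolean canEncode(String s, Charset charset) {
        // A fresh encoder per call, because CharsetEncoder is not thread-safe.
        CharsetEncoder encoder = charset.newEncoder();
        return encoder.canEncode(s);
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an acute accent; U+00E9 exists in ISO-8859-1.
        System.out.println(canEncode("caf\u00e9", StandardCharsets.ISO_8859_1)); // true
        // U+20AC (euro sign) is not part of ISO-8859-1.
        System.out.println(canEncode("\u20ac", StandardCharsets.ISO_8859_1));    // false
        System.out.println(canEncode("\u20ac", StandardCharsets.UTF_8));         // true
    }
}
```

For a charset named only at run time, Charset.forName(name) can be passed instead of a StandardCharsets constant. It throws UnsupportedCharsetException if the Java implementation does not provide that charset.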
