Determine if characters in a string are all of a specific character set

Posted by 吃可爱长大的小学妹 on 2019-12-28 05:59:09

Question


I need to be able to take a string in Java and determine whether or not all of the characters contained within it are in a specified character set (e.g. ISO-8859-1). I've looked around quite a bit for a simple way to do this (including playing around with a CharsetDecoder), but have yet to be able to find something.

What is the best way to take a string and determine if all the characters are within a given character set?


Answer 1:


The CharsetEncoder class in the java.nio.charset package offers a canEncode method that tests whether a given character or character sequence can be encoded in that charset.

Michael basically did something like this:

Charset.forName( CharEncoding.ISO_8859_1 ).newEncoder().canEncode("string")

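Note that CharEncoding.ISO_8859_1 comes from Apache Commons Lang and may be replaced by the string literal "ISO-8859-1". Since Java 7 you can also skip Charset.forName entirely and use the constant StandardCharsets.ISO_8859_1.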
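For reference, here is a minimal, self-contained sketch of the canEncode approach using only the JDK. The class name CharsetCheck and the helper isEncodable are made up for illustration and are not part of the original answer. Non-ASCII characters are written as Unicode escapes so the result does not depend on the source file's encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class CharsetCheck {

    // Hypothetical helper: returns true if every character in text
    // can be encoded in the given charset.
    static boolean isEncodable(String text, Charset charset) {
        // CharsetEncoder instances are stateful and not thread-safe,
        // so this sketch creates a fresh one for every call.
        CharsetEncoder encoder = charset.newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is part of ISO-8859-1.
        System.out.println(isEncodable("caf\u00e9", StandardCharsets.ISO_8859_1));
        // U+20AC (euro sign) is not part of ISO-8859-1.
        System.out.println(isEncodable("\u20ac100", StandardCharsets.ISO_8859_1));
    }
}
```

If you call this in a hot loop, you may prefer to reuse one encoder per thread instead of creating a new one each time.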




Answer 2:


I think the easiest way is to build a table of the Unicode characters that can be represented in the target character set and then test each character in the string against it. For the ISO-8859 family, the table can usually be expressed as one or a few ranges of Unicode characters, which makes the test relatively easy. It's a lot of manual work, but it needs to be done only once.

EDIT: or use Aubin's answer if the charset is supported in your Java implementation. :)
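The range-table idea from this answer could look roughly like the sketch below. It assumes the IANA/Java definition of ISO-8859-1, in which the charset maps one-to-one onto U+0000 through U+00FF, control characters included. The class and method names are hypothetical, and tables for other ISO-8859 parts would have to be filled in by hand.

```java
final class Latin1RangeCheck {

    // Inclusive [low, high] code point ranges. Assumption: ISO-8859-1 as
    // defined by IANA and implemented in Java covers exactly U+0000..U+00FF.
    private static final int[][] ISO_8859_1_RANGES = { {0x0000, 0x00FF} };

    // Returns true if every code point of text falls inside one of the ranges.
    static boolean allInRanges(String text, int[][] ranges) {
        return text.codePoints().allMatch(cp -> inAnyRange(cp, ranges));
    }

    // Linear scan over the table; fine for the handful of ranges expected here.
    private static boolean inAnyRange(int codePoint, int[][] ranges) {
        for (int[] range : ranges) {
            if (codePoint >= range[0] && codePoint <= range[1]) {
                return true;
            }
        }
        return false;
    }

    // Convenience wrapper for the ISO-8859-1 table above.
    static boolean isLatin1(String text) {
        return allInRanges(text, ISO_8859_1_RANGES);
    }
}
```

For example, isLatin1("caf\u00e9") passes the check while isLatin1("\u20ac") does not, because U+20AC lies outside the single range in the table.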



Source: https://stackoverflow.com/questions/13144250/determine-if-characters-in-a-string-are-all-of-a-specific-character-set
