I am trying to solve this interview question.
After given clearly definition of UTF-8 format. ex: 1-byte : 0b0xxxxxxx 2- bytes:.... Asked to wri
the CharsetDecoder
might be what you are looking for:
@Test
public void testUTF8() throws CharacterCodingException {
// the desired charset
final Charset UTF8 = Charset.forName("UTF-8");
// prepare decoder
final CharsetDecoder decoder = UTF8.newDecoder();
decoder.onMalformedInput(CodingErrorAction.REPORT);
decoder.onUnmappableCharacter(CodingErrorAction.REPORT);
byte[] bytes = new byte[48];
new Random().nextBytes(bytes);
ByteBuffer buffer = ByteBuffer.wrap(bytes);
try {
decoder.decode(buffer);
fail("Should not be UTF-8");
} catch (final CharacterCodingException e) {
// noop, the test should fail here
}
final String string = "hallo welt!";
bytes = string.getBytes(UTF8);
buffer = ByteBuffer.wrap(bytes);
final String result = decoder.decode(buffer).toString();
assertEquals(string, result);
}
so your function might look like that:
public static boolean checkEncoding(final byte[] bytes, final String encoding) {
final CharsetDecoder decoder = Charset.forName(encoding).newDecoder();
decoder.onMalformedInput(CodingErrorAction.REPORT);
decoder.onUnmappableCharacter(CodingErrorAction.REPORT);
final ByteBuffer buffer = ByteBuffer.wrap(bytes);
try {
decoder.decode(buffer);
return true;
} catch (final CharacterCodingException e) {
return false;
}
}