StringTokenizer
? Convert the String
to a char[]
and iterate over that? Something else?
In Java 8 we can solve it as:
String str = "xyz";
str.chars().forEachOrdered(i -> System.out.print((char)i));
str.codePoints().forEachOrdered(i -> System.out.print((char)i));
The method chars() returns an IntStream
as mentioned in doc:
Returns a stream of int zero-extending the char values from this sequence. Any char which maps to a surrogate code point is passed through uninterpreted. If the sequence is mutated while the stream is being read, the result is undefined.
The method codePoints()
also returns an IntStream
as per doc:
Returns a stream of code point values from this sequence. Any surrogate pairs encountered in the sequence are combined as if by Character.toCodePoint and the result is passed to the stream. Any other code units, including ordinary BMP characters, unpaired surrogates, and undefined code units, are zero-extended to int values which are then passed to the stream.
How is char and code point different? As mentioned in this article:
Unicode 3.1 added supplementary characters, bringing the total number of characters to more than the 2^16 = 65536 characters that can be distinguished by a single 16-bit
char
. Therefore, achar
value no longer has a one-to-one mapping to the fundamental semantic unit in Unicode. JDK 5 was updated to support the larger set of character values. Instead of changing the definition of thechar
type, some of the new supplementary characters are represented by a surrogate pair of twochar
values. To reduce naming confusion, a code point will be used to refer to the number that represents a particular Unicode character, including supplementary ones.
Finally why forEachOrdered
and not forEach
?
The behaviour of forEach
is explicitly nondeterministic where as the forEachOrdered
performs an action for each element of this stream, in the encounter order of the stream if the stream has a defined encounter order. So forEach
does not guarantee that the order would be kept. Also check this question for more.
For difference between a character, a code point, a glyph and a grapheme check this question.