Question
Strings in Swift 2.0 no longer conform to CollectionType. Each character in a String is now an Extended Grapheme Cluster.
Without digging too deep into the cluster details, I tried a few things with Swift Strings:
String now has a characters property that contains what we humans recognize as characters. Each distinct character in the string is considered a character, and the count property gives us the number of distinct characters.
What I don't quite understand is this: even though the characters count shows 10, why does the index show each emoji occupying 2 indexes?
Answer 1:
In Swift 2.0, the index of a String is no longer related to the number of characters (count). It is an “opaque” struct (defined as CharacterView.Index) used only to iterate through the characters of a string. So even if it is printed as an integer, it should not be treated or used as one: you cannot, for instance, add 2 to it to get the second character after the current one. What you can do is apply the two methods predecessor and successor to get the previous or next index in the String. So, for instance, to get the second character after the one at index idx in mixedString you can do:
mixedString[idx.successor().successor()]
Of course, there are more comfortable ways of reading the characters of a string, such as the for statement or the indices property of the characters view (the global function indices(_:) in versions before Swift 2.0).
Consider that the main benefit of this approach is not so much the handling of multi-byte characters in Unicode strings, such as emoji, but rather the uniform treatment of strings that look identical (to us humans!) yet can have multiple representations in Unicode, as different sequences of “scalars”. An example is café, which can be represented either with four Unicode scalars (using the precomposed é, U+00E9) or with five (a plain e followed by the combining acute accent, U+0301). Note that this is a completely different matter from Unicode encodings like UTF-8, UTF-16, etc., which are ways of mapping Unicode scalars to bytes in memory.
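The café case can be checked with a short sketch. The constant names precomposed and decomposed are illustrative and do not come from the question. The snippet deliberately uses only the unicodeScalars and utf8 views so that it should not depend on how a particular Swift version exposes the character count.

```swift
// Two spellings of "café" that look identical when rendered.
let precomposed = "caf\u{E9}"      // é as a single scalar, U+00E9
let decomposed = "cafe\u{301}"     // e followed by combining acute accent, U+0301

// String comparison is based on canonical equivalence, not on raw scalars.
let sameText = (precomposed == decomposed)

// The underlying scalar sequences differ in length.
let precomposedScalars = precomposed.unicodeScalars.count
let decomposedScalars = decomposed.unicodeScalars.count

// Encodings are a separate layer: the UTF-8 byte counts differ as well.
let precomposedBytes = precomposed.utf8.count
let decomposedBytes = decomposed.utf8.count

print("equal: \(sameText)")
print("scalars: \(precomposedScalars) vs \(decomposedScalars)")
print("UTF-8 bytes: \(precomposedBytes) vs \(decomposedBytes)")
```

The two strings compare equal even though one holds four scalars and the other five, which is the uniform treatment described above.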
Answer 2:
An Extended Grapheme Cluster can still occupy multiple bytes; however, the correct way to determine the index position of a character would be:
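import Foundation  // rangeOfString is provided through Foundation's NSString bridging
let mixed = "MADE IN THE USA 🇺🇸"
let index = mixed.rangeOfString("🇺🇸")
// Character offset of the match from the start of the string.
// Swift 2.0 spells this distanceTo; earlier versions used the global distance(_:_:).
let intIndex: Int = mixed.startIndex.distanceTo(index!.startIndex)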
Result:
16
The way you are trying to get the index would normally be meant for an array, and I think Swift cannot properly work that out with your mixedString.
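To make the "multiple bytes" remark concrete, here is a minimal sketch that counts how many scalars and code units the flag occupies. The constant names are illustrative, and the flag is written with Unicode escapes (regional indicators U+1F1FA and U+1F1F8), so the snippet does not depend on the file's encoding.

```swift
// The flag is a single grapheme cluster built from two regional-indicator scalars.
let flag = "\u{1F1FA}\u{1F1F8}"
let mixed = "MADE IN THE USA " + flag   // 16 ASCII characters, then the flag

// One visible character, but several units in each lower-level view.
let flagScalars = flag.unicodeScalars.count
let flagUTF16Units = flag.utf16.count
let flagUTF8Bytes = flag.utf8.count

// The whole string: each ASCII character is one unit in every view.
let mixedScalars = mixed.unicodeScalars.count
let mixedUTF16Units = mixed.utf16.count
let mixedUTF8Bytes = mixed.utf8.count

print("flag: \(flagScalars) scalars, \(flagUTF16Units) UTF-16 units, \(flagUTF8Bytes) UTF-8 bytes")
print("mixed: \(mixedScalars) scalars, \(mixedUTF16Units) UTF-16 units, \(mixedUTF8Bytes) UTF-8 bytes")
```

Because the flag alone accounts for four UTF-16 units and eight UTF-8 bytes, a position counted in any of these views will differ from the character offset of 16 computed above for anything that follows an emoji.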
Source: https://stackoverflow.com/questions/32164218/swift-2-0-string-behavior