Question
I'm attempting to split a string I've read from a barcode into an array in Swift and I'm getting somewhat lost in the discussion of codepoints, unicode scalars and grapheme clusters...
The barcode string contains "FNC1" delimiters, which I believe have an ASCII value of either 232 or 29 (I've found conflicting documentation), so the string is of this form:
FNC1019931265099999891T77FNC1203000FNC19247
I'd expect the correct array split output to be:
["019931265099999891T77", "1203000", "19247"]
I've tried an approach like this:
var codeArray = barcodeString.componentsSeparatedByString("\u{232}")
and
var codeArray = barcodeString.componentsSeparatedByString("\u{29}")
But neither "\u{232}" nor "\u{29}" is found, so either my syntax is wrong or the ASCII value of FNC1 is incorrect.
If I loop through the barcodeString printing the utf8 values for each character, the FNC1 character displays as the integer 29. However, I believe this is a code point rather than an integer: I certainly can't do an integer-based comparison to detect it, as that gives a compiler error.
What would be the correct way to work out how this character is represented in a Swift string and to compare/split against it?
Update: The problem boils down to how to find the ASCII code value of a single character and how to go the other way, generating a character from an integer ASCII code value.
I've posted my hacky solution to this as an answer but there must be a neater, more robust way to do it.
Answer 1:
The best I've come up with is to loop through the string, converting each character to a string so that I can get a numeric value for it.
I can't find a way to get the ASCII value of a character directly, so each character in turn is converted to a string. The unicodeScalars property then gives access to the values that make up that string. These values are UInt32, so with a bit of type conversion they can be compared to the integer value of the non-printable character.
It's messy, but so far it's the only answer I've found.
import Foundation

func barcodeStringToArray(inputString: String, asciiValue: Int, splitString: String) -> [String] {
    var replacedString = ""
    for myChar in inputString {
        let tmpString: String = String(myChar)
        // A Character can be an extended grapheme cluster made of several
        // unicode scalars, so every scalar is checked, not just the first.
        for scalar in tmpString.unicodeScalars {
            if scalar.value == UInt32(asciiValue) {
                // Swap the non-printable delimiter for a printable one
                replacedString += splitString
            } else {
                replacedString += "\(scalar)"
            }
        }
    }
    // Split on the printable delimiter, then remove any empty strings
    return replacedString.components(separatedBy: splitString).filter { $0 != "" }
}
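As a possibly neater alternative, note that Swift's `\u{...}` escape takes a hexadecimal value: "\u{29}" is U+0029 (a closing parenthesis) and "\u{232}" is U+0232, so neither matches decimal 29 or 232. The sketch below assumes the scanner delivers FNC1 as ASCII 29 (the group separator, hex 1D), as the UTF-8 dump in the question suggests, and assumes Swift 5 or later for `asciiValue`. The sample `barcodeString` is made up for illustration.

```swift
// Decimal 29 is hex 1D, the ASCII group separator (GS).
let gs: Character = "\u{1D}"

// Hypothetical scanner output, with GS standing in for FNC1.
let barcodeString = "\u{1D}019931265099999891T77\u{1D}1203000\u{1D}19247"

// split(separator:) omits empty pieces by default, so the leading GS
// does not produce an empty first element.
let codeArray = barcodeString.split(separator: gs).map(String.init)
print(codeArray) // ["019931265099999891T77", "1203000", "19247"]

// Character -> integer code (nil if the character is not ASCII).
let code = gs.asciiValue

// Integer code -> Character.
let fromCode = Character(UnicodeScalar(UInt8(29)))
```

If the scanner turns out to emit a different code for FNC1, only the `gs` constant needs to change.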
Answer 2:
I found an interesting case.
Method 1
var data:[String] = split( featureData ) { $0 == "\u{003B}" }
When I used this to split data loaded from a server on a given symbol, it split correctly in the simulator and on a connected test device, but it did not split in the published app or in Ad Hoc builds.
It took me a long time to track this down. It may be caused by a particular Swift version or iOS version, or by neither.
It is not related to HTML encoding either: I tried stringByRemovingPercentEncoding and it still did not work.
Method 2
var data:[String] = featureData.componentsSeparatedByString("\u{003B}")
When I used this, the same data loaded from the server was split correctly.
In conclusion, I really suggest using method 2:
string.componentsSeparatedByString("")
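For reference, both methods are spelled differently in current Swift. The following is a minimal sketch, assuming Swift 5 with Foundation available; the sample `featureData` value is made up. It does not explain the release-build behaviour described above, but it does show one difference worth keeping in mind: the two calls treat empty pieces differently.

```swift
import Foundation

// Hypothetical sample data; "\u{003B}" is the semicolon.
let featureData = "a;b;;c"

// Method 1, modern spelling: standard-library split with a predicate.
// Empty pieces are omitted by default.
let viaSplit = featureData.split { $0 == "\u{003B}" }.map(String.init)

// Method 2, modern spelling: Foundation's components(separatedBy:).
// Empty pieces are kept.
let viaComponents = featureData.components(separatedBy: "\u{003B}")

print(viaSplit)       // ["a", "b", "c"]
print(viaComponents)  // ["a", "b", "", "c"]
```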
Answer 3:
Swift 4.
extension String {
    // Returns a copy with every unicode scalar equal to `character` removed.
    func removingAllInstancesOfChar(character: UInt32) -> String {
        var returnString = String()
        for myChar in self {
            let tmpString: String = String(myChar)
            for scalar in tmpString.unicodeScalars {
                if scalar.value != character {
                    returnString += "\(scalar)"
                }
            }
        }
        return returnString
    }

    // Returns a copy with every unicode scalar equal to `character`
    // replaced by `replacement`.
    func replaceAllInstancesOfChar(character: UInt32, replacement: String) -> String {
        var replacedString = ""
        for myChar in self {
            let tmpString: String = String(myChar)
            // Every scalar of the character is checked, so combining
            // marks and multi-scalar characters are preserved.
            for scalar in tmpString.unicodeScalars {
                if scalar.value == character {
                    replacedString += replacement
                } else {
                    replacedString += "\(scalar)"
                }
            }
        }
        return replacedString
    }
}
I updated @benz001's code a bit so that it just processes the string by unicode scalar value. You can replace or remove, whichever you need.
For example:
inputString.replaceAllInstancesOfChar(character: 29, replacement: "|") // separators
inputString.removingAllInstancesOfChar(character: 30) // start/stop byte
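The same two operations can also be written against the string's `unicodeScalars` view directly, without converting each character to a temporary string. This is only a sketch: the helper names `removingScalars(withValue:)` and `replacingScalars(withValue:with:)` and the sample input are hypothetical.

```swift
extension String {
    // Hypothetical helper: drops every scalar whose value equals `value`.
    func removingScalars(withValue value: UInt32) -> String {
        let kept = unicodeScalars.filter { $0.value != value }
        return String(String.UnicodeScalarView(kept))
    }

    // Hypothetical helper: swaps every matching scalar for `replacement`.
    func replacingScalars(withValue value: UInt32, with replacement: String) -> String {
        var result = ""
        for scalar in unicodeScalars {
            if scalar.value == value {
                result += replacement
            } else {
                result.unicodeScalars.append(scalar)
            }
        }
        return result
    }
}

// Made-up input: 30 (hex 1E) as start/stop byte, 29 (hex 1D) as separator.
let input = "\u{1E}AB\u{1D}CD\u{1D}EF\u{1E}"
let cleaned = input
    .removingScalars(withValue: 30)
    .replacingScalars(withValue: 29, with: "|")
print(cleaned) // AB|CD|EF
```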
Source: https://stackoverflow.com/questions/25973153/splitting-a-string-on-a-non-printable-character-in-swift