Splitting a string on a non-printable character in Swift

Submitted by 一曲冷凌霜 on 2019-12-13 03:00:43

Question


I'm attempting to split a string I've read from a barcode into an array in Swift and I'm getting somewhat lost in the discussion of codepoints, unicode scalars and grapheme clusters...

The barcode string contains "FNC1" delimiters, which I believe have an ASCII value of either 232 or 29 (I've found conflicting documentation), so the string is of this form:

FNC1019931265099999891T77FNC1203000FNC19247

I'd expect the correct array split output to be: ["019931265099999891T77", "1203000", "19247"]

I've tried an approach like this:

var codeArray = barcodeString.componentsSeparatedByString("\u{232}") and var codeArray = barcodeString.componentsSeparatedByString("\u{29}")

But neither "\u{232}" nor "\u{29}" is being found, so either my syntax is wrong or the ASCII value I have for FNC1 is incorrect.

If I loop through the barcodeString printing the utf8 values for each character the FNC1 character displays as if it were the integer 29, however I believe this is a codepoint not an integer - I certainly can't do an integer based comparison to detect it, that gives a compiler error.

What would be the correct way to work out how this character is represented in a Swift string and to compare/split against it?
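One thing worth checking, offered as a hedged note rather than a confirmed diagnosis: Swift's `\u{...}` escape takes a hexadecimal value, so "\u{29}" is U+0029 (a closing parenthesis) and "\u{232}" is U+0232, and neither is decimal 29 or decimal 232. Assuming the scanner really sends decimal 29 (the ASCII group separator, as the utf8 dump suggests), the escape would be "\u{1D}". The sketch below uses the current `components(separatedBy:)` spelling and a sample string built by hand from the example above.

```swift
import Foundation

// Assumption: the scanner emits decimal 29 (hex 1D, ASCII group separator) for FNC1.
let groupSeparator = "\u{1D}"   // hex 1D == decimal 29

// Hand-built stand-in for the scanned barcode string.
let barcodeString = groupSeparator + "019931265099999891T77"
    + groupSeparator + "1203000"
    + groupSeparator + "19247"

// "\u{29}" is hex 29 == decimal 41, which is the ")" character.
let wrongDelimiter = "\u{29}"

// components(separatedBy:) keeps the empty field before the leading
// delimiter, so empty strings are filtered out afterwards.
let codeArray = barcodeString
    .components(separatedBy: groupSeparator)
    .filter { !$0.isEmpty }

print(wrongDelimiter)   // )
print(codeArray)        // ["019931265099999891T77", "1203000", "19247"]
```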

Update: The problem boils down to how to find the ASCII code value of a single character and how to go the other way, generating a character if you have an integer ASCII code value.

I've posted my hacky solution to this as an answer but there must be a neater, more robust way to do it.
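For what it's worth, the round trip described in the update can be sketched in a few lines of current Swift. This assumes Swift 5 or later (`Character.asciiValue` does not exist in earlier versions), and `isDelimiter` is a hypothetical helper name used only for illustration.

```swift
// Character -> integer code
let separator: Character = "\u{1D}"

// Every Character has at least one unicode scalar; its value is a UInt32.
let viaScalar: UInt32 = separator.unicodeScalars.first!.value

// Swift 5+: asciiValue is nil when the character is not ASCII.
let viaAscii: UInt8? = separator.asciiValue

// Integer code -> Character (the UInt8 initializer cannot fail).
let code: UInt8 = 29
let rebuilt = Character(Unicode.Scalar(code))

// Hypothetical helper: an integer-based comparison against a character.
func isDelimiter(_ ch: Character) -> Bool {
    return ch.asciiValue == 29
}

print(viaScalar)               // 29
print(rebuilt == separator)    // true
print(isDelimiter("A"))        // false
```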


Answer 1:


So the best I've come up with is to loop through the string looking at each character, converting each individual character to a string so I can then get a value for it.

I can't find a way to get the ASCII value of a character directly, so each character in turn is converted to a string. The unicodeScalars property of that string then gives access to the values that represent its elements. These values are UInt32, so they can be compared to the integer value of the non-printable character with a bit of typecasting.

Messy but so far the only answer I've found.

    // Swift 4+ spelling; needs `import Foundation` for components(separatedBy:).
    func barcodeStringToArray(inputString: String, asciiValue: Int, splitString: String) -> Array<String>? {
        var results = [""]
        var replacedString = ""

        for myChar in inputString {
            let tmpString: String = String(myChar)
            // A Character is an extended grapheme cluster and can hold several
            // unicode scalars, so check every scalar rather than only the first.
            for scalar in tmpString.unicodeScalars {
                if scalar.value == UInt32(asciiValue) {
                    // Swap the non-printable delimiter for the printable split string
                    replacedString += splitString
                } else {
                    replacedString += "\(scalar)"
                }
            }
        }
        results = replacedString.components(separatedBy: splitString)
        // Now remove any empty strings
        results = results.filter({ $0 != "" })
        return results
    }
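A shorter variant of the same idea, as a sketch only: in current Swift, `split(separator:)` accepts a Character and omits empty subsequences by default, so the replace-then-split step and the filter may not be needed. `splitBarcode` is a hypothetical name, and the default delimiter assumes FNC1 arrives as decimal 29 (hex 1D).

```swift
// Hypothetical helper; assumes the delimiter is a single Character.
func splitBarcode(_ input: String, on delimiter: Character = "\u{1D}") -> [String] {
    // split(separator:) drops empty subsequences by default, so a leading
    // or doubled delimiter does not produce empty fields.
    return input.split(separator: delimiter).map { String($0) }
}

// Hand-built sample in the shape described in the question.
let sample = "\u{1D}019931265099999891T77\u{1D}1203000\u{1D}19247"
print(splitBarcode(sample))
// ["019931265099999891T77", "1203000", "19247"]
```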



Answer 2:


I found an interesting case:

method 1

var data:[String] = split( featureData ) { $0 == "\u{003B}" }

With this call, the data loaded from the server was split correctly when testing in the simulator and on a connected test device, but it was not split in the published app or in Ad Hoc builds.

It took me a long time to track this down. It may be caused by a particular Swift version or iOS version, or by neither.

It is not related to HTML encoding either: I tried stringByRemovingPercentEncoding and it still did not work.


method 2

var data:[String] = featureData.componentsSeparatedByString("\u{003B}")

With this call, the same data loaded from the server is split correctly.


Conclusion: I strongly suggest using method 2:

string.componentsSeparatedByString("")
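For reference, here is how the two methods look in current Swift. This is only a sketch with made-up sample data, and it does not explain the release-build behaviour described above. It does show one semantic difference that is easy to miss: `split(separator:)` drops empty fields by default, while `components(separatedBy:)` keeps them.

```swift
import Foundation

let featureData = "a;b;;c"               // made-up sample data
let semicolon: Character = "\u{003B}"    // U+003B is ";"

// Method 1 in current Swift: the split method on String.
let viaSplit = featureData.split(separator: semicolon).map { String($0) }

// Method 2 in current Swift: Foundation's components(separatedBy:).
let viaComponents = featureData.components(separatedBy: "\u{003B}")

print(viaSplit)        // ["a", "b", "c"]
print(viaComponents)   // ["a", "b", "", "c"]
```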



Answer 3:


Swift 4.

    extension String {
        /// Returns a copy with every unicode scalar equal to `character` removed.
        func removingAllInstancesOfChar(character: UInt32) -> String {
            var returnString = String()

            for myChar in self {
                let tmpString: String = String(myChar)
                for scalar in tmpString.unicodeScalars {
                    if scalar.value != character {
                        returnString += "\(scalar)"
                    }
                }
            }

            return returnString
        }

        /// Returns a copy with every unicode scalar equal to `character` replaced by `replacement`.
        func replaceAllInstancesOfChar(character: UInt32, replacement: String) -> String {
            var replacedString = ""

            for myChar in self {
                let tmpString: String = String(myChar)
                // No early `break`: a Character can hold several scalars
                // (for example "\r\n"), and all of them should be kept.
                for scalar in tmpString.unicodeScalars {
                    if scalar.value == character {
                        replacedString += replacement
                    } else {
                        replacedString += "\(scalar)"
                    }
                }
            }
            return replacedString
        }
    }

I updated @benz001's code a bit so that it simply processes the string by unicode scalar value. You can replace a character or remove it, whichever you need.

So, for example:

    inputString.replaceAllInstancesOfChar(character: 29, replacement: "|") // separators
    inputString.removingAllInstancesOfChar(character: 30) // start/stop byte
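As an alternative sketch that needs no extension: Foundation's `replacingOccurrences(of:with:)` can do the replacement, and the removal can be done by copying the unicode scalars that should stay. The input below is made up, and the meaning of values 29 and 30 is taken from the usage lines above.

```swift
import Foundation

let separatorString = "\u{1D}"    // decimal 29, the separator
let startStopValue: UInt32 = 30   // decimal 30, the start/stop byte

// Made-up input: start byte, two fields split by a separator, stop byte.
let inputString = "\u{1E}01993\u{1D}1203\u{1E}"

// Replace every separator with a printable "|".
let replaced = inputString.replacingOccurrences(of: separatorString, with: "|")

// Remove the start/stop bytes by copying every other scalar.
var kept = String.UnicodeScalarView()
for scalar in replaced.unicodeScalars where scalar.value != startStopValue {
    kept.append(scalar)
}
let cleaned = String(kept)

print(cleaned)   // 01993|1203
```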


Source: https://stackoverflow.com/questions/25973153/splitting-a-string-on-a-non-printable-character-in-swift
