Why is the return value of String.addingPercentEncoding() optional?

Posted by 喜欢而已 on 2019-11-27 04:29:24

Question


The signature of the String method for percent-escaping is:

func addingPercentEncoding(withAllowedCharacters: CharacterSet)
    -> String?

(This was stringByAddingPercentEncodingWithAllowedCharacters in Swift 2.)

Why does this method return an optional?

The documentation says that the method returns nil “if the transformation is not possible,” but it's unclear under what circumstances the escaping transformation could fail:

  • Characters are escaped using UTF-8, which is a complete Unicode encoding. Any valid Unicode character can be encoded using UTF-8, and thus can be escaped.

  • I thought perhaps the method applied some kind of sanity check for bad interactions between the set of allowed chars and the chars used for escaping, but this is not the case: the method succeeds no matter whether the set of allowed chars contains "%", and also succeeds if the allowed char set is empty.
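The two observations above can be checked with a short sketch. The sample string and the variable names are illustrative assumptions, not part of the original question; the sketch only shows that each call yields a non-nil result for a well-formed string.

```swift
import Foundation

// Illustrative input (an assumption): ASCII letters, a literal "%",
// a space, and one non-ASCII character (U+20AC).
let input = "a%b €"

// Only alphanumerics allowed: "%", the space, and the UTF-8 bytes of "€" are escaped.
let strict = input.addingPercentEncoding(withAllowedCharacters: .alphanumerics)
print(strict ?? "nil")   // a%25b%20%E2%82%AC

// "%" added to the allowed set: the call still succeeds and leaves "%" alone.
let allowPercent = CharacterSet.alphanumerics.union(CharacterSet(charactersIn: "%"))
let lenient = input.addingPercentEncoding(withAllowedCharacters: allowPercent)
print(lenient ?? "nil")  // a%b%20%E2%82%AC

// Empty allowed set: the call is still expected to return a non-nil string.
let everything = input.addingPercentEncoding(withAllowedCharacters: CharacterSet())
print(everything ?? "nil")
```

None of these inputs gives the method a reason to fail, which is what makes the optional return type puzzling at first glance.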

As it stands, the optional return value appears to force a nonsensical error check.


Answer 1:


I filed a bug report with Apple about this, and heard back — with a very helpful response, no less!

Turns out (much to my surprise) that it’s possible to successfully create Swift strings that contain invalid Unicode in the form of unpaired UTF-16 surrogate chars. Such a string can cause UTF-8 encoding to fail. Here’s some code that illustrates this behavior:

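import Foundation

// [0xD8, 0x00] is the UTF-16 big-endian encoding of U+D800,
// a high surrogate with no low surrogate following it.
// Succeeds (wat?!):
let str = String(
    bytes: [0xD8, 0x00] as [UInt8],
    encoding: String.Encoding.utf16BigEndian)!

// Returns nil:
str.addingPercentEncoding(withAllowedCharacters:
    CharacterSet.alphanumerics)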
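
Given that nil signals invalid Unicode in the input, one way a caller might handle it is to turn the optional into a thrown error. This is only a sketch: `PercentEncodingError` and `percentEncoded(_:allowed:)` are hypothetical names introduced here, and the default allowed set is an arbitrary choice.

```swift
import Foundation

// Hypothetical error type for this sketch; not part of Foundation.
enum PercentEncodingError: Error {
    case invalidUnicode
}

// Hypothetical helper: wraps addingPercentEncoding and throws when it returns nil,
// which per the answer above can happen for strings with unpaired surrogates.
func percentEncoded(_ string: String,
                    allowed: CharacterSet = .urlQueryAllowed) throws -> String {
    guard let encoded = string.addingPercentEncoding(withAllowedCharacters: allowed) else {
        throw PercentEncodingError.invalidUnicode
    }
    return encoded
}

// Usage with a well-formed string:
do {
    print(try percentEncoded("a b"))  // a%20b
} catch {
    print("Could not percent-encode: \(error)")
}
```

Throwing keeps the failure visible without a force-unwrap, so malformed input reaches an explicit error path instead of crashing.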



Answer 2:


Building on Paul Cantrell's answer, here is a small demonstration that the same method can also return nil in Objective-C, even though String and NSString are different beasts when it comes to encodings:

uint8_t bytes[2] = { 0xD8, 0x00 };
NSString *string = [[NSString alloc] initWithBytes:bytes length:2 encoding:NSUTF16BigEndianStringEncoding];
// \ud800
NSLog(@"%@", string);

NSString *escapedString = [string stringByAddingPercentEncodingWithAllowedCharacters:NSCharacterSet.URLHostAllowedCharacterSet];
// (null)
NSLog(@"%@", escapedString);

For fun, https://r12a.github.io/app-conversion/ percent-escapes the same input as:

Error%20in%20convertUTF162Char%3A%20low%20surrogate%20expected%2C%20b%3D0%21%00



Source: https://stackoverflow.com/questions/33558933/why-is-the-return-value-of-string-addingpercentencoding-optional
