NSLog() vs printf() when printing C string (UTF-8)

只愿长相守 提交于 2019-12-07 12:18:09

问题


I have noticed that if I try to print the byte array containing the representation of a string in UTF-8, using the format specifier "%s", printf() gets it right but NSLog() gets it garbled (i.e., each byte printed as-is, so for example "¥" gets printed as the 2 characters: "¬•"). This is curious, because I always thought that NSLog() is just printf(), plus:

  1. The first parameter (the 'format') is an Objective-C string, not a C string (hence the "@").
  2. The timestamp and app name prepended.
  3. The newline automatically added at the end.
  4. The ability to print Objective-C objects (using the format "%@").

My code:

NSString* string; 

// (...fill string with unicode string...)

const char* stringBytes = [string cStringUsingEncoding:NSUTF8Encoding];

NSUInteger stringByteLength = [string lengthOfBytesUsingEncoding:NSUTF8Encoding];
stringByteLength += 1; // add room for '\0' terminator

char* buffer = calloc(sizeof(char), stringByteLength);

memcpy(buffer, stringBytes, stringByteLength);

NSLog(@"Buffer after copy: %s", buffer);
// (renders ascii, no matter what)

printf("Buffer after copy: %s\n", buffer);
// (renders correctly, e.g. japanese text)

Somehow, it looks as if printf() is "smarter" than NSLog(). Does anyone know the underlying cause, and if this feature is documented anywhere? (Couldn't find)


回答1:


NSLog() and stringWithFormat: seem to expect the string for %s in the "system encoding" (for example "Mac Roman" on my computer):

NSString *string = @"¥";
NSStringEncoding enc = CFStringConvertEncodingToNSStringEncoding(CFStringGetSystemEncoding());
const char* stringBytes = [string cStringUsingEncoding:enc];
NSString *log = [NSString stringWithFormat:@"%s", stringBytes];
NSLog(@"%@", log);

// Output: ¥

Of course this will fail if some characters are not representable in the system encoding. I could not find an official documentation for this behavior, but one can see that using %s in stringWithFormat: or NSLog() does not reliably work with arbitrary UTF-8 strings.

If you want to check the contents of a char buffer containing an UTF-8 string, then this would work with arbitrary characters (using the boxed expression syntax to create an NSString from a UTF-8 string):

NSLog(@"%@", @(utf8Buffer));


来源:https://stackoverflow.com/questions/23671994/nslog-vs-printf-when-printing-c-string-utf-8

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!