NSString conversion to lowercase crashes

Submitted by 可紊 on 2019-12-13 18:23:56

Question


Xcode 4.6 (4H127), Xcode 4.6.3 (4H1503): a simple lowercase/uppercase conversion of a string containing an accented character crashes, depending on the Deployment Target setting. Code snippet:

NSString *lc1 = @"Bosnië-Herzegovina";
NSString *lc2 = [lc1 lowercaseString];
NSString *uc3 = [lc1 uppercaseString];
NSLog( @"\nlc1=%@\nlc2=%@\nuc3=%@ ", lc1,lc2,uc3);

The "ë" is simply typed as Option-u followed by e, and the source code file is a regular Unicode (UTF) file.

lc1 looks as expected in the debugger. However, lc2 and uc3 have "Chinese" characters appended at the end when the Deployment Target is below 6.1. With 6.1 selected, the Chinese characters are gone. That may simply be a UTF compatibility issue in the debugger, but with Deployment Target 5.0-5.1 the snippet actually crashes, as shown below, and that is my real problem: the strings in my actual application do not come from source code but from an SQLite database. So, for now, can I only build my app for Deployment Target 6.0+? Am I missing something?

0x1c49a20:  incl   %eax
0x1c49a21:  jmp    0x1c499fb                 ; CFUniCharMapCaseTo + 1275
0x1c49a23:  movl   12(%ebp), %eax
0x1c49a26:  movw   $105, (%eax)
0x1c49a2b:  movw   $775, 2(%eax)
0x1c49a31:  movl   $2, %eax
0x1c49a36:  jmp    0x1c49dac                 ; CFUniCharMapCaseTo + 2220
0x1c49a3b:  movl   12(%ebp), %eax
0x1c49a3e:  movw   $105, (%eax)
0x1c49a43:  movw   $775, 2(%eax)
0x1c49a49:  movw   $771, 4(%eax)
0x1c49a4f:  movl   $3, %eax
0x1c49a54:  jmp    0x1c49dac                 ; CFUniCharMapCaseTo + 2220
0x1c49a59:  movl   %eax, %edi
0x1c49a5b:  movl   1264482(%edi), %eax
0x1c49a61:  movl   (%eax), %eax
0x1c49a63:  movl   %eax, (%esp)
0x1c49a66:  movl   $0, 8(%esp)
0x1c49a6e:  movl   $48, 4(%esp)
0x1c49a76:  calll  0x1bd9980                 ; CFAllocatorAllocate
0x1c49a7b:  leal   16(%eax), %ecx
0x1c49a7e:  movl   %ecx, 1379418(%edi)
0x1c49a84:  leal   32(%eax), %ecx
0x1c49a87:  movl   %ecx, 1379422(%edi)
0x1c49a8d:  movl   1379410(%edi), %ecx
0x1c49a93:  movl   (%ecx), %ecx  <-- EXC_BAD_ACCESS (code=1,..
0x1c49a95:  movl   (%ecx), %ecx

Edit: I tried reducing the project to a minimal case that shows this problem, and... it disappeared. I have a bit of old-style C code that uses things like malloc, free, freed, memmove, etc. If this code is merely present in the project, without even being called, the problems described above occur. My guess now is that some routines are being loaded from a library they should not be loaded from. Digging further.


Answer 1:


This does not exactly answer your question, but since no one else has answered, I'll try: it would appear that there are no uppercase associations for those foreign characters.

Could you run a regex, or some other kind of string replacement, to replace all known special characters with a normalized (English) version? They would then have an uppercase or lowercase conversion.

Of course, this may completely ruin the strings you read from the DB, since they would no longer be spelled correctly.
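For illustration, the replacement idea above could be sketched with Foundation's diacritic folding instead of a hand-written regex. This is only a sketch under assumptions: the helper name `FoldedForCaseConversion` is hypothetical and not from the original post, and it relies on `-stringByFoldingWithOptions:locale:` with `NSDiacriticInsensitiveSearch`. Note that Unicode does define "Ë" as the uppercase of "ë", so this folding is a lossy workaround rather than something case conversion requires.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (an assumption, not part of the original question):
// folds accented characters to their unaccented base letters, e.g. "ë" -> "e".
// The result is lossy: the original spelling cannot be recovered from it.
static NSString *FoldedForCaseConversion(NSString *input) {
    // Passing nil as the locale uses the system locale for the folding.
    return [input stringByFoldingWithOptions:NSDiacriticInsensitiveSearch
                                      locale:nil];
}
```

Calling `[FoldedForCaseConversion(@"Bosnië-Herzegovina") lowercaseString]` would then operate on plain ASCII letters only. As the answer warns, the folded string is no longer spelled correctly, so it is better suited to comparison or search keys than to text shown to the user.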




Answer 2:


Well, my hunch that there was a problem with loading from libraries, or with the order of loading, made me change the order of the included frameworks: under "Build Phases" I spotted "CoreText.framework" as one of the last entries. I moved it to the top spot, and now everything works fine for all Deployment Targets: 5.0, 5.1, 6.0, and 6.1.

I also looked at the load map, which you can generate by setting LD_GENERATE_MAP_FILE to YES, but to no avail.

Another pointer came from editing the Scheme and switching on "Log library loads" and "Log API Usage": the output shows that code is loaded from various libraries, one of them being CoreText.framework.

In the end moving CoreText.framework to the top of the list made it all work.

You can still see the "Chinese" characters in the debugger when using Deployment Target 5.0-6.0. With 6.1 even they are gone, so I guess that has been fixed now.



Source: https://stackoverflow.com/questions/17431750/nsstring-conversion-to-lowercase-crashes
