NSString to treat “regular English alphabets” and characters like emoji or Japanese uniformly

Submitted by Deadly on 2019-12-18 05:20:12

Question


There is a text field in which I can enter characters. The characters can be a, b, c, d, etc., or a smiley face added using the emoji keyboard.

-(void)textFieldDidEndEditing:(UITextField *)textField{
    NSLog(@"len:%lu", (unsigned long)textField.text.length);
    NSLog(@"char:%c",[textField.text characterAtIndex:0]);
}

Currently, the above method gives the following output:

if textField.text = @"qq"
len:2
char:q

if textField.text = @"😄q"
len:3
char:=

What I need is

if textField.text = @"qq"
len:2
char:q

if textField.text = @"😄q"
len:2
char:😄

Any clue how to do this?


Answer 1:


NSString stores text as UTF-16, so characters outside Unicode plane 0 (the Basic Multilingual Plane), which includes most emoji, occupy two code units each. This makes counting difficult: it seems necessary to enumerate the composed character sequences to get the actual length.

Note: The NSString method length does not return the number of characters; it returns the number of UTF-16 code units (unichars). See "NSString and Unicode" in objc.io issue #9 (Strings).

Example code:

NSString *text = @"qqq😄rrr";
int maxCharacters = 4;

__block NSInteger unicharCount = 0; // UTF-16 code units consumed so far
__block NSInteger charCount = 0;    // composed character sequences seen so far
[text enumerateSubstringsInRange:NSMakeRange(0, text.length)
                         options:NSStringEnumerationByComposedCharacterSequences
                      usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
                          unicharCount += substringRange.length;
                          // Stop once maxCharacters sequences have been counted.
                          if (++charCount >= maxCharacters)
                              *stop = YES;
                      }];
// unicharCount is now the code-unit index just past the last kept character.
NSString *textStart = [text substringToIndex: unicharCount];
NSLog(@"textStart: '%@'", textStart);

textStart: 'qqq😄'
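
The same enumeration can be applied to the two values the question asks for. The following is a minimal sketch, not a definitive implementation: `ComposedLength` and `FirstComposedCharacter` are hypothetical helper names introduced here for illustration, built only on the Foundation methods `enumerateSubstringsInRange:options:usingBlock:` and `rangeOfComposedCharacterSequenceAtIndex:`.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: counts composed character sequences
// instead of UTF-16 code units.
static NSUInteger ComposedLength(NSString *text) {
    __block NSUInteger count = 0;
    [text enumerateSubstringsInRange:NSMakeRange(0, text.length)
                             options:NSStringEnumerationByComposedCharacterSequences
                          usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
                              count++;
                          }];
    return count;
}

// Hypothetical helper: returns the first composed character sequence
// as a string, or nil when the string is empty.
static NSString *FirstComposedCharacter(NSString *text) {
    if (text.length == 0) {
        return nil;
    }
    NSRange range = [text rangeOfComposedCharacterSequenceAtIndex:0];
    return [text substringWithRange:range];
}

int main(void) {
    @autoreleasepool {
        NSString *text = @"😄q";
        NSLog(@"len:%lu", (unsigned long)ComposedLength(text));
        // %@ prints the whole sequence; %c would print only one code unit.
        NSLog(@"char:%@", FirstComposedCharacter(text));
    }
    return 0;
}
```

For @"😄q" this logs len:2 and char:😄, which is the output the question asks for.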

An alternative approach is to use UTF-32 encoding, in which every code point occupies exactly 4 bytes:

int byteCount = maxCharacters*4; // 4 bytes per UTF-32 code unit
char buffer[byteCount];
NSUInteger usedBufferCount;
[text getBytes:buffer maxLength:byteCount usedLength:&usedBufferCount encoding:NSUTF32StringEncoding options:0 range:NSMakeRange(0, text.length) remainingRange:NULL];
NSString * textStart = [[NSString alloc] initWithBytes:buffer length:usedBufferCount encoding:NSUTF32LittleEndianStringEncoding];

There is some rationale for this in Session 128, "Advanced Text Processing", from WWDC 2011.
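
One caveat when choosing between the two approaches: UTF-32 counts code points, while the enumeration counts composed character sequences, and the two can differ. The sketch below illustrates this with "e" followed by U+0301 COMBINING ACUTE ACCENT, which is two code points but displays as one character. `CodePointCount` is a hypothetical helper name, and the sketch assumes `lengthOfBytesUsingEncoding:` reports 4 bytes per code point with no byte-order mark.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: number of Unicode code points, assuming
// 4 bytes per code point and no byte-order mark in the byte count.
static NSUInteger CodePointCount(NSString *text) {
    return [text lengthOfBytesUsingEncoding:NSUTF32StringEncoding] / 4;
}

int main(void) {
    @autoreleasepool {
        // One visible character made of a base letter and a combining mark.
        NSString *text = @"e\u0301";

        // Range of the composed character sequence that starts at index 0.
        NSRange first = [text rangeOfComposedCharacterSequenceAtIndex:0];

        NSLog(@"UTF-16 code units: %lu", (unsigned long)text.length);
        NSLog(@"code points: %lu", (unsigned long)CodePointCount(text));
        NSLog(@"first composed sequence covers %lu code units",
              (unsigned long)first.length);
    }
    return 0;
}
```

Here the first composed sequence covers both code units, so truncating by code points could cut the accent off its base letter, whereas truncating by composed character sequences would keep them together.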




Answer 2:


This is what I did to truncate a string containing emoji characters:

+ (NSUInteger)unicodeLength:(NSString *)string {
    // Each UTF-32 code unit is 4 bytes, so this is the number of code points.
    return [string lengthOfBytesUsingEncoding:NSUTF32StringEncoding] / 4;
}

+ (NSString *)unicodeString:(NSString *)string toLength:(NSUInteger)len {

    if (len >= string.length) {
        return string;
    }

    NSUInteger charposition = 0;
    // Also stop at the end of the string so substringToIndex: stays in range.
    for (NSUInteger i = 0; i < len && charposition < string.length; i++) {
        NSUInteger remainingChars = string.length - charposition;
        if (remainingChars >= 2) {
            NSString *s = [string substringWithRange:NSMakeRange(charposition, 2)];
            // Two code units that encode a single code point are a surrogate pair.
            if ([self unicodeLength:s] == 1) {
                charposition++;
            }
        }
        charposition++;
    }
    return [string substringToIndex:charposition];
}
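
The truncation above steps by code point. A variant that steps by composed character sequence could look like the following sketch, which is offered as an illustration rather than as part of the answer. `TruncateToComposedLength` is a hypothetical name, and the loop relies only on `rangeOfComposedCharacterSequenceAtIndex:` and `NSMaxRange`.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: keeps at most maxCharacters composed character
// sequences from the start of the string.
static NSString *TruncateToComposedLength(NSString *string, NSUInteger maxCharacters) {
    NSUInteger index = 0; // position in UTF-16 code units
    NSUInteger taken = 0; // composed character sequences consumed so far
    while (index < string.length && taken < maxCharacters) {
        // Jump to the end of the sequence that contains the current index.
        NSRange range = [string rangeOfComposedCharacterSequenceAtIndex:index];
        index = NSMaxRange(range);
        taken++;
    }
    return [string substringToIndex:index];
}

int main(void) {
    @autoreleasepool {
        NSLog(@"'%@'", TruncateToComposedLength(@"qqq😄rrr", 4));
        NSLog(@"'%@'", TruncateToComposedLength(@"😄😄", 3));
    }
    return 0;
}
```

This logs 'qqq😄' and '😄😄'. Because the loop checks the string length on every step, a limit larger than the number of characters simply returns the whole string.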


Source: https://stackoverflow.com/questions/23788938/nsstring-to-treat-regular-english-alphabets-and-characters-like-emoji-or-japan
