NSString - Convert to pure alphabet only (i.e. remove accents+punctuation)

暖寄归人 2020-12-02 15:49

I'm trying to compare names without any punctuation, spaces, accents etc. At the moment I am doing the following:

-(NSString*) prepareString:(NSString*)a {
         


        
13 Answers
  •  佛祖请我去吃肉
    2020-12-02 16:15

    Combining the answers from Luiz and Peter and adding a few lines gives the complete example below.

    The code does the following:

    1. Creates a set of accepted characters
    2. Turns accented letters into plain letters (via a lossy ASCII conversion)
    3. Removes characters not in the set

    Objective-C

    // The input text
    NSString *text = @"BûvérÈ!@$&%^&(*^(_()-*/48";
    
    // Create set of accepted characters
    NSMutableCharacterSet *acceptedCharacters = [[NSMutableCharacterSet alloc] init];
    [acceptedCharacters formUnionWithCharacterSet:[NSCharacterSet letterCharacterSet]];
    [acceptedCharacters formUnionWithCharacterSet:[NSCharacterSet decimalDigitCharacterSet]];
    [acceptedCharacters addCharactersInString:@" _-.!"];
    
    // Turn accented letters into normal letters (optional)
    NSData *sanitizedData = [text dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES];
    // NSData bytes are not NUL-terminated, so build the string from the data rather than from a C string
    NSString *sanitizedText = [[NSString alloc] initWithData:sanitizedData encoding:NSASCIIStringEncoding];
    
    // Remove characters not in the set
    NSString* output = [[sanitizedText componentsSeparatedByCharactersInSet:[acceptedCharacters invertedSet]] componentsJoinedByString:@""];
    

    Swift (2.2) example

    let text = "BûvérÈ!@$&%^&(*^(_()-*/48"
    
    // Create set of accepted characters
    let acceptedCharacters = NSMutableCharacterSet()
    acceptedCharacters.formUnionWithCharacterSet(NSCharacterSet.letterCharacterSet())
    acceptedCharacters.formUnionWithCharacterSet(NSCharacterSet.decimalDigitCharacterSet())
    acceptedCharacters.addCharactersInString(" _-.!")
    
    // Turn accented letters into normal letters (optional)
    let sanitizedData = text.dataUsingEncoding(NSASCIIStringEncoding, allowLossyConversion: true)
    let sanitizedText = String(data: sanitizedData!, encoding: NSASCIIStringEncoding)
    
    // Remove characters not in the set
    let components = sanitizedText!.componentsSeparatedByCharactersInSet(acceptedCharacters.invertedSet)
    let output = components.joinWithSeparator("")
    

    Output

    The output for both examples would be: BuverE!_-48
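
Swift (5) sketch

The Swift 2.2 spelling above no longer compiles on current toolchains, so here is a rough port of the same three steps to Swift 5 naming. It is a sketch rather than a definitive implementation: the function name `sanitize` is made up for illustration, and returning an empty string when the ASCII conversion fails is an assumption of this example, not something the original answer specifies.

```swift
import Foundation

// Hypothetical helper; the name `sanitize` is not part of the original answer.
func sanitize(_ text: String) -> String {
    // 1. Build the set of accepted characters: letters, digits, and " _-.!"
    var accepted = CharacterSet.letters
    accepted.formUnion(.decimalDigits)
    accepted.insert(charactersIn: " _-.!")

    // 2. Lossy ASCII round trip, as in the answer, to turn accented letters into plain ones.
    //    Assumption: fall back to an empty string if either conversion fails.
    guard let data = text.data(using: .ascii, allowLossyConversion: true),
          let ascii = String(data: data, encoding: .ascii) else {
        return ""
    }

    // 3. Split on every rejected character and join the remaining pieces.
    return ascii.components(separatedBy: accepted.inverted).joined()
}

print(sanitize("BûvérÈ!@$&%^&(*^(_()-*/48"))
```

On Apple platforms this should print the same result as the two examples above. To get letters only, as the question title asks, step 1 could be reduced to `CharacterSet.letters` alone.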
