问题

I need to find out whether a character in a string is an emoji.

For example, I have this character:

let string = \"😀\"
let character = Array(string)[0]

I need to find out if that character is an emoji.

回答1:

What I stumbled upon is the difference between characters, unicode scalars and glyphs.

For example, the glyph 👨‍👨‍👧‍👧 consists of 7 unicode scalars:

Four emoji characters: 👨👩👧👧
In between each emoji is a special character, which works like character glue; see the specs for more info

Another example, the glyph 👌🏿 consists of 2 unicode scalars:

The regular emoji: 👌
A skin tone modifier: 🏿

Last one, the glyph 1️⃣ contains three unicode characters:

The digit one: 1
The variation selector
The Combining Enclosing Keycap:⃣

So when rendering the characters, the resulting glyphs really matter.

Swift 5.0 and above makes this process much easier and gets rid of some guesswork we needed to do. Unicode.Scalar's new Property type helps is determine what we're dealing with. However, those properties only make sense when checking the other scalars within the glyph. This is why we'll be adding some convenience methods to the Character class to help us out.

For more detail, I wrote an article explaining how this works.

For Swift 5.0, it leaves you with the following result:

extension Character {
    /// A simple emoji is one scalar and presented to the user as an Emoji
    var isSimpleEmoji: Bool {
        guard let firstProperties = unicodeScalars.first?.properties else {
            return false
        }
        return unicodeScalars.count == 1 &&
            (firstProperties.isEmojiPresentation ||
                firstProperties.generalCategory == .otherSymbol)
    }

    /// Checks if the scalars will be merged into an emoji
    var isCombinedIntoEmoji: Bool {
        return unicodeScalars.count > 1 &&
            unicodeScalars.contains { $0.properties.isJoinControl || $0.properties.isVariationSelector }
    }

    var isEmoji: Bool {
        return isSimpleEmoji || isCombinedIntoEmoji
    }
}

extension String {
    var isSingleEmoji: Bool {
        return count == 1 && containsEmoji
    }

    var containsEmoji: Bool {
        return contains { $0.isEmoji }
    }

    var containsOnlyEmoji: Bool {
        return !isEmpty && !contains { !$0.isEmoji }
    }

    var emojiString: String {
        return emojis.map { String($0) }.reduce("", +)
    }

    var emojis: [Character] {
        return filter { $0.isEmoji }
    }

    var emojiScalars: [UnicodeScalar] {
        return filter{ $0.isEmoji }.flatMap { $0.unicodeScalars }
    }
}

Which will give you the following results:

"A̛͚̖".containsEmoji // false
"3".containsEmoji // false
"A̛͚̖▶️".unicodeScalars // [65, 795, 858, 790, 9654, 65039]
"A̛͚̖▶️".emojiScalars // [9654, 65039]
"3️⃣".isSingleEmoji // true
"3️⃣".emojiScalars // [51, 65039, 8419]
"👌🏿".isSingleEmoji // true
"🙎🏼‍♂️".isSingleEmoji // true
"👨‍👩‍👧‍👧".isSingleEmoji // true
"👨‍👩‍👧‍👧".containsOnlyEmoji // true
"Hello 👨‍👩‍👧‍👧".containsOnlyEmoji // false
"Hello 👨‍👩‍👧‍👧".containsEmoji // true
"👫 Héllo 👨‍👩‍👧‍👧".emojiString // "👫👨‍👩‍👧‍👧"
"👨‍👩‍👧‍👧".count // 1

"👫 Héllœ 👨‍👩‍👧‍👧".emojiScalars // [128107, 128104, 8205, 128105, 8205, 128103, 8205, 128103]
"👫 Héllœ 👨‍👩‍👧‍👧".emojis // ["👫", "👨‍👩‍👧‍👧"]
"👫 Héllœ 👨‍👩‍👧‍👧".emojis.count // 2

"👫👨‍👩‍👧‍👧👨‍👨‍👦".isSingleEmoji // false
"👫👨‍👩‍👧‍👧👨‍👨‍👦".containsOnlyEmoji // true

For older Swift versions, check out this gist containing my old code.

回答2:

The simplest, cleanest, and swiftiest way to accomplish this is to simply check the Unicode code points for each character in the string against known emoji and dingbats ranges, like so:

extension String {

    var containsEmoji: Bool {
        for scalar in unicodeScalars {
            switch scalar.value {
            case 0x1F600...0x1F64F, // Emoticons
                 0x1F300...0x1F5FF, // Misc Symbols and Pictographs
                 0x1F680...0x1F6FF, // Transport and Map
                 0x2600...0x26FF,   // Misc symbols
                 0x2700...0x27BF,   // Dingbats
                 0xFE00...0xFE0F,   // Variation Selectors
                 0x1F900...0x1F9FF, // Supplemental Symbols and Pictographs
                 0x1F1E6...0x1F1FF: // Flags
                return true
            default:
                continue
            }
        }
        return false
    }

}

回答3:

extension String {
    func containsEmoji() -> Bool {
        for scalar in unicodeScalars {
            switch scalar.value {
            case 0x3030, 0x00AE, 0x00A9,// Special Characters
            0x1D000...0x1F77F,          // Emoticons
            0x2100...0x27BF,            // Misc symbols and Dingbats
            0xFE00...0xFE0F,            // Variation Selectors
            0x1F900...0x1F9FF:          // Supplemental Symbols and Pictographs
                return true
            default:
                continue
            }
        }
        return false
    }
}

This is my fix, with updated ranges.

回答4:

Swift 5.0

… introduced a new way of checking exactly this!

You have to break your String into its Scalars. Each Scalar has a Property value which supports the isEmoji value!

Actually you can even check if the Scalar is a Emoji modifier or more. Check out Apple's documentation: https://developer.apple.com/documentation/swift/unicode/scalar/properties

You may want to consider checking for isEmojiPresentation instead of isEmoji, because Apple states the following for isEmoji:

This property is true for scalars that are rendered as emoji by default and also for scalars that have a non-default emoji rendering when followed by U+FE0F VARIATION SELECTOR-16. This includes some scalars that are not typically considered to be emoji.

This way actually splits up Emoji's into all the modifiers, but it is way simpler to handle. And as Swift now counts Emoji's with modifiers (e.g.: 👨‍👩‍👧‍👦, 👨🏻‍💻, 🏴) as 1 you can do all kind of stuff.

var string = "🤓 test"

for scalar in string.unicodeScalars {
    let isEmoji = scalar.properties.isEmoji

    print("\(scalar.description) \(isEmoji)"))
}

// 🤓 true
//   false
// t false
// e false
// s false
// t false

NSHipster points out an interesting way to get all Emoji's:

import Foundation

var emoji = CharacterSet()

for codePoint in 0x0000...0x1F0000 {
    guard let scalarValue = Unicode.Scalar(codePoint) else {
        continue
    }

    // Implemented in Swift 5 (SE-0221)
    // https://github.com/apple/swift-evolution/blob/master/proposals/0221-character-properties.md
    if scalarValue.properties.isEmoji {
        emoji.insert(scalarValue)
    }
}

回答5:

Swift 3 Note:

It appears the cnui_containsEmojiCharacters method has either been removed or moved to a different dynamic library. _containsEmoji should still work though.

let str: NSString = "hello😊"

@objc protocol NSStringPrivate {
    func _containsEmoji() -> ObjCBool
}

let strPrivate = unsafeBitCast(str, to: NSStringPrivate.self)
strPrivate._containsEmoji() // true
str.value(forKey: "_containsEmoji") // 1


let swiftStr = "hello😊"
(swiftStr as AnyObject).value(forKey: "_containsEmoji") // 1

Swift 2.x:

I recently discovered a private API on NSString which exposes functionality for detecting if a string contains an Emoji character:

let str: NSString = "hello😊"

With an objc protocol and unsafeBitCast:

@objc protocol NSStringPrivate {
    func cnui_containsEmojiCharacters() -> ObjCBool
    func _containsEmoji() -> ObjCBool
}

let strPrivate = unsafeBitCast(str, NSStringPrivate.self)
strPrivate.cnui_containsEmojiCharacters() // true
strPrivate._containsEmoji() // true

With valueForKey:

str.valueForKey("cnui_containsEmojiCharacters") // 1
str.valueForKey("_containsEmoji") // 1

With a pure Swift string, you must cast the string as AnyObject before using valueForKey:

let str = "hello😊"

(str as AnyObject).valueForKey("cnui_containsEmojiCharacters") // 1
(str as AnyObject).valueForKey("_containsEmoji") // 1

Methods found in the NSString header file.

回答6:

You can use this code example or this pod.

To use it in Swift, import the category into the YourProject_Bridging_Header

#import "NSString+EMOEmoji.h"

Then you can check the range for every emoji in your String:

let example: NSString = "string👨‍👨‍👧‍👧with😍emojis✊🏿" //string with emojis

let containsEmoji: Bool = example.emo_containsEmoji()

    print(containsEmoji)

// Output: ["true"]

I created an small example project with the code above.

回答7:

For Swift 3.0.2, the following answer is the simplest one:

class func stringContainsEmoji (string : NSString) -> Bool
{
    var returnValue: Bool = false

    string.enumerateSubstrings(in: NSMakeRange(0, (string as NSString).length), options: NSString.EnumerationOptions.byComposedCharacterSequences) { (substring, substringRange, enclosingRange, stop) -> () in

        let objCString:NSString = NSString(string:substring!)
        let hs: unichar = objCString.character(at: 0)
        if 0xd800 <= hs && hs <= 0xdbff
        {
            if objCString.length > 1
            {
                let ls: unichar = objCString.character(at: 1)
                let step1: Int = Int((hs - 0xd800) * 0x400)
                let step2: Int = Int(ls - 0xdc00)
                let uc: Int = Int(step1 + step2 + 0x10000)

                if 0x1d000 <= uc && uc <= 0x1f77f
                {
                    returnValue = true
                }
            }
        }
        else if objCString.length > 1
        {
            let ls: unichar = objCString.character(at: 1)
            if ls == 0x20e3
            {
                returnValue = true
            }
        }
        else
        {
            if 0x2100 <= hs && hs <= 0x27ff
            {
                returnValue = true
            }
            else if 0x2b05 <= hs && hs <= 0x2b07
            {
                returnValue = true
            }
            else if 0x2934 <= hs && hs <= 0x2935
            {
                returnValue = true
            }
            else if 0x3297 <= hs && hs <= 0x3299
            {
                returnValue = true
            }
            else if hs == 0xa9 || hs == 0xae || hs == 0x303d || hs == 0x3030 || hs == 0x2b55 || hs == 0x2b1c || hs == 0x2b1b || hs == 0x2b50
            {
                returnValue = true
            }
        }
    }

    return returnValue;
}

回答8:

The absolutely similar answer to those that wrote before me, but with updated set of emoji scalars.

extension String {
    func isContainEmoji() -> Bool {
        let isContain = unicodeScalars.first(where: { $0.isEmoji }) != nil
        return isContain
    }
}


extension UnicodeScalar {

    var isEmoji: Bool {
        switch value {
        case 0x1F600...0x1F64F,
             0x1F300...0x1F5FF,
             0x1F680...0x1F6FF,
             0x1F1E6...0x1F1FF,
             0x2600...0x26FF,
             0x2700...0x27BF,
             0xFE00...0xFE0F,
             0x1F900...0x1F9FF,
             65024...65039,
             8400...8447,
             9100...9300,
             127000...127600:
            return true
        default:
            return false
        }
    }

}

回答9:

With Swift 5 you can now inspect the unicode properties of each character in your string. This gives us the convenient isEmoji variable on each letter. The problem is isEmoji will return true for any character that can be converted into a 2-byte emoji, such as 0-9.

We can look at the variable isEmoji and also check the for the presence of an emoji modifier to determine if the ambiguous characters will display as an emoji.

This solution should be much more future proof than the regex solutions offered here.

extension String {
    func containsOnlyEmojis() -> Bool {
        if count == 0 {
            return false
        }
        for character in self {
            if !character.isEmoji {
                return false
            }
        }
        return true
    }

    func containsEmoji() -> Bool {
        for character in self {
            if character.isEmoji {
                return true
            }
        }
        return false
    }
}

extension Character {
    // An emoji can either be a 2 byte unicode character or a normal UTF8 character with an emoji modifier
    // appended as is the case with 3️⃣. 0x238C is the first instance of UTF16 emoji that requires no modifier.
    // `isEmoji` will evaluate to true for any character that can be turned into an emoji by adding a modifier
    // such as the digit "3". To avoid this we confirm that any character below 0x238C has an emoji modifier attached
    var isEmoji: Bool {
        guard let scalar = unicodeScalars.first else { return false }
        return scalar.properties.isEmoji && (scalar.value > 0x238C || unicodeScalars.count > 1)
    }
}

Giving us

"hey".containsEmoji() //false

"Hello World 😎".containsEmoji() //true
"Hello World 😎".containsOnlyEmojis() //false

"😎".containsEmoji() //true
"😎".containsOnlyEmojis() //true

回答10:

You can use NSString-RemoveEmoji like this:

if string.isIncludingEmoji {

}

回答11:

Future Proof: Manually check the character's pixels; the other solutions will break (and have broken) as new emojis are added.

Note: This is Objective-C (can be converted to Swift)

Over the years these emoji-detecting solutions keep breaking as Apple adds new emojis w/ new methods (like skin-toned emojis built by pre-cursing a character with an additional character), etc.

I finally broke down and just wrote the following method which works for all current emojis and should work for all future emojis.

The solution creates a UILabel with the character and a black background. CG then takes a snapshot of the label and I scan all pixels in the snapshot for any non solid-black pixels. The reason I add the black background is to avoid issues of false-coloring due to Subpixel Rendering

The solution runs VERY fast on my device, I can check hundreds of characters a second, but it should be noted that this is a CoreGraphics solution and should not be used heavily like you could with a regular text method. Graphics processing is data heavy so checking thousands of characters at once could result in noticeable lag.

-(BOOL)isEmoji:(NSString *)character {

    UILabel *characterRender = [[UILabel alloc] initWithFrame:CGRectMake(0, 0, 1, 1)];
    characterRender.text = character;
    characterRender.font = [UIFont fontWithName:@"AppleColorEmoji" size:12.0f];//Note: Size 12 font is likely not crucial for this and the detector will probably still work at an even smaller font size, so if you needed to speed this checker up for serious performance you may test lowering this to a font size like 6.0
    characterRender.backgroundColor = [UIColor blackColor];//needed to remove subpixel rendering colors
    [characterRender sizeToFit];

    CGRect rect = [characterRender bounds];
    UIGraphicsBeginImageContextWithOptions(rect.size,YES,0.0f);
    CGContextRef contextSnap = UIGraphicsGetCurrentContext();
    [characterRender.layer renderInContext:contextSnap];
    UIImage *capturedImage = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();

    CGImageRef imageRef = [capturedImage CGImage];
    NSUInteger width = CGImageGetWidth(imageRef);
    NSUInteger height = CGImageGetHeight(imageRef);
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    unsigned char *rawData = (unsigned char*) calloc(height * width * 4, sizeof(unsigned char));
    NSUInteger bytesPerPixel = 4;//Note: Alpha Channel not really needed, if you need to speed this up for serious performance you can refactor this pixel scanner to just RGB
    NSUInteger bytesPerRow = bytesPerPixel * width;
    NSUInteger bitsPerComponent = 8;
    CGContextRef context = CGBitmapContextCreate(rawData, width, height,
                                                 bitsPerComponent, bytesPerRow, colorSpace,
                                                 kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
    CGColorSpaceRelease(colorSpace);

    CGContextDrawImage(context, CGRectMake(0, 0, width, height), imageRef);
    CGContextRelease(context);

    BOOL colorPixelFound = NO;

    int x = 0;
    int y = 0;
    while (y < height && !colorPixelFound) {
        while (x < width && !colorPixelFound) {

            NSUInteger byteIndex = (bytesPerRow * y) + x * bytesPerPixel;

            CGFloat red = (CGFloat)rawData[byteIndex];
            CGFloat green = (CGFloat)rawData[byteIndex+1];
            CGFloat blue = (CGFloat)rawData[byteIndex+2];

            CGFloat h, s, b, a;
            UIColor *c = [UIColor colorWithRed:red green:green blue:blue alpha:1.0f];
            [c getHue:&h saturation:&s brightness:&b alpha:&a];//Note: I wrote this method years ago, can't remember why I check HSB instead of just checking r,g,b==0; Upon further review this step might not be needed, but I haven't tested to confirm yet. 

            b /= 255.0f;

            if (b > 0) {
                colorPixelFound = YES;
            }

            x++;
        }
        x=0;
        y++;
    }

    return colorPixelFound;

}

回答12:

i had the same problem and ended up making a String and Character extensions.

The code is too long to post as it actually lists all emojis (from the official unicode list v5.0) in a CharacterSet you can find it here:

https://github.com/piterwilson/StringEmoji

Constants

let emojiCharacterSet: CharacterSet

Character set containing all known emoji (as described in official Unicode List 5.0 http://unicode.org/emoji/charts-5.0/emoji-list.html)

String

var isEmoji: Bool { get }

Whether or not the String instance represents a known single Emoji character

print("".isEmoji) // false
print("😁".isEmoji) // true
print("😁😜".isEmoji) // false (String is not a single Emoji)

var containsEmoji: Bool { get }

Whether or not the String instance contains a known Emoji character

print("".containsEmoji) // false
print("😁".containsEmoji) // true
print("😁😜".containsEmoji) // true

var unicodeName: String { get }

Applies a kCFStringTransformToUnicodeName - CFStringTransform on a copy of the String

print("á".unicodeName) // \N{LATIN SMALL LETTER A WITH ACUTE}
print("😜".unicodeName) // "\N{FACE WITH STUCK-OUT TONGUE AND WINKING EYE}"

var niceUnicodeName: String { get }

Returns the result of a kCFStringTransformToUnicodeName - CFStringTransform with \N{ prefixes and } suffixes removed

print("á".unicodeName) // LATIN SMALL LETTER A WITH ACUTE
print("😜".unicodeName) // FACE WITH STUCK-OUT TONGUE AND WINKING EYE

Character