PHP : writing a simple removeEmoji function

一世执手 提交于 2019-11-26 15:12:08
sglessard

I think the preg_replace function is the simpliest solution.

As EaterOfCode suggests, I read the wiki page and coded new regex since none of SO (or other websites) answers seemed to work for Instagram photo captions (API returning format) . Note: /u identifier is mandatory to match \x unicode chars.

public static function removeEmoji($text) {

    $clean_text = "";

    // Match Emoticons
    $regexEmoticons = '/[\x{1F600}-\x{1F64F}]/u';
    $clean_text = preg_replace($regexEmoticons, '', $text);

    // Match Miscellaneous Symbols and Pictographs
    $regexSymbols = '/[\x{1F300}-\x{1F5FF}]/u';
    $clean_text = preg_replace($regexSymbols, '', $clean_text);

    // Match Transport And Map Symbols
    $regexTransport = '/[\x{1F680}-\x{1F6FF}]/u';
    $clean_text = preg_replace($regexTransport, '', $clean_text);

    // Match Miscellaneous Symbols
    $regexMisc = '/[\x{2600}-\x{26FF}]/u';
    $clean_text = preg_replace($regexMisc, '', $clean_text);

    // Match Dingbats
    $regexDingbats = '/[\x{2700}-\x{27BF}]/u';
    $clean_text = preg_replace($regexDingbats, '', $clean_text);

    return $clean_text;
}

The function does not remove all emojis since there are many more, but you get the point.

Please refer to unicode.org - full emoji list (thanks Epoc)

As apple continues to add emojis to new versions of ios, i will be updating and maintaining this answer.

This answer has been updated for ios 12.1. If you have problems, then please check the edit history for previous versions of this answer (having multiple regex in this answer exceeds SO's max post body length)

Beta Version for ios 12.1 (Nov, 2018)

public static function removeEmoji($string)
    return preg_replace('/[\x{1F3F4}](?:\x{E0067}\x{E0062}\x{E0077}\x{E006C}\x{E0073}\x{E007F})|[\x{1F3F4}](?:\x{E0067}\x{E0062}\x{E0073}\x{E0063}\x{E0074}\x{E007F})|[\x{1F3F4}](?:\x{E0067}\x{E0062}\x{E0065}\x{E006E}\x{E0067}\x{E007F})|[\x{1F3F4}](?:\x{200D}\x{2620}\x{FE0F})|[\x{1F3F3}](?:\x{FE0F}\x{200D}\x{1F308})|[\x{0023}\x{002A}\x{0030}\x{0031}\x{0032}\x{0033}\x{0034}\x{0035}\x{0036}\x{0037}\x{0038}\x{0039}](?:\x{FE0F}\x{20E3})|[\x{1F415}](?:\x{200D}\x{1F9BA})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F467}\x{200D}\x{1F467})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F467}\x{200D}\x{1F466})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F467})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F466}\x{200D}\x{1F466})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F466})|[\x{1F468}](?:\x{200D}\x{1F468}\x{200D}\x{1F467}\x{200D}\x{1F467})|[\x{1F468}](?:\x{200D}\x{1F468}\x{200D}\x{1F466}\x{200D}\x{1F466})|[\x{1F468}](?:\x{200D}\x{1F468}\x{200D}\x{1F467}\x{200D}\x{1F466})|[\x{1F468}](?:\x{200D}\x{1F468}\x{200D}\x{1F467})|[\x{1F468}](?:\x{200D}\x{1F468}\x{200D}\x{1F466})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F469}\x{200D}\x{1F467}\x{200D}\x{1F467})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F469}\x{200D}\x{1F466}\x{200D}\x{1F466})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F469}\x{200D}\x{1F467}\x{200D}\x{1F466})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F469}\x{200D}\x{1F467})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F469}\x{200D}\x{1F466})|[\x{1F469}](?:\x{200D}\x{2764}\x{FE0F}\x{200D}\x{1F469})|[\x{1F469}\x{1F468}](?:\x{200D}\x{2764}\x{FE0F}\x{200D}\x{1F468})|[\x{1F469}](?:\x{200D}\x{2764}\x{FE0F}\x{200D}\x{1F48B}\x{200D}\x{1F469})|[\x{1F469}\x{1F468}](?:\x{200D}\x{2764}\x{FE0F}\x{200D}\x{1F48B}\x{200D}\x{1F468})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F9BD})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F9BC})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F9AF})|[\x{1F575}\x{1F3CC}\x{26F9}\x{1F3CB}](?:\x{FE0F}\x{200D}\x{2640}\x{FE0F})|[\x{1F575}\x{1F3CC}\x{26F9}\x{1F3CB}](?:\x{FE0F}\x{200D}\x{2642}\x{FE0F})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F692})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F680})|[\x{1F468}\x{1F469}](?:\x{200D}\x{2708}\x{FE0F})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F3A8})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F3A4})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F4BB})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F52C})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F4BC})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F3ED})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F527})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F373})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F33E})|[\x{1F468}\x{1F469}](?:\x{200D}\x{2696}\x{FE0F})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F3EB})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F393})|[\x{1F468}\x{1F469}](?:\x{200D}\x{2695}\x{FE0F})|[\x{1F471}\x{1F64D}\x{1F64E}\x{1F645}\x{1F646}\x{1F481}\x{1F64B}\x{1F9CF}\x{1F647}\x{1F926}\x{1F937}\x{1F46E}\x{1F482}\x{1F477}\x{1F473}\x{1F9B8}\x{1F9B9}\x{1F9D9}\x{1F9DA}\x{1F9DB}\x{1F9DC}\x{1F9DD}\x{1F9DE}\x{1F9DF}\x{1F486}\x{1F487}\x{1F6B6}\x{1F9CD}\x{1F9CE}\x{1F3C3}\x{1F46F}\x{1F9D6}\x{1F9D7}\x{1F3C4}\x{1F6A3}\x{1F3CA}\x{1F6B4}\x{1F6B5}\x{1F938}\x{1F93C}\x{1F93D}\x{1F93E}\x{1F939}\x{1F9D8}](?:\x{200D}\x{2640}\x{FE0F})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F9B2})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F9B3})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F9B1})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F9B0})|[\x{1F471}\x{1F64D}\x{1F64E}\x{1F645}\x{1F646}\x{1F481}\x{1F64B}\x{1F9CF}\x{1F647}\x{1F926}\x{1F937}\x{1F46E}\x{1F482}\x{1F477}\x{1F473}\x{1F9B8}\x{1F9B9}\x{1F9D9}\x{1F9DA}\x{1F9DB}\x{1F9DC}\x{1F9DD}\x{1F9DE}\x{1F9DF}\x{1F486}\x{1F487}\x{1F6B6}\x{1F9CD}\x{1F9CE}\x{1F3C3}\x{1F46F}\x{1F9D6}\x{1F9D7}\x{1F3C4}\x{1F6A3}\x{1F3CA}\x{1F6B4}\x{1F6B5}\x{1F938}\x{1F93C}\x{1F93D}\x{1F93E}\x{1F939}\x{1F9D8}](?:\x{200D}\x{2642}\x{FE0F})|[\x{1F441}](?:\x{FE0F}\x{200D}\x{1F5E8}\x{FE0F})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1E9}\x{1F1F0}\x{1F1F2}\x{1F1F3}\x{1F1F8}\x{1F1F9}\x{1F1FA}](?:\x{1F1FF})|[\x{1F1E7}\x{1F1E8}\x{1F1EC}\x{1F1F0}\x{1F1F1}\x{1F1F2}\x{1F1F5}\x{1F1F8}\x{1F1FA}](?:\x{1F1FE})|[\x{1F1E6}\x{1F1E8}\x{1F1F2}\x{1F1F8}](?:\x{1F1FD})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1EC}\x{1F1F0}\x{1F1F2}\x{1F1F5}\x{1F1F7}\x{1F1F9}\x{1F1FF}](?:\x{1F1FC})|[\x{1F1E7}\x{1F1E8}\x{1F1F1}\x{1F1F2}\x{1F1F8}\x{1F1F9}](?:\x{1F1FB})|[\x{1F1E6}\x{1F1E8}\x{1F1EA}\x{1F1EC}\x{1F1ED}\x{1F1F1}\x{1F1F2}\x{1F1F3}\x{1F1F7}\x{1F1FB}](?:\x{1F1FA})|[\x{1F1E6}\x{1F1E7}\x{1F1EA}\x{1F1EC}\x{1F1ED}\x{1F1EE}\x{1F1F1}\x{1F1F2}\x{1F1F5}\x{1F1F8}\x{1F1F9}\x{1F1FE}](?:\x{1F1F9})|[\x{1F1E6}\x{1F1E7}\x{1F1EA}\x{1F1EC}\x{1F1EE}\x{1F1F1}\x{1F1F2}\x{1F1F5}\x{1F1F7}\x{1F1F8}\x{1F1FA}\x{1F1FC}](?:\x{1F1F8})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1EA}\x{1F1EB}\x{1F1EC}\x{1F1ED}\x{1F1EE}\x{1F1F0}\x{1F1F1}\x{1F1F2}\x{1F1F3}\x{1F1F5}\x{1F1F8}\x{1F1F9}](?:\x{1F1F7})|[\x{1F1E6}\x{1F1E7}\x{1F1EC}\x{1F1EE}\x{1F1F2}](?:\x{1F1F6})|[\x{1F1E8}\x{1F1EC}\x{1F1EF}\x{1F1F0}\x{1F1F2}\x{1F1F3}](?:\x{1F1F5})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1E9}\x{1F1EB}\x{1F1EE}\x{1F1EF}\x{1F1F2}\x{1F1F3}\x{1F1F7}\x{1F1F8}\x{1F1F9}](?:\x{1F1F4})|[\x{1F1E7}\x{1F1E8}\x{1F1EC}\x{1F1ED}\x{1F1EE}\x{1F1F0}\x{1F1F2}\x{1F1F5}\x{1F1F8}\x{1F1F9}\x{1F1FA}\x{1F1FB}](?:\x{1F1F3})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1E9}\x{1F1EB}\x{1F1EC}\x{1F1ED}\x{1F1EE}\x{1F1EF}\x{1F1F0}\x{1F1F2}\x{1F1F4}\x{1F1F5}\x{1F1F8}\x{1F1F9}\x{1F1FA}\x{1F1FF}](?:\x{1F1F2})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1EC}\x{1F1EE}\x{1F1F2}\x{1F1F3}\x{1F1F5}\x{1F1F8}\x{1F1F9}](?:\x{1F1F1})|[\x{1F1E8}\x{1F1E9}\x{1F1EB}\x{1F1ED}\x{1F1F1}\x{1F1F2}\x{1F1F5}\x{1F1F8}\x{1F1F9}\x{1F1FD}](?:\x{1F1F0})|[\x{1F1E7}\x{1F1E9}\x{1F1EB}\x{1F1F8}\x{1F1F9}](?:\x{1F1EF})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1EB}\x{1F1EC}\x{1F1F0}\x{1F1F1}\x{1F1F3}\x{1F1F8}\x{1F1FB}](?:\x{1F1EE})|[\x{1F1E7}\x{1F1E8}\x{1F1EA}\x{1F1EC}\x{1F1F0}\x{1F1F2}\x{1F1F5}\x{1F1F8}\x{1F1F9}](?:\x{1F1ED})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1E9}\x{1F1EA}\x{1F1EC}\x{1F1F0}\x{1F1F2}\x{1F1F3}\x{1F1F5}\x{1F1F8}\x{1F1F9}\x{1F1FA}\x{1F1FB}](?:\x{1F1EC})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1EC}\x{1F1F2}\x{1F1F3}\x{1F1F5}\x{1F1F9}\x{1F1FC}](?:\x{1F1EB})|[\x{1F1E6}\x{1F1E7}\x{1F1E9}\x{1F1EA}\x{1F1EC}\x{1F1EE}\x{1F1EF}\x{1F1F0}\x{1F1F2}\x{1F1F3}\x{1F1F5}\x{1F1F7}\x{1F1F8}\x{1F1FB}\x{1F1FE}](?:\x{1F1EA})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1EC}\x{1F1EE}\x{1F1F2}\x{1F1F8}\x{1F1F9}](?:\x{1F1E9})|[\x{1F1E6}\x{1F1E8}\x{1F1EA}\x{1F1EE}\x{1F1F1}\x{1F1F2}\x{1F1F3}\x{1F1F8}\x{1F1F9}\x{1F1FB}](?:\x{1F1E8})|[\x{1F1E7}\x{1F1EC}\x{1F1F1}\x{1F1F8}](?:\x{1F1E7})|[\x{1F1E7}\x{1F1E8}\x{1F1EA}\x{1F1EC}\x{1F1F1}\x{1F1F2}\x{1F1F3}\x{1F1F5}\x{1F1F6}\x{1F1F8}\x{1F1F9}\x{1F1FA}\x{1F1FB}\x{1F1FF}](?:\x{1F1E6})|[\x{00A9}\x{00AE}\x{203C}\x{2049}\x{2122}\x{2139}\x{2194}-\x{2199}\x{21A9}-\x{21AA}\x{231A}-\x{231B}\x{2328}\x{23CF}\x{23E9}-\x{23F3}\x{23F8}-\x{23FA}\x{24C2}\x{25AA}-\x{25AB}\x{25B6}\x{25C0}\x{25FB}-\x{25FE}\x{2600}-\x{2604}\x{260E}\x{2611}\x{2614}-\x{2615}\x{2618}\x{261D}\x{2620}\x{2622}-\x{2623}\x{2626}\x{262A}\x{262E}-\x{262F}\x{2638}-\x{263A}\x{2640}\x{2642}\x{2648}-\x{2653}\x{265F}-\x{2660}\x{2663}\x{2665}-\x{2666}\x{2668}\x{267B}\x{267E}-\x{267F}\x{2692}-\x{2697}\x{2699}\x{269B}-\x{269C}\x{26A0}-\x{26A1}\x{26AA}-\x{26AB}\x{26B0}-\x{26B1}\x{26BD}-\x{26BE}\x{26C4}-\x{26C5}\x{26C8}\x{26CE}-\x{26CF}\x{26D1}\x{26D3}-\x{26D4}\x{26E9}-\x{26EA}\x{26F0}-\x{26F5}\x{26F7}-\x{26FA}\x{26FD}\x{2702}\x{2705}\x{2708}-\x{270D}\x{270F}\x{2712}\x{2714}\x{2716}\x{271D}\x{2721}\x{2728}\x{2733}-\x{2734}\x{2744}\x{2747}\x{274C}\x{274E}\x{2753}-\x{2755}\x{2757}\x{2763}-\x{2764}\x{2795}-\x{2797}\x{27A1}\x{27B0}\x{27BF}\x{2934}-\x{2935}\x{2B05}-\x{2B07}\x{2B1B}-\x{2B1C}\x{2B50}\x{2B55}\x{3030}\x{303D}\x{3297}\x{3299}\x{1F004}\x{1F0CF}\x{1F170}-\x{1F171}\x{1F17E}-\x{1F17F}\x{1F18E}\x{1F191}-\x{1F19A}\x{1F201}-\x{1F202}\x{1F21A}\x{1F22F}\x{1F232}-\x{1F23A}\x{1F250}-\x{1F251}\x{1F300}-\x{1F321}\x{1F324}-\x{1F393}\x{1F396}-\x{1F397}\x{1F399}-\x{1F39B}\x{1F39E}-\x{1F3F0}\x{1F3F3}-\x{1F3F5}\x{1F3F7}-\x{1F3FA}\x{1F400}-\x{1F4FD}\x{1F4FF}-\x{1F53D}\x{1F549}-\x{1F54E}\x{1F550}-\x{1F567}\x{1F56F}-\x{1F570}\x{1F573}-\x{1F57A}\x{1F587}\x{1F58A}-\x{1F58D}\x{1F590}\x{1F595}-\x{1F596}\x{1F5A4}-\x{1F5A5}\x{1F5A8}\x{1F5B1}-\x{1F5B2}\x{1F5BC}\x{1F5C2}-\x{1F5C4}\x{1F5D1}-\x{1F5D3}\x{1F5DC}-\x{1F5DE}\x{1F5E1}\x{1F5E3}\x{1F5E8}\x{1F5EF}\x{1F5F3}\x{1F5FA}-\x{1F64F}\x{1F680}-\x{1F6C5}\x{1F6CB}-\x{1F6D2}\x{1F6D5}\x{1F6E0}-\x{1F6E5}\x{1F6E9}\x{1F6EB}-\x{1F6EC}\x{1F6F0}\x{1F6F3}-\x{1F6FA}\x{1F7E0}-\x{1F7EB}\x{1F90D}-\x{1F93A}\x{1F93C}-\x{1F945}\x{1F947}-\x{1F971}\x{1F973}-\x{1F976}\x{1F97A}-\x{1F9A2}\x{1F9A5}-\x{1F9AA}\x{1F9AE}-\x{1F9CA}\x{1F9CD}-\x{1F9FF}\x{1FA70}-\x{1FA73}\x{1FA78}-\x{1FA7A}\x{1FA80}-\x{1FA82}\x{1FA90}-\x{1FA95}]/u', '', $string);
}

Updated the correct answer with more codes, just a few emojis are left.

public static function removeEmoji($text) {

    $clean_text = "";

    // Match Emoticons
    $regexEmoticons = '/[\x{1F600}-\x{1F64F}]/u';
    $clean_text = preg_replace($regexEmoticons, '', $text);

    // Match Miscellaneous Symbols and Pictographs
    $regexSymbols = '/[\x{1F300}-\x{1F5FF}]/u';
    $clean_text = preg_replace($regexSymbols, '', $clean_text);

    // Match Transport And Map Symbols
    $regexTransport = '/[\x{1F680}-\x{1F6FF}]/u';
    $clean_text = preg_replace($regexTransport, '', $clean_text);

    // Match Miscellaneous Symbols
    $regexMisc = '/[\x{2600}-\x{26FF}]/u';
    $clean_text = preg_replace($regexMisc, '', $clean_text);

    // Match Dingbats
    $regexDingbats = '/[\x{2700}-\x{27BF}]/u';
    $clean_text = preg_replace($regexDingbats, '', $clean_text);

    // Match Flags
    $regexDingbats = '/[\x{1F1E6}-\x{1F1FF}]/u';
    $clean_text = preg_replace($regexDingbats, '', $clean_text);

    // Others
    $regexDingbats = '/[\x{1F910}-\x{1F95E}]/u';
    $clean_text = preg_replace($regexDingbats, '', $clean_text);

    $regexDingbats = '/[\x{1F980}-\x{1F991}]/u';
    $clean_text = preg_replace($regexDingbats, '', $clean_text);

    $regexDingbats = '/[\x{1F9C0}]/u';
    $clean_text = preg_replace($regexDingbats, '', $clean_text);

    $regexDingbats = '/[\x{1F9F9}]/u';
    $clean_text = preg_replace($regexDingbats, '', $clean_text);

    return $clean_text;
}

I developed a funtcion using the parser from UTF-8 for ISO-8859-1 in php ( who returns a ? character for invalid characters in conversion ).

function removeEmojis( $string ) {
    $string = str_replace( "?", "{%}", $string );
    $string  = mb_convert_encoding( $string, "ISO-8859-1", "UTF-8" );
    $string  = mb_convert_encoding( $string, "UTF-8", "ISO-8859-1" );
    $string  = str_replace( array( "?", "? ", " ?" ), array(""), $string );
    $string  = str_replace( "{%}", "?", $string );
    return trim( $string );
}

Explanation:

  • convert the string from utf-8 to iso-8859-1

  • return back to utf-8 (mb_ function replace invalid characters to ''?''remove non-valid characters )

  • Replace ? to none

  • Return back the ''?'' character from the original string

Make sure you are using UTF-8 to work.

We had a really long fight with emojis at my work, we found a few regex for this problem but none of them worked. This one is working:

Edit: This does not cover ALL the emojis. I'm still searching for the Holy Grail of Emoji Regexp, but not found it yet.

return preg_replace('/([0-9|#][\x{20E3}])|[\x{00ae}\x{00a9}\x{203C}\x{2047}\x{2048}\x{2049}\x{3030}\x{303D}\x{2139}\x{2122}\x{3297}\x{3299}][\x{FE00}-\x{FEFF}]?|[\x{2190}-\x{21FF}][\x{FE00}-\x{FEFF}]?|[\x{2300}-\x{23FF}][\x{FE00}-\x{FEFF}]?|[\x{2460}-\x{24FF}][\x{FE00}-\x{FEFF}]?|[\x{25A0}-\x{25FF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{FE00}-\x{FEFF}]?|[\x{2900}-\x{297F}][\x{FE00}-\x{FEFF}]?|[\x{2B00}-\x{2BF0}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F6FF}][\x{FE00}-\x{FEFF}]?/u', '', $text);

Since Emoji characters use the private use area of unicode, you could use preg_replace() to remove that entire region of encoded characters from U+E000 to U+F8FF.

function removeEmoji($string) {
    return preg_replace('/&#x(e[0-9a-f][0-9a-f][0-9a-f]|f[0-8][0-9a-f][0-9a-f])/i', '', $string);
}
function emojiFilter($text){
$text = json_encode($text);
preg_match_all("/(\\\\ud83c\\\\u[0-9a-f]{4})|(\\\\ud83d\\\u[0-9a-f]{4})|(\\\\u[0-9a-f]{4})/", $text, $matchs);
if(!isset($matchs[0][0])) { return json_decode($text, true); }

$emoji = $matchs[0];
foreach($emoji as $ec) {
    $hex = substr($ec, -4);
    if(strlen($ec)==6) {
        if($hex>='2600' and $hex<='27ff') {
            $text = str_replace($ec, '', $text);
        }
    } else {
        if($hex>='dc00' and $hex<='dfff') {
            $text = str_replace($ec, '', $text);
        }
    }
}

return json_decode($text, true);  }

I have solved this issue by using the same code WordPress uses to replace emojis by images

here is the code that I used and it worked perfectly as it has a comprehensive list of the most used emojis

The full code exists here https://pastebin.com/8MqGdD6p

here is how it works but make sure to copy the code from pastebin as this is the non-complete code

$content ='<span class="do">⚫</span> where emojis exist';
$partials = array('&#x1f469;&#x200d;); // the list of emojis 

foreach ( $partials as $emojum ) {
    if ( version_compare( phpversion(), '5.4', '<' ) ) {
        $emoji_char = html_entity_decode( $emojum, ENT_COMPAT, 'UTF-8' );
    } else {
        $emoji_char = html_entity_decode( $emojum );
    }
    if ( false !== strpos( $content, $emoji_char ) ) {
        $content = preg_replace( "/$emoji_char/", '', $content );
    }
}

@sglessard since the code is outdated, here the full list of all Emoji for 07/12/2018 You will be able to generate it, by running the source code i posted

Please let me know if you find any kind of issue, thank you.

public static function removeEmoji($text) {
    $regexEmoticons = [
        '/[\x{0023}]/u',
        '/[\x{002A}]/u',
        '/[\x{00A9}]/u',
        '/[\x{00AE}]/u',
        '/[\x{200D}]/u',
        '/[\x{203C}]/u',
        '/[\x{2049}]/u',
        '/[\x{20E3}]/u',
        '/[\x{2122}]/u',
        '/[\x{2139}]/u',
        '/[\x{2194}-\x{2199}]/u',
        '/[\x{21A9}-\x{21AA}]/u',
        '/[\x{231A}-\x{231B}]/u',
        '/[\x{2328}]/u',
        '/[\x{23CF}]/u',
        '/[\x{23E9}-\x{23F3}]/u',
        '/[\x{23F8}-\x{23FA}]/u',
        '/[\x{24C2}]/u',
        '/[\x{25AA}-\x{25AB}]/u',
        '/[\x{25B6}]/u',
        '/[\x{25C0}]/u',
        '/[\x{25FB}-\x{25FE}]/u',
        '/[\x{2600}-\x{2604}]/u',
        '/[\x{260E}]/u',
        '/[\x{2611}]/u',
        '/[\x{2614}-\x{2615}]/u',
        '/[\x{2618}]/u',
        '/[\x{261D}]/u',
        '/[\x{2620}]/u',
        '/[\x{2622}-\x{2623}]/u',
        '/[\x{2626}]/u',
        '/[\x{262A}]/u',
        '/[\x{262E}-\x{262F}]/u',
        '/[\x{2638}-\x{263A}]/u',
        '/[\x{2640}]/u',
        '/[\x{2642}]/u',
        '/[\x{2648}-\x{2653}]/u',
        '/[\x{265F}-\x{2660}]/u',
        '/[\x{2663}]/u',
        '/[\x{2665}-\x{2666}]/u',
        '/[\x{2668}]/u',
        '/[\x{267B}]/u',
        '/[\x{267E}-\x{267F}]/u',
        '/[\x{2692}-\x{2697}]/u',
        '/[\x{2699}]/u',
        '/[\x{269B}-\x{269C}]/u',
        '/[\x{26A0}-\x{26A1}]/u',
        '/[\x{26AA}-\x{26AB}]/u',
        '/[\x{26B0}-\x{26B1}]/u',
        '/[\x{26BD}-\x{26BE}]/u',
        '/[\x{26C4}-\x{26C5}]/u',
        '/[\x{26C8}]/u',
        '/[\x{26CE}-\x{26CF}]/u',
        '/[\x{26D1}]/u',
        '/[\x{26D3}-\x{26D4}]/u',
        '/[\x{26E9}-\x{26EA}]/u',
        '/[\x{26F0}-\x{26F5}]/u',
        '/[\x{26F7}-\x{26FA}]/u',
        '/[\x{26FD}]/u',
        '/[\x{2702}]/u',
        '/[\x{2705}]/u',
        '/[\x{2708}-\x{270D}]/u',
        '/[\x{270F}]/u',
        '/[\x{2712}]/u',
        '/[\x{2714}]/u',
        '/[\x{2716}]/u',
        '/[\x{271D}]/u',
        '/[\x{2721}]/u',
        '/[\x{2728}]/u',
        '/[\x{2733}-\x{2734}]/u',
        '/[\x{2744}]/u',
        '/[\x{2747}]/u',
        '/[\x{274C}]/u',
        '/[\x{274E}]/u',
        '/[\x{2753}-\x{2755}]/u',
        '/[\x{2757}]/u',
        '/[\x{2763}-\x{2764}]/u',
        '/[\x{2795}-\x{2797}]/u',
        '/[\x{27A1}]/u',
        '/[\x{27B0}]/u',
        '/[\x{27BF}]/u',
        '/[\x{2934}-\x{2935}]/u',
        '/[\x{2B05}-\x{2B07}]/u',
        '/[\x{2B1B}-\x{2B1C}]/u',
        '/[\x{2B50}]/u',
        '/[\x{2B55}]/u',
        '/[\x{3030}]/u',
        '/[\x{303D}]/u',
        '/[\x{3297}]/u',
        '/[\x{3299}]/u',
        '/[\x{FE0F}]/u',
        '/[\x{1F004}]/u',
        '/[\x{1F0CF}]/u',
        '/[\x{1F170}-\x{1F171}]/u',
        '/[\x{1F17E}-\x{1F17F}]/u',
        '/[\x{1F18E}]/u',
        '/[\x{1F191}-\x{1F19A}]/u',
        '/[\x{1F1E6}-\x{1F1FF}]/u',
        '/[\x{1F201}-\x{1F202}]/u',
        '/[\x{1F21A}]/u',
        '/[\x{1F22F}]/u',
        '/[\x{1F232}-\x{1F23A}]/u',
        '/[\x{1F250}-\x{1F251}]/u',
        '/[\x{1F300}-\x{1F321}]/u',
        '/[\x{1F324}-\x{1F393}]/u',
        '/[\x{1F396}-\x{1F397}]/u',
        '/[\x{1F399}-\x{1F39B}]/u',
        '/[\x{1F39E}-\x{1F3F0}]/u',
        '/[\x{1F3F3}-\x{1F3F5}]/u',
        '/[\x{1F3F7}-\x{1F3FA}]/u',
        '/[\x{1F400}-\x{1F4FD}]/u',
        '/[\x{1F4FF}-\x{1F53D}]/u',
        '/[\x{1F549}-\x{1F54E}]/u',
        '/[\x{1F550}-\x{1F567}]/u',
        '/[\x{1F56F}-\x{1F570}]/u',
        '/[\x{1F573}-\x{1F57A}]/u',
        '/[\x{1F587}]/u',
        '/[\x{1F58A}-\x{1F58D}]/u',
        '/[\x{1F590}]/u',
        '/[\x{1F595}-\x{1F596}]/u',
        '/[\x{1F5A4}-\x{1F5A5}]/u',
        '/[\x{1F5A8}]/u',
        '/[\x{1F5B1}-\x{1F5B2}]/u',
        '/[\x{1F5BC}]/u',
        '/[\x{1F5C2}-\x{1F5C4}]/u',
        '/[\x{1F5D1}-\x{1F5D3}]/u',
        '/[\x{1F5DC}-\x{1F5DE}]/u',
        '/[\x{1F5E1}]/u',
        '/[\x{1F5E3}]/u',
        '/[\x{1F5E8}]/u',
        '/[\x{1F5EF}]/u',
        '/[\x{1F5F3}]/u',
        '/[\x{1F5FA}-\x{1F64F}]/u',
        '/[\x{1F680}-\x{1F6C5}]/u',
        '/[\x{1F6CB}-\x{1F6D2}]/u',
        '/[\x{1F6E0}-\x{1F6E5}]/u',
        '/[\x{1F6E9}]/u',
        '/[\x{1F6EB}-\x{1F6EC}]/u',
        '/[\x{1F6F0}]/u',
        '/[\x{1F6F3}-\x{1F6F9}]/u',
        '/[\x{1F910}-\x{1F93A}]/u',
        '/[\x{1F93C}-\x{1F93E}]/u',
        '/[\x{1F940}-\x{1F945}]/u',
        '/[\x{1F947}-\x{1F970}]/u',
        '/[\x{1F973}-\x{1F976}]/u',
        '/[\x{1F97A}]/u',
        '/[\x{1F97C}-\x{1F9A2}]/u',
        '/[\x{1F9B0}-\x{1F9B9}]/u',
        '/[\x{1F9C0}-\x{1F9C2}]/u',
        '/[\x{1F9D0}-\x{1F9FF}]/u',
        '/[\x{E0062}-\x{E0063}]/u',
        '/[\x{E006C}]/u',
        '/[\x{E006E}]/u',
        '/[\x{E007F}]/u'
    ];

    return preg_replace($regexEmoticons, '', $text);
}

And here the code to generate it :

<?php

$emojisAsHex = [];
$emojisasAsDecHex = [];

preg_match_all(
    "/(?:>|\s)+(U\+)(?'emojis'[0-9ABCDEF]{4,5})(?:<|\s)+/",
    file_get_contents('http://unicode.org/emoji/charts/full-emoji-list.html'),
    $emojisAsHex
);

//flip it, to remove duplication
$emojisAsHex = array_flip(array_flip($emojisAsHex['emojis']));


foreach ($emojisAsHex as $emojiAsHex) {
    $emojisasAsDecHex[hexdec($emojiAsHex)] = $emojiAsHex;
}

ksort($emojisasAsDecHex);




$outputHexa = '';
$else = '';

$startI = key($emojisasAsDecHex);
$endI =max(array_keys($emojisasAsDecHex)) + 1;

for ($i = $startI; $i < $endI; $i++) {
    if (isset($emojisasAsDecHex[$i]) && isset($emojisasAsDecHex[(1 + $i)])) {

        $outputHexa .=  "'/[\x{" . $emojisasAsDecHex[$i] . '}';
        while (isset($emojisasAsDecHex[(1 + $i)])) {

            $i++;
        }

        $outputHexa .=  '-\x{' . $emojisasAsDecHex[$i] . "}]/u'," . PHP_EOL;
    } else if (isset($emojisasAsDecHex[$i])) {
        $outputHexa .= "'/[\x{" . $emojisasAsDecHex[$i] . "}]/u'," . PHP_EOL;
    }
}


var_dump($outputHexa);

You could just use str_replace().

$emojiArray = array("&0123","&0234",etc. for all emoji);
$strippedComment = str_replace($emojiArray,"",$originalComment);
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!