My problem is to remove emoji from a string using a regex, without removing CJK (Chinese, Japanese, Korean) characters. I tried to use this regex:
REGEX =
Karol S already provided a solution, but the reason might not be clear:
"\u1F600" is actually "\u1F60" followed by "0":
"\u1F600"
"\u1F60"
"0"
"\u1F60" # => "ὠ" "\u1F600" # => "ὠ0"
You have to use curly braces for code points above FFFF:
"\u{1F600}" #=> "