I was trying to take out all emoji chars out of a string (like a sanitizer). But I cannot find a complete set of emoji values.
What is the complete set of emoji char
If you only deal with English character and emoji character I think it is doable. First convert your string to UTF-16 characters, then check each characters whose value is bigger than 0x0xD800 (for emoji it is actually >=0xD836) should be emoji.
This is because "The Unicode standard permanently reserves the code point values between 0xD800 to 0xDFFF for UTF-16 encoding of the high and low surrogates" and of course English characters (and many other character won't fall in this range)
But because emoji code point starts from U1F300 their UFT-16 value actually fall in this range.
Check here for a quick reference for emoji UFT-16 value, if you don't bother to do it yourself.