It seems like MySQL does not support characters with more than 3 bytes in its default UTF-8 charset.
So, in PHP, how can I get rid of all 4(-and-more)-byte characters?
I came across this question while trying to solve my own issue (Facebook spits out certain emoticons as 4-byte characters, and Amazon Mechanical Turk does not accept 4-byte characters).
I ended up using this; it doesn't require the mbstring extension:
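function remove_4_byte($string) {
    // Split into individual UTF-8 characters (the /u flag makes PCRE work per code point)
    $char_array = preg_split('/(?<!^)(?!$)/u', $string);
    for ($x = 0; $x < count($char_array); $x++) {
        // strlen() counts bytes, so anything over 3 is a 4-byte character
        if (strlen($char_array[$x]) > 3) {
            $char_array[$x] = "";
        }
    }
    // Separator goes first; the reversed argument order was removed in PHP 8
    return implode("", $char_array);
}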
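
For what it's worth, the same idea can probably be done in a single regex pass. This is only a sketch, not the function above: it assumes the input is valid UTF-8 and relies on the fact that the code points U+10000 through U+10FFFF are exactly the ones encoded as 4 bytes. The name strip_supplementary_chars and the optional $replacement parameter are made up for illustration.

```php
// Hypothetical helper, not part of the original answer.
function strip_supplementary_chars(string $string, string $replacement = ''): string
{
    // U+10000..U+10FFFF are the code points that need 4 bytes in UTF-8.
    $result = preg_replace('/[\x{10000}-\x{10FFFF}]/u', $replacement, $string);

    // preg_replace() returns null on failure (for example malformed UTF-8),
    // so fall back to the untouched input in that case.
    return $result ?? $string;
}
```

Passing something like "?" as the second argument leaves a visible placeholder instead of silently dropping the character, which may be easier to debug.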