It seems that MySQL's default UTF-8 charset does not support characters that take more than 3 bytes.
So, in PHP, how can I get rid of all 4(-and-more)-byte characters?
Here is another filter implementation, a bit more complex.
It tries to transliterate each character to ASCII; if that fails, it inserts the Unicode replacement character (U+FFFD) to avoid XSS, e.g.:
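// Handle every code point outside the BMP (i.e. every 4-byte UTF-8 sequence).
$tr = preg_replace_callback('/([\x{10000}-\x{10FFFF}])/u', function ($m) {
    // Round-trip through ISO-8859-2 so iconv transliterates what it can.
    $c = iconv('ISO-8859-2', 'UTF-8', iconv('UTF-8', 'ISO-8859-2//TRANSLIT//IGNORE', $m[1]));
    // Nothing survived (empty string or false): fall back to U+FFFD.
    if ($c == '') {
        return "\xEF\xBF\xBD";
    }
    return $c;
}, $s);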
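If transliteration is not needed, the same character class can be used with a plain preg_replace. The following is only a minimal sketch, assuming PHP 7 or later and valid UTF-8 input; the function name replace_four_byte_chars and its $replacement parameter are made up for illustration and are not part of the answer above.

```php
<?php
// Hypothetical helper (name and signature are assumptions, not from the answer).
// Replaces every 4-byte UTF-8 character with $replacement, U+FFFD by default.
function replace_four_byte_chars(string $s, string $replacement = "\xEF\xBF\xBD"): string
{
    // With the /u modifier, \x{10000}-\x{10FFFF} covers all code points
    // outside the Basic Multilingual Plane, which are exactly the ones
    // that need 4 bytes in UTF-8.
    $result = preg_replace('/[\x{10000}-\x{10FFFF}]/u', $replacement, $s);

    // preg_replace() returns null on error, e.g. when $s is not valid UTF-8.
    if ($result === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $result;
}

// "café", then U+1F600 (a 4-byte character), then "ok".
$input = "caf\xC3\xA9 \xF0\x9F\x98\x80 ok";

echo replace_four_byte_chars($input), "\n";     // 4-byte character becomes U+FFFD
echo replace_four_byte_chars($input, ''), "\n"; // 4-byte character is dropped
```

Passing an empty string as the replacement removes the characters entirely, which is closer to what the question literally asks for; keeping U+FFFD makes it visible that something was stripped.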