It seems that MySQL's default UTF-8 charset (utf8, an alias for utf8mb3) does not support characters that take more than 3 bytes.
So, in PHP, how can I get rid of all 4-byte characters in a string?
Here is my implementation to filter out 4-byte chars:
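    $string = preg_replace_callback(
        '/./u', // with the u modifier, . matches one whole UTF-8 character
        function (array $match) {
            // strlen() counts bytes, so this drops every character of 4 or more bytes
            return strlen($match[0]) >= 4 ? null : $match[0];
        },
        $string
    );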
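You could tweak it and replace null (which removes the char) with some substitute string. You can also replace >= 4 with some other byte-length check. Note that with the u modifier, preg_replace_callback() returns null for the whole string if the input is not valid UTF-8, so check the result before using it.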
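The same filtering can probably be done without a callback, since the characters that need 4 bytes in UTF-8 are exactly the code points U+10000 through U+10FFFF. Below is a minimal sketch of that idea, assuming PHP 7 or later; the function name stripFourByteChars and its $replacement parameter are made up for illustration and are not part of the code above.

```php
<?php
// Hypothetical helper: the name and signature are assumptions for illustration.
function stripFourByteChars(string $string, string $replacement = ''): string
{
    // Code points U+10000..U+10FFFF are the ones encoded as 4 bytes in UTF-8.
    $result = preg_replace('/[\x{10000}-\x{10FFFF}]/u', $replacement, $string);

    // With the u modifier, preg_replace() returns null if $string is not valid UTF-8.
    if ($result === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    return $result;
}

echo stripFourByteChars("a\u{1F600}b"), "\n";      // prints "ab"
echo stripFourByteChars("a\u{1F600}b", '?'), "\n"; // prints "a?b"
```

Characters of 1 to 3 bytes (ASCII, accented letters, most CJK) pass through unchanged. Keep in mind that $replacement is a regex replacement string, so sequences like $1 or \1 in it would be interpreted as backreferences.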