I am using a utf8 charset mysql tables in a mysql 5.1 server, which does not support utf8mb4 encoding in tables. When inserting 4-byte encoded utf8 characters like \"
The following regular expression will replace 4-byte UTF-8 characters:
function replace4byte($string, $replacement = '') {
return preg_replace('%(?:
\xF0[\x90-\xBF][\x80-\xBF]{2} # planes 1-3
| [\xF1-\xF3][\x80-\xBF]{3} # planes 4-15
| \xF4[\x80-\x8F][\x80-\xBF]{2} # plane 16
)%xs', $replacement, $string);
}
var_dump(replace4byte('d'), replace4byte('d