How to convert a Unicode string to HTML entities? (hex, not decimal)
For example, convert Français to Fran&#xE7;ais.
Firstly, when I faced this problem recently, I solved it by making sure my code files, DB connection, and DB tables were all UTF-8. Then simply echoing the text works. If you must escape the output from the DB, use htmlspecialchars() and not htmlentities(), so that the UTF-8 characters are left alone rather than escaped.
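To illustrate the difference, here is a minimal sketch. It assumes the source file itself is saved as UTF-8; the sample string and variable names are only for illustration.

```php
<?php
// Assumes this file is saved as UTF-8.
$text = 'Français <b>';

// htmlspecialchars() escapes only &, <, > and quotes; "ç" passes through untouched.
$escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
echo $escaped, "\n";   // Français &lt;b&gt;

// htmlentities() also replaces every character that has a named entity.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');
echo $entities, "\n";  // Fran&ccedil;ais &lt;b&gt;
```

Note that neither call produces hex entities; this only shows why htmlspecialchars() is the safer choice once everything is UTF-8.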
I would like to document an alternative solution, because it solved a similar problem for me. I was using PHP's utf8_encode() to escape 'special' characters, and I wanted to convert them into HTML entities for display. I wrote this code because I wanted to avoid iconv or similar functions as far as possible, since not all environments necessarily have them (do correct me if that is not so!):
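$foo = 'This is my test string \u03b50';
echo unicode2html($foo); // This is my test string &#x03b5;0

// Turn each literal \uXXXX escape into the hex entity &#xXXXX;
function unicode2html($string) {
    // [0-9a-fA-F] accepts hex digits only, in either case
    return preg_replace('/\\\\u([0-9a-fA-F]{4})/', '&#x$1;', $string);
}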
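For text that is already UTF-8 rather than \u-escaped (the Français case from the question), something along these lines might work. This is only a sketch: utf8ToHexEntities is a hypothetical name, and it assumes PHP 7.2+ with the mbstring extension for mb_ord(), so it will not suit environments where that extension is missing.

```php
<?php
// Hypothetical helper: replace every non-ASCII character in a UTF-8 string
// with its hexadecimal numeric character reference (&#xHHHH;).
// Assumes PHP 7.2+ with the mbstring extension (needed for mb_ord()).
function utf8ToHexEntities(string $text): string
{
    $result = preg_replace_callback(
        '/[^\x00-\x7F]/u',               // any single code point outside ASCII
        function (array $m): string {
            $codePoint = mb_ord($m[0], 'UTF-8');
            return '&#x' . strtoupper(dechex($codePoint)) . ';';
        },
        $text
    );

    // preg_replace_callback() returns null on invalid UTF-8 input;
    // in that case the text is returned unchanged.
    return $result ?? $text;
}

echo utf8ToHexEntities('Français'); // Fran&#xE7;ais
```

ASCII characters, including &, < and >, are left as they are, so the result may still need htmlspecialchars() applied before the conversion if the input is untrusted.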
Hope this helps somebody in need :-)