Convert Unicode to HTML entities (hex)

日久生厌 2020-12-03 19:27

How to convert a Unicode string to HTML entities? (HEX not decimal)

For example, convert Français to Fran&#xE7;ais.

4 Answers
  •  伪装坚强ぢ
    2020-12-03 19:54

    For the missing hex-encoding in the related question:

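    // Match every non-ASCII code point (U+0080 to U+10FFFF) in a UTF-8 string.
    $output = preg_replace_callback('/[\x{80}-\x{10FFFF}]/u', function ($match) {
        list($utf8) = $match;
        // UTF-32BE stores the code point as a single 4-byte big-endian integer.
        $binary = mb_convert_encoding($utf8, 'UTF-32BE', 'UTF-8');
        // Read that integer with unpack('N') and print it as uppercase hex.
        $entity = vsprintf('&#x%X;', unpack('N', $binary));
        return $entity;
    }, $input);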
    

    This is similar to @Baba's answer: it converts each matched character to UTF-32BE, then uses unpack and vsprintf to format the code point as a hex entity.

    If you prefer iconv over mb_convert_encoding, it's similar:

    $output = preg_replace_callback('/[\x{80}-\x{10FFFF}]/u', function ($match) {
        list($utf8) = $match;
        // Same conversion with iconv; note the argument order (from, to, string).
        $binary = iconv('UTF-8', 'UTF-32BE', $utf8);
        $entity = vsprintf('&#x%X;', unpack('N', $binary));
        return $entity;
    }, $input);
    

    I find this string manipulation a bit clearer than the one in "Get hexcode of html entities".
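
As a quick check of the approach above, here is a minimal sketch that wraps the mb_convert_encoding variant in a function. The name to_hex_entities is hypothetical (it is not from the answer), and the sketch assumes the mbstring extension is loaded and that the source file is saved as UTF-8.

```php
<?php
// Hypothetical helper: wraps the mb_convert_encoding variant shown above.
// Returns null if $input is not valid UTF-8 (preg_replace_callback fails).
function to_hex_entities(string $input): ?string
{
    return preg_replace_callback('/[\x{80}-\x{10FFFF}]/u', function ($match) {
        // One matched character, re-encoded as a 4-byte big-endian integer.
        $binary = mb_convert_encoding($match[0], 'UTF-32BE', 'UTF-8');
        // unpack('N') yields that integer; %X formats it as uppercase hex.
        return vsprintf('&#x%X;', unpack('N', $binary));
    }, $input);
}

echo to_hex_entities('Français'), "\n"; // Fran&#xE7;ais
```

ASCII characters fall outside the matched range, so they pass through unchanged; only code points from U+0080 upward are replaced.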
