How would you create a string of all UTF-8 characters?

北海茫月 2021-01-02 20:11

There are many ways to represent the more than one million UTF-8 characters. Take the Latin capital "A" with macron (Ā). This is Unicode code point U+0100,

5 answers
  •  难免孤独
    2021-01-02 20:48

    You can leverage iconv (or a few other functions) to convert a code point number to a UTF-8 string:

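    // Convert a code point number to its UTF-8 encoding (requires the iconv extension).
    function unichr($i)
    {
        // pack('V', ...) yields a 32-bit little-endian integer, i.e. one UCS-4LE code unit.
        return iconv('UCS-4LE', 'UTF-8', pack('V', $i));
    }

    $codeunits = array();
    // U+0000 up to U+D7FF, just below the surrogate range.
    for ($i = 0; $i < 0xD800; $i++) {
        $codeunits[] = unichr($i);
    }
    // U+E000 up to U+FFFE; as written, the loop stops before U+FFFF.
    for ($i = 0xE000; $i < 0xFFFF; $i++) {
        $codeunits[] = unichr($i);
    }
    $all = implode($codeunits);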
    

    (I skipped the surrogate range 0xD800–0xDFFF because surrogates aren't valid to encode in UTF-8 on their own; that would be “CESU-8”.)
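
    To make the byte layout behind this conversion concrete, here is a minimal sketch that encodes a code point by hand instead of calling iconv. The function name codepoint_to_utf8 is hypothetical (it is neither a PHP built-in nor part of the code above), and the type declarations assume PHP 7 or later. It refuses the surrogate range for the same reason the loops skip it.

    ```php
    <?php
    // Hypothetical helper for illustration: encode one Unicode code point
    // as UTF-8 bytes without relying on iconv.
    function codepoint_to_utf8(int $cp): string
    {
        // Only Unicode scalar values are encodable: 0..0x10FFFF minus surrogates.
        if ($cp < 0 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
            throw new InvalidArgumentException(
                sprintf('Not a Unicode scalar value: 0x%X', $cp)
            );
        }
        if ($cp < 0x80) {            // 1 byte:  0xxxxxxx
            return chr($cp);
        }
        if ($cp < 0x800) {           // 2 bytes: 110xxxxx 10xxxxxx
            return chr(0xC0 | ($cp >> 6))
                 . chr(0x80 | ($cp & 0x3F));
        }
        if ($cp < 0x10000) {         // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            return chr(0xE0 | ($cp >> 12))
                 . chr(0x80 | (($cp >> 6) & 0x3F))
                 . chr(0x80 | ($cp & 0x3F));
        }
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        return chr(0xF0 | ($cp >> 18))
             . chr(0x80 | (($cp >> 12) & 0x3F))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }

    // U+0100 (Ā) needs two bytes.
    echo bin2hex(codepoint_to_utf8(0x0100)), "\n"; // prints c480
    ```

    For U+0100 the result is the two bytes 0xC4 0x80. Passing a value such as 0xD800 throws instead of producing bytes that a strict UTF-8 decoder would reject.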
