json_encode() non utf-8 strings?

名媛妹妹 2020-11-30 06:54

So I have an array of strings, and all of the strings are using the system default ANSI encoding and were pulled from a SQL database. So there are 256 diffe

6 answers
  •  忘掉有多难
    2020-11-30 06:55

    The JSON standard ENFORCES Unicode encoding. From RFC 4627:

    3.  Encoding
    
       JSON text SHALL be encoded in Unicode.  The default encoding is
       UTF-8.
    
       Since the first two characters of a JSON text will always be ASCII
       characters [RFC0020], it is possible to determine whether an octet
       stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
       at the pattern of nulls in the first four octets.
    
               00 00 00 xx  UTF-32BE
               00 xx 00 xx  UTF-16BE
               xx 00 00 00  UTF-32LE
               xx 00 xx 00  UTF-16LE
               xx xx xx xx  UTF-8
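
The null-pattern test quoted above can be sketched in PHP roughly as follows. This is only an illustration: `detectJsonEncoding()` is a hypothetical helper name, not a PHP built-in, and it assumes the input really is a JSON text whose first two characters are ASCII, as the RFC states.

```php
<?php
// Guess the Unicode encoding of a JSON text from its first four octets,
// using the null-byte patterns listed in RFC 4627, section 3.
// detectJsonEncoding() is a hypothetical helper, not part of PHP.
function detectJsonEncoding(string $json): string
{
    if (strlen($json) < 4) {
        // Too short for the four-octet test; fall back to the default, UTF-8.
        return 'UTF-8';
    }

    // Build a pattern string: '0' for a null octet, 'x' for anything else.
    $pattern = '';
    for ($i = 0; $i < 4; $i++) {
        $pattern .= ($json[$i] === "\x00") ? '0' : 'x';
    }

    switch ($pattern) {
        case '000x':
            return 'UTF-32BE';
        case '0x0x':
            return 'UTF-16BE';
        case 'x000':
            return 'UTF-32LE';
        case 'x0x0':
            return 'UTF-16LE';
        default:
            return 'UTF-8';
    }
}
```

Note that none of the five patterns corresponds to a single-byte encoding such as Windows-1252; such input simply looks like UTF-8 to this test, whether or not it is valid UTF-8.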
    

    Therefore, in the strictest sense, ANSI-encoded JSON wouldn't be valid JSON; this is why PHP requires its string input to be UTF-8 when using json_encode().
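
To see this in practice, here is a small sketch of what happens when a single-byte string is passed in unchanged. It assumes PHP 7 or later, and the sample string is made up for illustration.

```php
<?php
// "\xE9" is "é" in Windows-1252, but a lone 0xE9 byte is not valid UTF-8.
$ansi = "caf\xE9";

$result = json_encode($ansi);

if ($result === false) {
    // json_last_error() returns JSON_ERROR_UTF8 for malformed UTF-8 input;
    // json_last_error_msg() gives the human-readable description.
    $errorCode = json_last_error();
    $errorMessage = json_last_error_msg();
}
```

Checking for `false` matters here, because otherwise the failure is easy to miss and an empty response is sent instead of JSON.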

    As for "default ANSI", I'm pretty sure that your strings are encoded in Windows-1252, which is often incorrectly referred to as ANSI.
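
If the strings really are Windows-1252, one possible approach is to convert each of them to UTF-8 before encoding. The sketch below assumes the mbstring extension is available (`iconv()` would be an alternative); the `$rows` array is hypothetical sample data standing in for the values pulled from the database.

```php
<?php
// Hypothetical sample rows standing in for strings read from the database.
$rows = ["caf\xE9", "na\xEFve", "plain ascii"];

// Convert each string from Windows-1252 (the assumed source encoding) to UTF-8.
$utf8Rows = array_map(
    function (string $s): string {
        return mb_convert_encoding($s, 'UTF-8', 'Windows-1252');
    },
    $rows
);

// With valid UTF-8 input, json_encode() succeeds and escapes
// non-ASCII characters as \uXXXX sequences by default.
$json = json_encode($utf8Rows);
```

If the source encoding is something other than Windows-1252, the third argument to `mb_convert_encoding()` has to be changed to match; a wrong guess produces valid JSON containing the wrong characters rather than an error.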
