Can UTF-8 contain a zero byte?

日久生厌 2020-11-29 10:00

Can a UTF-8 string contain zero bytes? I'm going to send it over an ASCII plaintext protocol; should I encode it with something like Base64?

3 answers
  •  孤街浪徒
    2020-11-29 10:39

    ASCII text is restricted to byte values between 0 and 127. UTF-8 text has no such restriction: bytes in UTF-8-encoded text may have their high bit set (values 128 to 255). So it's not safe to send UTF-8 text over a channel that doesn't guarantee safe passage for that high bit.

    If you're forced to deal with an ASCII-only channel, Base64 is a reasonable (though not particularly space-efficient) choice. Are you sure you're limited to 7-bit data, though? That's somewhat unusual these days.
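
To make the points above concrete, here is a minimal Python sketch. On the zero-byte question itself: in UTF-8 a 0x00 byte only appears as the encoding of the character U+0000 (NUL), so ordinary text without NUL characters contains no zero bytes, while any non-ASCII character produces bytes with the high bit set. The function names `utf8_byte_report`, `to_ascii_safe`, and `from_ascii_safe` are hypothetical helpers invented for this illustration, not part of any library; only `str.encode`, `bytes.decode`, and the standard `base64` module are real APIs.

```python
import base64


def utf8_byte_report(text: str) -> dict:
    """Summarise which kinds of bytes appear in the UTF-8 encoding of text.

    Hypothetical helper for illustration only.
    """
    data = text.encode("utf-8")
    return {
        "bytes": data,
        # True only if the text contains the character U+0000 (NUL).
        "has_zero_byte": 0 in data,
        # True if any byte falls outside the 7-bit ASCII range.
        "has_high_bit": any(b > 0x7F for b in data),
    }


def to_ascii_safe(text: str) -> str:
    """Encode text as UTF-8, then Base64, giving a pure-ASCII string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def from_ascii_safe(payload: str) -> str:
    """Reverse to_ascii_safe: Base64-decode, then decode the bytes as UTF-8."""
    return base64.b64decode(payload.encode("ascii")).decode("utf-8")


if __name__ == "__main__":
    # Plain ASCII, a non-ASCII character, and an embedded NUL character.
    for sample in ["hello", "h\u00e9llo", "a\x00b"]:
        report = utf8_byte_report(sample)
        encoded = to_ascii_safe(sample)
        print(repr(sample), report, encoded)
        assert from_ascii_safe(encoded) == sample
```

Base64 output uses only printable ASCII characters, so it sidesteps both the high-bit and the embedded-NUL concerns, at the cost of roughly a one-third increase in size. If the channel turns out to be 8-bit clean and the text never contains U+0000, sending the raw UTF-8 bytes may be enough.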
