Are ASCII characters always encoded the same way in all character encodings?

最后都变了 - submitted on 2019-12-12 10:08:42

Question


In ASCII, the character < is encoded as the single byte 0x3C. Is there a character set in which < is encoded differently? I tried UTF-8 and it's the same. I tried GB2312 and it's the same too...

Another question: are all ASCII characters encoded the same way in all character sets?


Answer 1:


The 128 ASCII characters (codes 0 to 127) are encoded the same way in all ASCII-derived character sets, such as UTF-8, ISO-8859-1, and GB2312. They are not the same in character sets that are not derived from ASCII (such as EBCDIC).

Characters with codes > 127 differ depending on the code page and/or the encoding.
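
As a rough illustration of the point above, here is a minimal Python sketch that encodes < and a non-ASCII character with a few codecs from the standard library. The choice of codecs (cp037 standing in for EBCDIC, cp437 as an example code page) and the variable names are assumptions made for this example, not part of the original answer.

```python
# Encode '<' with several codecs that ship with CPython.
# "cp037" is one EBCDIC code page; the others are ASCII-derived.
codecs_to_try = ["ascii", "utf-8", "gb2312", "latin-1", "cp037"]
lt_bytes = {name: "<".encode(name) for name in codecs_to_try}

for name, data in lt_bytes.items():
    print(f"{name:8} {data.hex()}")

# A character above 127 has no single agreed-upon byte sequence.
e_acute = {name: "é".encode(name) for name in ["latin-1", "utf-8", "cp437"]}
for name, data in e_acute.items():
    print(f"{name:8} {data.hex()}")
```

The ASCII-derived codecs all produce the byte 3c for <, the EBCDIC code page produces a different byte, and é comes out differently in each of the three codecs tried.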




Answer 2:


No. There are regional variants of ISO 646, some of them unofficial, which assign different characters to a number of ASCII code positions.




Answer 3:


In UTF-16, 'abc' is encoded as the bytes '0 97 0 98 0 99' (big-endian), which is very similar to ASCII. But if you try to interpret it as ASCII, you will end up with an extra NUL character before (or after, depending on endianness) each character. Not a huge difference, but enough to make the two non-interchangeable.
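
The byte layout described above can be checked with a short Python sketch. The variable names are only illustrative, and the explicit "utf-16-be" / "utf-16-le" codecs are used so that no byte order mark is added.

```python
text = "abc"

# Encode with an explicit byte order so no BOM is prepended.
be = text.encode("utf-16-be")
le = text.encode("utf-16-le")

print(list(be))  # [0, 97, 0, 98, 0, 99]
print(list(le))  # [97, 0, 98, 0, 99, 0]

# NUL (0x00) is a valid ASCII code, so misreading the bytes as ASCII
# does not fail; it silently yields a string with extra NUL characters.
misread = be.decode("ascii")
print(repr(misread))  # '\x00a\x00b\x00c'
```

Because the misread succeeds without an error, this kind of mismatch tends to show up later as stray NUL characters rather than as a decoding failure.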



Source: https://stackoverflow.com/questions/1775206/are-ascii-characters-always-encoded-the-same-way-in-all-character-encodings
