Which ASCII Characters are Obsolete?

Posted by ﹥>﹥吖頭↗ on 2020-01-01 19:22:16

Question


My understanding is that the ASCII characters found in the range from 0x00 to 0x1f were included with Teletype machines in mind. In the modern era, many of them have become obsolete. I was curious as to which characters might still be found in a conventional string or file. From my experience programming in C, I thought those might be NUL, LF, TAB, and maybe EOT. I'm especially curious about BS and ESC, as I thought (similar to shift or control maybe) that those might be handled by the OS and never really printed or be included in a string. Any amount of insight would be appreciated!

Table for reference:


Answer 1:


Out of the characters between hexadecimal 00 and 1F, the only ones you are likely to encounter frequently are NUL (0x00 = \0), TAB (0x09 = \t), CR (0x0D = \r), and LF (0x0A = \n). Of these, NUL is used in C-like languages as a string terminator, TAB is used as a tab character, and CR and LF are used at the end of a line. (Which one is used is a complicated situation; see the Wikipedia article Newline for details, including a history of how this came to be.)
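To check this against a real string or file, one option is to tally which control codes actually occur in it. The following is a minimal sketch, not part of the original answer; the function name `tally_c0_controls` and the 32-entry `counts` array are assumptions made for illustration.

```c
#include <stddef.h>

/* Tally how often each control code in the range 0x00-0x1F occurs
 * in a buffer. counts must have room for 32 entries; it is zeroed
 * before counting. (Hypothetical helper, for illustration only.) */
void tally_c0_controls(const unsigned char *buf, size_t len,
                       size_t counts[32])
{
    for (size_t i = 0; i < 32; i++)
        counts[i] = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] < 0x20)          /* 0x00..0x1F only; DEL (0x7F) is ignored */
            counts[buf[i]]++;
    }
}
```

Run over an ordinary text file read into memory, the non-zero entries would typically be just TAB (index 0x09), LF (0x0A), and, for files with Windows-style line endings, CR (0x0D).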

The following additional characters are used when communicating with VT100-compatible terminal emulators, but are rarely found outside that context:

  • BEL (0x07 = \a), which causes a terminal to beep and/or flash.
  • BS (0x08 = \b), which is used to move the cursor left one position. (It is not sent when you press the backspace key; see below!)
  • SO and SI (0x0E and 0x0F), which are used to switch into certain special character sets.
  • ESC (0x1B; written \e in GCC and Clang as a non-standard extension, or portably as \x1B or \033), which is sent when pressing the Escape key and various other function keys, and is additionally used to introduce escape sequences which control the terminal.
  • DEL (0x7F), which is sent when you press the backspace key.

The rest of the nonprintable ASCII characters are essentially unused.
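As an illustration of ESC introducing a control sequence, here is a small sketch that wraps text in the "bold on" and "attributes off" sequences (ESC [ 1 m and ESC [ 0 m) understood by VT100-compatible terminals. The helper name `format_bold` is an assumption for this example. It writes ESC as `\x1b` because `\e` is not part of standard C.

```c
#include <stdio.h>

#define ESC "\x1b"   /* the ESC control character, 0x1B */

/* Write text wrapped in "bold on" / "attributes off" sequences into out.
 * Returns what snprintf returns: the number of characters the full
 * result needs, excluding the terminating NUL.
 * (Hypothetical helper, for illustration only.) */
int format_bold(char *out, size_t cap, const char *text)
{
    return snprintf(out, cap, ESC "[1m%s" ESC "[0m", text);
}
```

Printing the resulting buffer to a terminal emulator should show the text in bold; written to a file, the ESC bytes are stored literally as 0x1B.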




Answer 2:


"Backspace composition no longer works with typical modern digital displays or typesetting systems" (Ref: Backspace)

See also this Stack Overflow question: "The backspace escape character in C, unexpected behavior".

Ref: Unicode

Unicode and the ISO/IEC 10646 Universal Character Set (UCS) have a much wider array of characters and their various encoding forms have begun to supplant ISO/IEC 8859 and ASCII rapidly in many environments. While ASCII is limited to 128 characters, Unicode and the UCS support more characters by separating the concepts of unique identification (using natural numbers called code points) and encoding (to 8-, 16- or 32-bit binary formats, called UTF-8, UTF-16 and UTF-32).

To allow backward compatibility, the 128 ASCII and 256 ISO-8859-1 (Latin 1) characters are assigned Unicode/UCS code points that are the same as their codes in the earlier standards. Therefore, ASCII can be considered a 7-bit encoding scheme for a very small subset of Unicode/UCS, and ASCII (when prefixed with 0 as the eighth bit) is valid UTF-8.
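The separation between code points and encodings, and the ASCII compatibility of UTF-8, can be seen in a small encoder. This is a sketch under the standard UTF-8 bit layout; the function name `utf8_encode` is an assumption and it is not taken from any library mentioned here.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 into out.
 * Returns the number of bytes written (1-4), or 0 if cp is a
 * surrogate or lies above U+10FFFF.
 * (Hypothetical helper, for illustration only.) */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                    /* ASCII range: the byte is the code itself */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                   /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)   /* surrogates are not encodable */
        return 0;
    if (cp < 0x10000) {                 /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {               /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Every code point below 0x80, including control characters such as BS (0x08), comes out as a single byte with the same value, which is why a pure ASCII file is already valid UTF-8.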

Here's another Stack Overflow question, on backspace in Unicode: "What is the purpose of Unicode backspace U+0008".

Here's a good overview on Stack Overflow: "C programming: how to program for Unicode and UTF-8".

And finally, here's the GNU (FSF.org) implementation: the GNU libunistring manual.

"This library provides functions for manipulating Unicode strings and for manipulating C strings according to the Unicode standard."

All the best. Hope this helps.



Source: https://stackoverflow.com/questions/37449461/which-ascii-characters-are-obsolete
