Original: http://golang.org/doc/go_spec.html
Translation: 红猎人 (zengsai@gmail.com)
Source code representation
Source code is Unicode text encoded in UTF-8. The text is not canonicalized, so a single accented code point is distinct from the same character constructed by combining an accent and a letter; the latter is treated as two code points. For simplicity, this document uses the term character to refer to a Unicode code point.
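The difference between a precomposed and a combined character can be observed directly in Go. The following is a minimal sketch, not part of the specification; the names precomposed, combined, and codePoints are chosen here for illustration only.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// precomposed is "é" written as the single code point U+00E9.
// (Illustrative name, not from the specification.)
const precomposed = "\u00e9"

// combined is "é" built from the letter 'e' (U+0065) followed by
// the combining acute accent U+0301.
const combined = "e\u0301"

// codePoints returns the number of Unicode code points in s.
func codePoints(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	// No canonicalization takes place, so the two strings differ.
	fmt.Println(precomposed == combined) // false
	fmt.Println(codePoints(precomposed)) // 1
	fmt.Println(codePoints(combined))    // 2
}
```

Both strings usually render as the same glyph, yet they compare unequal because no normalization is applied to the text.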
Each code point is distinct; for instance, upper and lower case letters are different characters.
Implementation restriction: For compatibility with other tools, a compiler may disallow the NUL character (U+0000) in the source text.
Characters
The following terms are used to denote specific Unicode character classes:
unicode_char = /* an arbitrary Unicode code point */ .
unicode_letter = /* a Unicode code point classified as "Letter" */ .
unicode_digit = /* a Unicode code point classified as "Digit" */ .
Section 4.5, "General Category - Normative", of The Unicode Standard 5.2 defines a set of character categories. Go treats characters in category Lu, Ll, Lt, Lm, or Lo as Unicode letters, and those in category Nd as Unicode digits.
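As a rough illustration of these two classes, the standard unicode package offers IsLetter (category L, that is Lu, Ll, Lt, Lm, Lo) and IsDigit (category Nd). The sketch below assumes those functions are an acceptable stand-in for the classes above; the package's tables may follow a newer Unicode version than 5.2, and classify is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unicode"
)

// classify reports which of the two character classes a code point
// belongs to. (Hypothetical helper, for illustration only.)
func classify(r rune) string {
	switch {
	case unicode.IsLetter(r): // categories Lu, Ll, Lt, Lm, Lo
		return "unicode_letter"
	case unicode.IsDigit(r): // category Nd
		return "unicode_digit"
	default:
		return "other"
	}
}

func main() {
	samples := []rune{
		'A',      // Lu
		'\u00e9', // Ll, e with acute accent
		'\u4e16', // Lo, a CJK ideograph
		'7',      // Nd
		'\u0663', // Nd, ARABIC-INDIC DIGIT THREE
		'\u00bd', // No, VULGAR FRACTION ONE HALF: not a unicode_digit
		'_',      // Pc: not a unicode_letter by category
	}
	for _, r := range samples {
		fmt.Printf("%U: %s\n", r, classify(r))
	}
}
```

Note that digits from scripts other than ASCII fall into unicode_digit, while number-like characters outside Nd, such as fractions, do not.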
Letters and digits
The underscore character _ (U+005F) is considered a letter.
letter = unicode_letter | "_" .
decimal_digit = "0" ... "9" .
octal_digit = "0" ... "7" .
hex_digit = "0" ... "9" | "A" ... "F" | "a" ... "f" .
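The four productions above translate naturally into predicates on runes. This is a minimal sketch under the assumption that unicode.IsLetter approximates unicode_letter; the function names are invented for this example and are not taken from any Go compiler.

```go
package main

import (
	"fmt"
	"unicode"
)

// isLetter mirrors: letter = unicode_letter | "_" .
func isLetter(r rune) bool {
	return unicode.IsLetter(r) || r == '_'
}

// isDecimalDigit mirrors: decimal_digit = "0" ... "9" .
func isDecimalDigit(r rune) bool {
	return '0' <= r && r <= '9'
}

// isOctalDigit mirrors: octal_digit = "0" ... "7" .
func isOctalDigit(r rune) bool {
	return '0' <= r && r <= '7'
}

// isHexDigit mirrors:
// hex_digit = "0" ... "9" | "A" ... "F" | "a" ... "f" .
func isHexDigit(r rune) bool {
	return isDecimalDigit(r) ||
		('A' <= r && r <= 'F') ||
		('a' <= r && r <= 'f')
}

func main() {
	fmt.Println(isLetter('_'), isLetter('x'), isLetter('1')) // true true false
	fmt.Println(isDecimalDigit('8'), isOctalDigit('8'))      // true false
	fmt.Println(isHexDigit('f'), isHexDigit('g'))            // true false
}
```

Unlike unicode_digit, the three digit productions cover ASCII characters only, so a digit from another script satisfies none of them.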
Source: oschina
Link: https://my.oschina.net/u/10896/blog/4665