Digraph and trigraph can't work together?

拜拜、爱过 submitted on 2020-08-25 07:05:02

Question


I'm learning about digraphs and trigraphs, and here is some code that I cannot understand. (Yes, I admit that it's extremely ugly.)

This code can compile:

#define _(s) s%:%:s

main(_(_))
<%
    __;
%>

This code can compile, too:

#define _(s) s??=??=s

main(_(_))
<%
    __;
%>

However, neither of the following two pieces of code can compile:

#define _(s) s%:??=s

main(_(_))
<%
    __;
%>

And

#define _(s) s??=%:s

main(_(_))
<%
    __;
%>

This confuses me: since the first two pieces of code compile, I suppose that digraphs and trigraphs are both replaced before macro expansion. So why does the code fail to compile when a digraph and a trigraph are used together?


Answer 1:


Digraphs and trigraphs are totally different. Trigraphs are replaced during phase 1 of translation, [see Note 1] which is before the source code has been separated into tokens. Digraphs are tokens which are alternate spellings for other tokens, so they are not meaningful until after the source has been separated into tokens. (The word "digraph" is not very accurate; it is used because it resembles "trigraph", but the set of digraphs includes %:%: which consists of four characters.)

So ??= is replaced with a # before any token analysis is done. But %: is just a token, with the same meaning as #.

Moreover, %:%: is a token with the same meaning as ##. But %:# is two tokens (%: and #), which is not legal since the stringify operator (whether spelled %: or #) can only be followed by a macro parameter. [See Note 2] And it does not become any less illegal if the # were the result of a trigraph substitution.
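To make the ordering concrete, here is a minimal sketch that mimics only the trigraph part of phase 1 on a piece of text. It is not how any particular compiler implements it; the names replace_trigraphs and trigraph_replacement are made up for this illustration. The question marks in the sample inputs are best written with the \? escape so that the sketch itself never contains a trigraph.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: maps the third character of a trigraph to the
   character it stands for, or returns '\0' if it is not a trigraph. */
static char trigraph_replacement(char third)
{
    static const char thirds[]       = "=/'()!<>-";
    static const char replacements[] = "#\\^[]|{}~";
    const char *hit = (third != '\0') ? strchr(thirds, third) : NULL;
    return hit ? replacements[hit - thirds] : '\0';
}

/* Hypothetical helper: copies in to out, replacing every trigraph by its
   single-character equivalent. out must have room for strlen(in) + 1 bytes.
   Digraphs are deliberately left alone, because they are tokens, not
   character substitutions. */
void replace_trigraphs(const char *in, char *out)
{
    while (*in != '\0') {
        char r = '\0';
        if (in[0] == '?' && in[1] == '?')
            r = trigraph_replacement(in[2]);
        if (r != '\0') {
            *out++ = r;   /* three source characters become one */
            in += 3;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
}
```

Fed the four replacement lists from the question, this turns s??=??=s into s##s, leaves s%:%:s untouched, and turns s%:??=s and s??=%:s into s%:#s and s#%:s. Only the first two contain a single ## (or %:%:) token; the mixed forms end up as two separate stringify operators, which is the case the constraint in Note 2 rejects.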

One important difference between digraphs and trigraphs, as illustrated by the hilarious snippet in chqrlie's answer, is that trigraphs also work in strings. Digraphs allow you to write C code even if your keyboard lacks brackets and octothorpi, but they don't help you print those characters out.
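As a small illustration of the digraph half of that statement, the sketch below spells its punctuation with digraphs and then uses the same characters inside a string literal. The function names are invented for the example, and it assumes a compiler with digraph support (C95 or later).

```c
%:include <string.h>

/* Digraph spellings: <% %> for braces, <: :> for brackets, %: for #. */
int third_element(void)
<%
    int values<::> = <% 10, 20, 30 %>;
    return values<:2:>;
%>

/* Inside a string literal the same characters are ordinary text:
   this literal holds four characters, not a pair of brackets. */
size_t digraph_text_length(void)
<%
    return strlen("<::>");
%>
```

Here third_element() returns 30 and digraph_text_length() returns 4, since no substitution happens inside the literal.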


Notes (Standards quotes):

  1. §5.1.1.2, Translation phases, paragraph 1:

    The precedence among the syntax rules of translation is specified by the following phases.

    1. Physical source file multibyte characters are mapped, in an implementation-defined manner, to the source character set (introducing new-line characters for end-of-line indicators) if necessary. Trigraph sequences are replaced by corresponding single-character internal representations.

  2. §6.10.3.2, The # operator, paragraph 1:

    Each # preprocessing token in the replacement list for a function-like macro shall be followed by a parameter as the next preprocessing token in the replacement list.




Answer 2:


For the academic side, look at rici's well-documented answer.

For the common sense side, unless you are already quite proficient in C, digraphs and trigraphs are completely useless, and you should not even waste any time on the subject. They were invented as a way to support non-US 7-bit character sets that were still used in the 1980s on mainframes and some minicomputers. These character sets lacked some of the punctuation needed for the C language, such as #, {, } etc., to make space for locale-specific characters such as ç, é, è... (pardon my French).

Even on these systems, which I used for a long while, trigraphs were never used, because ugly pragmatic alternatives existed: on French systems, accented letters such as é and è were typed but would be interpreted by the C compiler as { and }. It made C programming obscure and pushed many programmers to switch to a US QWERTY keyboard and locale (or equivalent).

This is a thing of the past, of historical interest only, and you will never see these in action, aside from typos, obfuscation and obnoxious interview questions.

Regarding the latter, I cannot resist posting this one:

I cannot get fnmatch to validate my date template even if I force a valid date. What is wrong with this code?

#include <stdio.h>
#include <fnmatch.h>
int main() {
    char date[] = "01/01/1988";
    if (fnmatch("??/??/????", date, 0))
        printf("invalid date format\n");
    return 0;
}
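
For readers who want the puzzle spelled out: when trigraphs are enabled, each ??/ in the pattern literal is replaced by a backslash before the literal is tokenized, so the compiler effectively sees "\\????". Whether that happens depends on the compiler and its options; GCC, for instance, leaves trigraphs disabled in its default GNU modes unless -trigraphs or a strict ISO mode is selected. The sketch below writes both patterns out explicitly so that it behaves the same either way. The names matches, intended_pattern and replaced_pattern are illustrative, and fnmatch is assumed to be the POSIX function from <fnmatch.h>.

```c
#include <fnmatch.h>

/* The pattern the author meant, spelled with the \? escape so that no
   compiler can mistake part of it for a trigraph. */
const char intended_pattern[] = "?\?/?\?/?\?\?\?";

/* What is left of the original literal once both trigraphs have turned
   into backslashes: an escaped backslash followed by four question marks. */
const char replaced_pattern[] = "\\????";

/* Returns 1 when text matches pattern, 0 otherwise. */
int matches(const char *pattern, const char *text)
{
    return fnmatch(pattern, text, 0) == 0;
}
```

With these definitions, matches(intended_pattern, "01/01/1988") is 1, while matches(replaced_pattern, "01/01/1988") is 0: the replaced pattern asks for a literal question mark followed by any three characters, so it accepts a string such as "?abc" instead of a date.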


Source: https://stackoverflow.com/questions/34583319/digraph-and-trigraph-cant-work-together
