Question
I have been trying to understand C macros that use the preprocessor concatenation operator ##, but I realized that I have a problem with tokens. I thought it would be easy, but in practice it is not.
As I understand it, concatenation joins two tokens to create a new token, for example ( and ), or int and *.
I tried:
#define foo(x,y) x ## y
foo(x,y)
Whenever I give it some arguments, I get an error saying that pasting the two arguments does not give a valid preprocessing token.
For instance, why does foo(1,aa) result in 1aa (which type of token is that, and why is it valid?), while foo(int,*) gives an error?
Is there a way to know which tokens are valid, or is there a good reference that would make this clear to me? (I have already searched Google and SO.)
What am I missing?
I would be grateful for any help.
Answer 1:
Preprocessor token concatenation is for generating new tokens, but it cannot paste arbitrary language constructs together (see, for example, the GCC documentation):
However, two tokens that don't together form a valid token cannot be pasted together. For example, you cannot concatenate x with + in either order.
So an attempt at a macro that makes a pointer out of a type like
#define MAKEPTR(NAME) NAME ## *
MAKEPTR(int) myIntPtr;
is invalid, because int* is two tokens, not one.
The example from the GCC documentation mentioned above, however, shows the generation of new tokens:
#define COMMAND(NAME) { #NAME, NAME ## _command }
struct command commands[] =
{
COMMAND (quit),
COMMAND (help),
...
};
yields:
struct command commands[] =
{
{ "quit", quit_command },
{ "help", help_command },
...
};
The token quit_command did not exist before; it was generated through token concatenation.
Note that a macro of the form
#define MAKEPTR(TYPE) TYPE*
MAKEPTR(int) myIntPtr;
is valid and actually generates a pointer type out of TYPE, e.g. int* out of int.
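As a minimal sketch of both cases in one file: the struct layout, the handler bodies, and the helpers find_command and read_through are assumptions made up for illustration, not part of the GCC example. Only the two macros come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler type and table entry layout. */
typedef int (*handler_fn)(void);

struct command {
    const char *name;
    handler_fn handler;
};

/* Hypothetical handlers; only the _command suffix matters here. */
static int quit_command(void) { return 0; }
static int help_command(void) { return 1; }

/* #NAME stringifies the argument; NAME ## _command pastes two
   identifier pieces into one new identifier token. */
#define COMMAND(NAME) { #NAME, NAME ## _command }

static const struct command commands[] = {
    COMMAND(quit),   /* expands to { "quit", quit_command } */
    COMMAND(help),   /* expands to { "help", help_command } */
};

/* No ## here: TYPE and * simply remain two adjacent tokens. */
#define MAKEPTR(TYPE) TYPE *

/* Look up a handler by name; returns NULL if the name is unknown. */
static handler_fn find_command(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof commands / sizeof commands[0]; i++) {
        if (strcmp(commands[i].name, name) == 0)
            return commands[i].handler;
    }
    return NULL;
}

/* Declares a pointer through MAKEPTR(int) and reads the value back. */
static int read_through(int value)
{
    MAKEPTR(int) p = &value;
    return *p;
}
```

Calling find_command("help") from a main function should hand back help_command, because the pasted identifier names the function defined earlier in the file.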
Answer 2:
Since it seems to be a point of confusion: the string 1aa is a valid preprocessing token. It is an instance of pp-number, whose definition is (§6.4.8 of the current C standard):
pp-number:
digit
. digit
pp-number digit
pp-number identifier-nondigit
pp-number e sign
pp-number E sign
pp-number p sign
pp-number P sign
pp-number .
In other words, a pp-number starts with a digit or a . followed by a digit, and after that it can contain any sequence of digits, periods, "identifier-nondigits" (that is, letters, underscores, and other things which can be part of an identifier) or the letters e or p (either upper or lower-case) followed by a plus or minus sign.
That means that, for example, 0x1e+2 is a valid pp-number, while 0x1f+1 is not (it is three tokens). In a valid program, every pp-number which survives the preprocessing phases must satisfy the syntax of some numeric constant representation, which means that a program which includes the text 0x1e+2 will be considered invalid. The moral, if there is one, is that you should use whitespace generously; it has no cost.
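The pp-number behaviour can be observed without ever turning 1aa into a real constant. The following is a small sketch, assuming the helper macros PASTE, STR, and XSTR and the function names, all of which are invented for this illustration. The pasted token is stringified, so it never has to satisfy the syntax of a numeric constant.

```c
#include <assert.h>
#include <string.h>

#define PASTE(a, b) a ## b
#define STR(x) #x
/* Extra level so the argument is macro-expanded before # is applied. */
#define XSTR(x) STR(x)

/* 1 and aa paste into the single pp-number 1aa; stringifying it
   means it is never converted to a numeric constant. */
static const char *pasted_spelling(void)
{
    return XSTR(PASTE(1, aa));
}

/* With whitespace, 0x1e, + and 2 are three tokens: 30 + 2.
   Written without spaces, 0x1e+2 would be one pp-number that is
   not a valid constant, and the program would be rejected. */
static int hex_plus_two(void)
{
    return 0x1e + 2;
}

/* Collects the expected results in one place. */
static void check_pp_numbers(void)
{
    assert(strcmp(pasted_spelling(), "1aa") == 0);
    assert(hex_plus_two() == 32);
}
```

Removing the spaces around + in hex_plus_two is a quick way to see the diagnostic a particular compiler gives for an invalid pp-number.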
The intention of pp-number is to include everything which might eventually be a number in some future version of C. (Remember that numbers can be followed by alphabetic suffixes indicating types and signedness, such as 27LU.)
However, int* is not a valid preprocessing token. It is two tokens (as is -3), and so it cannot be formed with the token concatenation operator.
Another odd consequence of the token-pasting rule is that it is impossible to generate the valid token ... through token concatenation, because .. is not a valid token. (a##b##c must be evaluated in some order, so even if all three macro parameters are replaced by ., there must be an attempt to create the token .., which will fail in most compilers, although I believe Visual Studio accepts it.)
Finally, the comment symbols /* and // are not tokens; comments are replaced with whitespace before the separation of the program text into tokens. So you cannot produce a comment with token-pasting either (at least, not in a compliant compiler).
Answer 3:
A preprocessing token is defined by the C language grammar; see section 6.4 of the current standard:
preprocessing-token:
header-name
identifier
pp-number
character-constant
string-literal
punctuator
each non-white-space character that cannot be one of the above
The meaning of each of those terms is defined elsewhere in the grammar. Most are self-explanatory; identifier means anything that is a valid variable name (or would be if it weren't a keyword), and pp-number includes integer and floating-point constants.
In Standard C, the result of pasting two preprocessing tokens must be another valid preprocessing token. Historically some preprocessors have allowed other pasting (which is equivalent to not pasting!) but this leads to confusion when people compile their code with a different compiler.
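To make the rule concrete, here is a short sketch of three pastes whose results each fall into one of the categories listed above. The PASTE macro and the function names are assumptions chosen for this example.

```c
#include <assert.h>

#define PASTE(a, b) a ## b

/* identifier ## identifier -> identifier: defines add_one. */
static int PASTE(add, _one)(int x)
{
    return x + 1;
}

/* punctuator ## punctuator -> punctuator: < and < give <<. */
static int shift_left_three(int x)
{
    return x PASTE(<, <) 3;
}

/* pp-number ## pp-number -> pp-number: 12 and 34 give 1234. */
static int pasted_number(void)
{
    return PASTE(12, 34);
}

/* Collects the expected results in one place. */
static void check_valid_pastes(void)
{
    assert(add_one(41) == 42);
    assert(shift_left_three(1) == 8);
    assert(pasted_number() == 1234);
}
```

By contrast, PASTE(int, *) or PASTE(x, +) has no category to land in, which is why a conforming preprocessor diagnoses it.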
Source: https://stackoverflow.com/questions/41691220/valid-preprocessor-tokens-in-macro-concatenation