C preprocessor macro doesn't parse comma separated tokens?

六月ゝ 毕业季﹏ submitted on 2020-01-24 20:20:07

Question


I want to choose one of two functions depending on the number of arguments:

  • nargs = 0 ----> f1
  • nargs > 0 ----> f2.

The macros do the following: take the first argument and, if no argument is supplied, produce two commas (",NULL,NULL"). The second argument is then selected from the resulting list of arguments.

For example:

  • f("Hello, world%i%s", x , s) ----> "Hello, world%i%s" ----> void
  • f() ----> ,NULL,NULL ----> NULL

So I can get NULL or void depending on the number of arguments.

Here is the macro:

#define FIRST(a, ...) a
#define COMMA_IF_PARENS(...) ,NULL,NULL
#define IS_EMPTY(...) COMMA_IF_PARENS __VA_ARGS__ ()
#define RM_FIRST(x, ...) __VA_ARGS__
#define CHOOSE(...) IS_EMPTY(FIRST(__VA_ARGS__)) 
#define ff_(...)() CHOOSE (__VA_ARGS__)
#define FF(...)()  ff_(__VA_ARGS__) ()
#define FX(...)  RM_FIRST (FF(__VA_ARGS__) ())
  • output for FF macro:

    • FF() ---->,((void*)0),((void*)0);
    • FF("Hello, world%i%s") ----> COMMA_IF_PARENS "Hello, world%i%s" ();
  • output for FX macro:

    • FX() ---> void
    • FX("Hello, world%i%s") ----> void
  • expected FX output:

    • FX() ----> NULL
    • FX("Hello, world%i%s") ----> void

The problem is that ,NULL,NULL which is returned from CHOOSE is treated as a single parameter!

Questions:

  1. Why does the C preprocessor treat ,NULL,NULL as a single argument?
  2. How can I make the C preprocessor treat the result from CHOOSE as a comma-separated list of arguments rather than a single parameter?

NOTE:

  • I want to know why the C preprocessor doesn't work as I expect.

Answer 1:


It sounds to me like you're carrying intuitions from the C language itself over to the C preprocessor, and those intuitions are biting you because the CPP doesn't work the same way. Generally in C, functions take typed values as arguments. Expressions are not typed values; they get evaluated to produce them. So when you chain calls, you wind up with a kind of inside-out evaluation, and this shapes your intuitions. For example, in evaluating f(g(h(),h()),m()), f is passed two arguments, but it can't do anything with g(h(),h()) as written; that has to be evaluated first, and the resulting value is the argument passed to f. Say h returns 1, m returns 7, g returns a sum, and f a product. Then g evaluates on the values 1 and 1, and f evaluates on the values 2 and 7. Most C code works this way, and you get used to the idea that inner expressions are evaluated and the resulting values are passed to the functions. But that's not how macros work.
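A minimal sketch of that inside-out evaluation, using the values from the paragraph above. The function bodies are assumptions chosen to match the text (h returns 1, m returns 7, g sums, f multiplies), and evaluate_nested is a hypothetical helper name.

```c
/* Hypothetical definitions matching the values used in the text. */
static int h(void) { return 1; }
static int m(void) { return 7; }
static int g(int a, int b) { return a + b; }  /* sum */
static int f(int a, int b) { return a * b; }  /* product */

/* The inner calls run first: g receives 1 and 1, then f receives the
   values 2 and 7. f never sees the expression g(h(),h()) itself. */
static int evaluate_nested(void)
{
    return f(g(h(), h()), m());
}
```

Calling evaluate_nested() yields 14, that is (1 + 1) * 7.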

In the weird world of macro invocations (phrased carefully; I'm intentionally ignoring conditional directives), your functions don't take typed values; they take token sequences. The CPP does match parentheses for you, meaning F(()) is an invocation of F with the argument (), as opposed to an invocation with the argument ( followed by a ) token. But in macro land, F(G(H(),H()),M()) invokes F with two arguments. Argument 1 is the token sequence G(H(),H()); and argument 2 is the token sequence M(). We don't evaluate the expression G to get a typed value, because there aren't typed values; there's only token sequences.
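One way to see the token sequences a macro actually receives is to stringify them. The sketch below is only an illustration: SHOW, SECOND and arg_is are hypothetical names, and G, H and M are deliberately left undefined, since to the preprocessor they are just tokens.

```c
#include <string.h>

/* SHOW stringifies its argument, so the argument is never expanded:
   the string is exactly the token sequence the macro received. */
#define SHOW(x) #x

/* SECOND ignores its first argument and stringifies the second. */
#define SECOND(a, b) #b

/* Commas inside matched parentheses do not separate arguments. */
static const char *whole_arg  = SHOW(G(H(),H()));         /* one argument  */
static const char *second_arg = SECOND(G(H(),H()), M());  /* two arguments */
static const char *paren_arg  = SHOW(());                 /* argument: ()  */

/* Small helper: returns 1 when the stringified argument equals expected. */
static int arg_is(const char *seen, const char *expected)
{
    return strcmp(seen, expected) == 0;
}
```

Here whole_arg holds "G(H(),H())", second_arg holds "M()" and paren_arg holds "()", even though none of G, H or M exists as a macro or a function.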

The steps of invoking a function-like macro begin with argument substitution (a.s., 6.10.3.1). During a.s., the CPP looks first at the definition of the macro being called and notes where the macro's parameters are mentioned in its replacement list. For any such mention that is neither being stringified nor participating in a paste, the CPP fully macro-expands the corresponding argument, and the expanded result replaces that mention of the parameter in the replacement list. Next, the CPP stringifies (6.10.3.2) and pastes (6.10.3.3), in no particular order. Once all of that is done, the resulting replacement list undergoes rescan and further replacement (r.a.f.r., 6.10.3.4), where it is, as the name suggests, rescanned for further replacements; during this rescanning the macro being expanded is temporarily disabled ("painted blue", as per 6.10.3.4p2).
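The difference between a stringified mention and a plain mention of a parameter can be sketched with the usual two-level stringify idiom. VALUE, STR, XSTR and text_is are names assumed for this illustration; they do not come from the question.

```c
#include <string.h>

#define VALUE 42

/* x is stringified here, so the argument is NOT expanded during
   argument substitution. */
#define STR(x) #x

/* x is a plain mention here, so the argument is fully expanded during
   argument substitution; the rescan then finds STR(42). */
#define XSTR(x) STR(x)

static const char *unexpanded = STR(VALUE);   /* "VALUE" */
static const char *expanded   = XSTR(VALUE);  /* "42"    */

/* Small helper: returns 1 when the two strings are equal. */
static int text_is(const char *seen, const char *expected)
{
    return strcmp(seen, expected) == 0;
}
```

The extra level of indirection in XSTR is what gives argument substitution a chance to expand VALUE before the stringification in STR happens.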

So let's walk through this; I'll ignore the fact that you're using a language extension (gcc? clang?) where you're invoking a variadic macro with an insufficient number of arguments (you're not doing that intentionally anyway). We start with:

FX()

That invokes FX, with a single argument that is an empty token list (note that to the CPP, zero arguments only make sense if you define the macro with zero parameters; F() is called with an empty argument just as F(,) is called with two empty ones). So then a.s. happens, which transforms FX's replacement list from this... to this:

RM_FIRST (FF(__VA_ARGS__) ())  => RM_FIRST (FF() ())
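As an aside on the point about empty arguments, a small counting macro makes it visible. This is a sketch under assumptions: COUNT and COUNT_I are hypothetical helpers that only handle one to three arguments, and x, y, z are placeholder tokens that are never expanded.

```c
/* COUNT_I yields its fourth argument. The arguments passed to COUNT push
   the descending list 3, 2, 1 to the right, so the fourth slot ends up
   holding the number of arguments COUNT received. */
#define COUNT(...) COUNT_I(__VA_ARGS__, 3, 2, 1, 0)
#define COUNT_I(a, b, c, n, ...) n

enum {
    COUNT_OF_EMPTY     = COUNT(),        /* one argument, which is empty */
    COUNT_OF_ONE       = COUNT(x),       /* one argument                 */
    COUNT_OF_TWO_EMPTY = COUNT(,),       /* two empty arguments          */
    COUNT_OF_THREE     = COUNT(x, y, z)  /* three arguments              */
};
```

COUNT() and COUNT(x) both give 1: to the preprocessor, an empty pair of parentheses still carries one (empty) argument, and COUNT(,) gives 2.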

Skipping stringification/pastes since there are none, we then do r.a.f.r. That recognizes RM_FIRST as a macro. RM_FIRST has one argument: FF() (). So we jump into recursion level 2, invoking RM_FIRST.

That invocation of RM_FIRST itself begins with a.s. Assuming the variadic part is treated as empty, we have the parameter x associated with FF() (), but here's where your intuition really fails. x isn't mentioned in the replacement list, so nothing happens to FF() (); it is never expanded and is simply dropped. That's a.s. for you. Treating __VA_ARGS__ as empty, per whatever extension applies, we just get this:

__VA_ARGS__ => 

...IOW, there's nothing there any more. So we're basically done.

I'm guessing you were C-function-intuiting this; in doing so, you were expecting FF() () to evaluate, and the result to be passed into RM_FIRST as an argument, but that's not how macros evaluate.

You can, however, get that to happen with indirection. If you did this instead:

#define RM_FIRST(...) RM_FIRST_I(__VA_ARGS__)
#define RM_FIRST_I(x,...) __VA_ARGS__

...and we go back to when RM_FIRST is invoked, we have a different story. Here, FF() () is part of your variadic list, and __VA_ARGS__ is mentioned. So at that a.s. step, we would get:

RM_FIRST_I(__VA_ARGS__) => RM_FIRST_I( () () ,NULL,NULL () ())

(Just being literal... I'm guessing the extra litter is part of your diagnostic; but I'm pretty sure you know where to remove the redundant ()'s). Then, during r.a.f.r., we see RM_FIRST_I being invoked, and so the story goes.
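The effect of the indirection can be sketched on a smaller case than the macros from the question. Everything here is assumed for illustration: PAIR stands in for a macro whose expansion contains commas (as FF() () does), and DROP_FIRST, DROP_FIRST_X and DROP_FIRST_I mirror RM_FIRST and RM_FIRST_I.

```c
/* Hypothetical stand-in for a macro whose expansion contains a comma. */
#define PAIR() 10, 20

/* Direct version: arguments are split on the commas visible at the call
   site, before PAIR() is expanded, and x is never mentioned, so the
   whole PAIR() argument is dropped. */
#define DROP_FIRST(x, ...) __VA_ARGS__

/* Indirect version: __VA_ARGS__ is mentioned, so it is fully expanded
   during argument substitution; the rescan then sees
   DROP_FIRST_I(10, 20, 30) and splits on the new commas. */
#define DROP_FIRST_X(...) DROP_FIRST_I(__VA_ARGS__)
#define DROP_FIRST_I(x, ...) __VA_ARGS__

static const int direct_result[]   = { DROP_FIRST(PAIR(), 30) };    /* { 30 }     */
static const int indirect_result[] = { DROP_FIRST_X(PAIR(), 30) };  /* { 20, 30 } */

enum {
    DIRECT_COUNT   = sizeof direct_result / sizeof direct_result[0],
    INDIRECT_COUNT = sizeof indirect_result / sizeof indirect_result[0]
};
```

The direct form removes PAIR() as a single argument and leaves one element, while the indirect form removes only the 10 and leaves two, which is the comma-separated behavior the question was expecting.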



Source: https://stackoverflow.com/questions/57123398/c-preprocessor-macro-doesnt-parse-comma-separated-tokens
