Is (x++, y) + (y++, x) undefined or unspecified, and if unspecified, what can it compute?

Submitted by 三世轮回 on 2019-12-13 02:35:19

Question


The comma sequence operator introduces a sequence point in an expression. I am wondering whether this means that the program below avoids undefined behavior.

int x, y;

int main()
{
  return (x++, y) + (y++, x);
}

If it does avoid undefined behavior, it could still be unspecified, that is, return one of several possible values. I would think that in C99, it can only compute 1, but actually, various versions of GCC compile this program into an executable that returns 2. Clang generates an executable that returns 1, apparently agreeing with my intuition.

Lastly, is this something that changed in C11?


Answer 1:


Take the expression:

(x++, y) + (y++, x)

Evaluate left-to-right:

x++  // yield 0, schedule increment of x
,    // sequence point: x definitely incremented now
y    // yield 0
y++  // yield 0, schedule increment of y
// explode because you just read from y and wrote to y
// with no intervening sequence point

Nothing in the standard forbids this order of evaluation, so the whole expression has undefined behavior.

Contrast this pseudocode:

f() { return x++, y; }
g() { return y++, x; }
f() + g()

According to C99 (5.1.2.3/2), the calls to f and g themselves count as side effects, and the function call operator contains a sequence point just before it enters the function. This means the executions of the two functions can't interleave.

Under the "evaluate things in parallel" model:

f()  // arbitrarily start with f: sequence point; enter f
g()  // at the same time, start calling g: sequence point

Since the execution of f counts as a side effect itself, the sequence point in g() suspends execution until f has returned. Thus, no undefined behavior.
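For reference, here is a compilable version of the f/g contrast, as a minimal sketch. The wrapper name sum_via_calls is my own and not part of the original answer. Working through both possible call orders by hand gives the same sum: if f runs first it returns 0 and g then returns 1, and if g runs first it returns 0 and f then returns 1.

```c
int x, y;   /* file-scope objects, zero-initialized as in the question */

/* Each function body is a full expression of its own, so the increment
   is complete before the other variable is read. */
int f(void) { return x++, y; }
int g(void) { return y++, x; }

/* Hypothetical wrapper: resets the globals, then adds the two calls.
   Which call runs first is unspecified, but the calls cannot interleave. */
int sum_via_calls(void)
{
    x = 0;
    y = 0;
    return f() + g();
}
```

Calling sum_via_calls() from main should therefore give 1 on any conforming compiler, with x and y both left at 1.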




Answer 2:


All of chapter 6.5 of the standard discusses the order of evaluation, on a per-operator basis. The best summary of the order of evaluation I could find is in the (non-normative) Annex J of the standard:

C11 Annex J

J.1 Unspecified behavior

  • The order in which subexpressions are evaluated and the order in which side effects take place, except as specified for the function-call (), &&, ||, ?:, and comma operators (6.5)

In your example, you cannot know whether the sub-expression (x++, y) or (y++, x) is evaluated first, since the order of evaluation of the operands of the + operator is unspecified.
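To see the unspecified operand order of + without running into undefined behavior, one can put each operand into a function that records when it was called. This is only an illustrative sketch: the names trace, n_calls, left_operand, right_operand and sum_and_trace are my own.

```c
char trace[3];   /* records the call order as "LR" or "RL" */
int n_calls;     /* number of operand calls recorded so far */

int left_operand(void)  { trace[n_calls++] = 'L'; return 1; }
int right_operand(void) { trace[n_calls++] = 'R'; return 2; }

/* The sum is always 3, but the compiler may call either operand first. */
int sum_and_trace(void)
{
    n_calls = 0;
    int sum = left_operand() + right_operand();
    trace[n_calls] = '\0';
    return sum;
}
```

After a call to sum_and_trace(), trace holds either "LR" or "RL". Both are conforming results, and a compiler is not required to document which one it picks.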

As for undefined behavior, the comma operator solves nothing. If (x++, y) is evaluated first, then y may be evaluated immediately before the y++ of the other sub-expression. Since y is then read and modified without a sequence point in between, and the read is for a purpose other than determining the value to be stored, the behavior is undefined.

So your program has both undefined and unspecified behavior.

(In addition, it has implementation-defined behavior, since int main(), unlike int main(void), is not one of the well-defined forms of main in a hosted environment.)




Answer 3:


My understanding is that any of the following evaluation orders are legal:

  • x++, y++, y, x
  • x++, y, y++, x
  • x++, y++, x, y
  • y++, x, x++, y
  • ...

But not:

  • y, x++, x, y++
  • ...

In my understanding, the only ordering that is required is that (x++) must come before (y), and (y++) must come before (x).

Therefore, I believe that this is still undefined behavior.
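As a rough model of these orderings, the sketch below runs each of the six interleavings that keep x++ before the read of y and y++ before the read of x, with every step forced into its own statement. It only illustrates which sums such orderings would produce; since the real expression is undefined, a compiler is not bound to any of them. The names step, legal_orders, simulate and print_all_sums are my own.

```c
#include <stdio.h>

/* The four primitive steps hidden inside (x++, y) + (y++, x). */
enum step { INC_X, READ_Y, INC_Y, READ_X };

/* All six interleavings with INC_X before READ_Y and INC_Y before READ_X. */
const enum step legal_orders[6][4] = {
    { INC_X, READ_Y, INC_Y, READ_X },
    { INC_X, INC_Y, READ_Y, READ_X },
    { INC_X, INC_Y, READ_X, READ_Y },
    { INC_Y, INC_X, READ_Y, READ_X },
    { INC_Y, INC_X, READ_X, READ_Y },
    { INC_Y, READ_X, INC_X, READ_Y },
};

/* Run one ordering with every step fully sequenced; return the sum. */
int simulate(const enum step order[4])
{
    int x = 0, y = 0, left = 0, right = 0;
    for (int i = 0; i < 4; i++) {
        switch (order[i]) {
        case INC_X:  x++;       break;
        case READ_Y: left = y;  break;   /* value of (x++, y) */
        case INC_Y:  y++;       break;
        case READ_X: right = x; break;   /* value of (y++, x) */
        }
    }
    return left + right;
}

/* Print the sum produced by each modeled ordering. */
void print_all_sums(void)
{
    for (int i = 0; i < 6; i++)
        printf("order %d -> %d\n", i, simulate(legal_orders[i]));
}
```

In this model the first and last orderings, where one operand of + finishes before the other starts, give 1, and the four interleaved ones give 2. That matches the two results reported in the question for Clang and GCC.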




Answer 4:


The standard explicitly states which operators introduce sequence points, and + is not among them. So your expression may well be evaluated in such a way that the forbidden subexpressions are evaluated "close" to each other, without a sequence point between them.

On a practical level, there is a reason for this prohibition: the evaluation of the operands of such an operation may be scheduled in parallel, e.g. if the processor has several pipelines available for the subexpressions in question. You can easily convince yourself that under such a model of evaluation anything can happen; the result is not predictable.

In your toy example the conflict is predictable, but this wouldn't necessarily be the case if the subexpressions used pointer indirection. Then you wouldn't know whether one of the subexpressions potentially aliases the other. So there is basically not much that C can do if it wants to allow this type of optimization.
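To make the aliasing point concrete, here is a small sketch with a function name of my own, sum_sequenced. Written as the single expression (*p)++ + *q, the body would be fine for distinct objects but undefined whenever p and q point to the same int, and the compiler generally cannot tell which case it is in. Splitting the work into separate statements puts a sequence point between the increment and the read, so both cases are well defined.

```c
/* Hypothetical helper: post-increments *p, then reads *q, with a
   sequence point between the two because each is its own statement. */
int sum_sequenced(int *p, int *q)
{
    int left = (*p)++;   /* old value of *p; increment complete at the ';' */
    int right = *q;      /* sees the incremented value if q aliases p */
    return left + right;
}
```

With an int a set to 0, sum_sequenced(&a, &a) returns 1 and leaves a at 1. With two distinct zero-valued ints, it returns 0.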



Source: https://stackoverflow.com/questions/13935904/is-x-y-y-x-undefined-or-unspecified-and-if-unspecified-what-can-it
