a = a++;
is undefined behaviour in C. The question I am asking is : why?
I mean, I get that it might be hard to provide a
With a few exceptions, the order in which expressions are evaluated is unspecified; this was a deliberate design decision, and it allows implementations to rearrange the evaluation order from what's written if that will result in more efficient machine code. Similarly, the order in which the side effects of ++
and --
are applied is unspecified beyond the requirement that it happen before the next sequence point, again to give implementations the freedom to arrange operations in an optimal manner.
Unfortunately, this means that the result of an expression like a = a++
will vary based on the compiler, compiler settings, surrounding code, etc. The behavior is specifically called out as undefined in the language standard so that compiler implementors don't have to worry about detecting such cases and issuing a diagnostic against them. Cases like a = a++
are obvious, but what about something like
void foo(int *a, int *b)
{
*a = (*b)++;
}
If that's the only function in the file (or if its caller is in a different file), there's no way to know at compile time whether a
and b
point to the same object; what do you do?
Note that it's entirely possible to mandate that all expressions be evaluated in a specific order, and that all side effects be applied at a specific point in evaluation; that's what Java and C# do, and in those languages expressions like a = a++
are always well-defined.
It's undefined because there is no good reason for writing code like that, and by not requiring any specific behaviour for bogus code, compilers can more aggressively optimize well-written code. For example, *p = i++
may be optimized in a way that causes a crash if p
happens to point to i
, possibly because two cores write to the same memory location at the same time. The fact that this also happens to be undefined in the specific case that *p
is explicitly written out as i
, to get i = i++
, logically follows.