Introduction: This question is part of my collection of C and C++ (and C/C++ common subset) questions regarding the cases where pointers object with strictly ide
You have proven that it seems to work on a specific implementation. That doesn't mean that it works in general. In fact, it is undefined behavior where one possible outcome is exactly "seems to work".
If we go back to the MS-DOS era, we had near pointers (relative to a specific segment) and far pointers (containing both a segment and an offset).
Large arrays were often allocated in their own segment and only the offset was used as a pointer. The compiler already knew what segment contained a specific array, so it could combine the pointer with the proper segment register.
In that case, you could have two pointers with the same bit-pattern, where one pointer pointed into an array segment (pa) and another pointer pointed into the stack segment (pb). The pointers compared equal, but still pointed to different things.
To make it worse, far pointers with a segment:offset pair could be formed with overlapping segments so that different bit-patterns still pointed to the same physical memory address. For example 0100:0210 is the same address as 0120:0010.
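The overlap is easy to check by hand, since a real-mode 8086 forms the linear address as segment * 16 + offset. A minimal sketch of that arithmetic (the function name linear_address is made up for illustration, and wrap-around at the top of the address space is ignored):

```c
#include <stdint.h>

/* Real-mode 8086 address translation: linear = segment * 16 + offset.
   The name linear_address is hypothetical, chosen for this sketch. */
static uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

Both linear_address(0x0100, 0x0210) and linear_address(0x0120, 0x0010) give 0x1210, so two far pointers with different bit-patterns name the same byte.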
The C and C++ languages are designed so that this can work. That's why we have rules that comparing pointers only works (gives a total order) within the same array, and that pointers might not point to the same thing, even if they contain the same bit-pattern.
I say no, without resorting to the UB tarpit. From the following code:
extern int f(int x[3], int y[4]);
....
int a[7];
return f(a, a) + f(a+4, a+3);
...
The C standard should not prevent me from writing a compiler which performs bounds checking; there are several available. A bounds checking compiler would have to fatten the pointers by augmenting them with bounds information (*). So when we get to f():
....
if (x == y) {
....
f() would be interested in the C notion of equality, that is, whether they point at the same location, not whether they have identical types. If you aren't happy with this, suppose f() called g(int *s, int *t), and it contained a similar test. The compiler would perform the comparison without comparing the fat.
The pointer size, sizeof(int *), would have to include the fat, so memcmp of two pointers would compare it as well, and could therefore give a different result from the == comparison.
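To make the difference concrete, here is a rough sketch of what such a fat pointer could look like. Everything in it (struct fat_ptr, fat_make, fat_equal, fat_same_bits) is invented for illustration; a real bounds-checking compiler would hide all of this behind an ordinary int *, and might lay it out quite differently.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical fat pointer: an address plus the bounds it may access. */
struct fat_ptr {
    int *addr;   /* where the pointer points */
    int *base;   /* first element it may access */
    int *limit;  /* one past the last element it may access */
};

/* Three members of the same type: make sure there is no padding,
   so that memcmp below only ever looks at meaningful bytes. */
_Static_assert(sizeof(struct fat_ptr) == 3 * sizeof(int *),
               "unexpected padding in struct fat_ptr");

/* Build a fat pointer to p that may access n elements starting at p. */
static struct fat_ptr fat_make(int *p, size_t n)
{
    struct fat_ptr f;
    f.addr = p;
    f.base = p;
    f.limit = p + n;
    return f;
}

/* What == would do: compare locations only, ignoring the fat. */
static bool fat_equal(struct fat_ptr x, struct fat_ptr y)
{
    return x.addr == y.addr;
}

/* What memcmp would do: compare the whole representation, fat included. */
static bool fat_same_bits(struct fat_ptr x, struct fat_ptr y)
{
    return memcmp(&x, &y, sizeof x) == 0;
}
```

With int a[7], the values fat_make(a, 3) and fat_make(a, 4) model the two arguments of f(a, a): fat_equal says they are equal, while fat_same_bits says they differ because the limits differ.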
PS: should we introduce a new tag for navel gazing?
*pa1 = 2; // does pa1 legally point to b?
No, that pa1 points to b is purely coincidental. Note that a program must conform at compile time; the fact that the pointer happens to have the same value at run time doesn't matter.
Nobody can tell the difference, no?
The compiler optimizer can tell the difference!
The compiler optimizer can see (through static analysis of the code) that b is never accessed through a "legal" pointer, so it assumes it is safe to keep b in a register. This decision is made at compile time.
Bottom line:
"Legal" pointers are pointers obtained from a legal pointer by assignment or by copying the memory. You can also obtain a "legal" pointer using pointer arithmetic, provided the resulting pointer is within the legal range of the array/memory block it was assigned/copied from. If the result of pointer arithmetic happens to point to a valid address in another memory block, the use of such a pointer is still UB.
Also note that relational pointer comparison (<, <=, >, >=) is valid only if the two pointers point into the same array/memory block.
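As a rough illustration of the rules above, the following sketch only uses pointers that were obtained "legally". The function name legal_pointer_demo and the values are made up for the example.

```c
#include <string.h>

/* Reads only through pointers that were derived legally.
   The name legal_pointer_demo is hypothetical. */
static int legal_pointer_demo(void)
{
    int a[3] = {10, 20, 30};
    int b = 5;

    int *p = &a[0];             /* address-of: legal */
    int *q = p;                 /* assignment from a legal pointer: legal */
    int *r;
    memcpy(&r, &q, sizeof r);   /* copying the memory of a legal pointer: legal */

    int *last = r + 2;          /* arithmetic inside a: legal, points to a[2] */
    int *end = a + 3;           /* one past the end: may be formed and compared,
                                   but must not be dereferenced */
    int *pb = &b;               /* legal pointer to b */

    int count = 0;
    for (int *it = a; it != end; ++it)   /* comparison within the same array */
        ++count;

    /* Writing through end would be UB, even if end happened to hold
       the same address as pb on some implementation. */
    return *last + *pb + count;          /* 30 + 5 + 3 */
}
```

The function returns 38. The only way b is read here is through pb, which was obtained from &b.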
EDIT:
Where did it go wrong?
The standard states that accessing an array out of bounds results in undefined behaviour. You took a pointer to one past the end of the array, copied it and then dereferenced it.
The standard states that a pointer one past the end of an array may compare equal to a pointer to another object that happens to be placed adjacent in memory (6.5.9 pt 6). However, even though they compare equal, semantically they don't point to the same object.
In your case, you don't compare the pointers, you compare their bit patterns. Doesn't matter. The pointer pa1 is still considered to be a pointer to one past the end of an array.
Note that if you replace memcpy with some function you write yourself, the compiler won't know what value pa1 has but it can still statically determine that it cannot contain a "legally" obtained copy of &b.
Thus, the compiler optimizer is allowed to optimize the read/store of b in this case.
is a pointer's semantic "value" (its behavior according to the specification) determined only by its numerical value (the numerical address it contains), for a pointer of a given type?
No. The standard implies that valid pointers can only be obtained from objects using the address-of operator (&), by copying another valid pointer, or by incrementing or decrementing a pointer inside the bounds of an array. As a special case, pointers one past the end of an array are valid but they must not be dereferenced. This might seem a bit strict, but without it the possibility to optimize would be limited.
if not, is it possible to copy only the physical address contained in a pointer while leaving out the associated semantic?
No, at least not in a way that is portable to any platform. In many implementations the pointer value is just the address. The semantics is in the generated code.