A commonly used macro in the Linux kernel (and other places) is container_of, which is (basically) defined as follows:
#define container_of(ptr, type, member) ((type *)((char *)(ptr) - offsetof(type, member)))
This allows recovery of a "parent" structure given a pointer to one of its members:
struct foo {
char ch;
int bar;
};
...
struct foo f = ...
int *ptr = &f.bar; // 'ptr' points to the 'bar' member of 'struct foo' inside 'f'
struct foo *g = container_of(ptr, struct foo, bar);
// now, 'g' should point to 'f', i.e. 'g == &f'
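For reference, here is a minimal compilable sketch of the example above. It uses a form of the macro that passes type and member to the cast and to offsetof without extra parentheses, and the helper name recovers_parent is made up for illustration. It only shows what mainstream compilers do in practice; it says nothing about whether the standard guarantees it, which is the actual question.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof */

/* Illustrative form of the macro; 'type' and 'member' are passed through as-is. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct foo {
    char ch;
    int bar;
};

/* Hypothetical helper: returns 1 if container_of recovers &f from &f.bar. */
int recovers_parent(void)
{
    struct foo f = { 'x', 42 };
    int *ptr = &f.bar;                                   /* points at the member */
    struct foo *g = container_of(ptr, struct foo, bar);  /* step back to the parent */

    /* Check both the address and that the members read back correctly. */
    return g == &f && g->ch == 'x' && g->bar == 42;
}
```

Calling recovers_parent() from main and asserting that it returns 1 is enough to try this out.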
However, it's not entirely clear whether the subtraction contained within container_of is considered undefined behavior.
On one hand, because bar inside struct foo is only a single integer, only *ptr may be dereferenced (and ptr + 1 may be formed as a one-past-the-end pointer). Thus, container_of effectively produces an expression like (char *)ptr - sizeof(int), i.e. something equivalent to ptr - 1, which is undefined behavior (even without dereferencing).
On the other hand, §6.3.2.3 p.7 of the C standard states that converting a pointer to a different object type and back again shall produce a pointer that compares equal to the original (provided the intermediate pointer is correctly aligned). Therefore, "moving" a pointer to the middle of a struct foo object, then back to the beginning, should produce the original pointer.
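The round-trip guarantee on its own can be sketched like this. The helper name round_trip_preserves_pointer is hypothetical, and the sketch only covers the conversion there and back, not the subtraction that container_of performs in between.

```c
#include <assert.h>

struct foo {
    char ch;
    int bar;
};

/* Hypothetical helper: returns 1 if int* -> char* -> int* yields an equal pointer. */
int round_trip_preserves_pointer(void)
{
    struct foo f = { 'x', 42 };
    int *ptr = &f.bar;
    char *as_char = (char *)ptr;  /* char has the weakest alignment, so this is always allowed */
    int *back = (int *)as_char;   /* converted back to the original type */

    /* 6.3.2.3 p.7 requires 'back' to compare equal to 'ptr'. */
    return back == ptr && *back == 42;
}
```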
The main concern is the fact that implementations are allowed to check for out-of-bounds indexing at runtime. My interpretation of this and the aforementioned pointer equivalence requirement is that the bounds must be preserved across pointer casts (this includes pointer decay; otherwise, how could you use a pointer to iterate across an array?). Ergo, while ptr may only be an int pointer, and neither ptr - 1 nor *(ptr + 1) is valid, ptr should still have some notion of being in the middle of a structure, so that (char *)ptr - offsetof(struct foo, bar) is valid (even if the pointer is equal to ptr - 1 in practice).
Finally, I came across the fact that if you have something like:
int arr[5][5] = ...
int *p = &arr[0][0] + 5;
int *q = &arr[1][0];
while it's undefined behavior to dereference p, the pointer by itself is valid, and required to compare equal to q (see this question). This means that p and q compare equal, but can be different in some implementation-defined manner (such that only q can be dereferenced). This could mean that given the following:
// assume same 'struct foo' and 'f' declarations
char *p = (char *)&f.bar;
char *q = (char *)&f + offsetof(struct foo, bar);
p and q compare equal, but could have different boundaries associated with them, as the casts to (char *) come from pointers to incompatible types.
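As a sketch of the part that is not in dispute, the two pointers can at least be compared. The helper name member_pointers_compare_equal is invented for this example, and the code deliberately does no arithmetic on p or q beyond what the snippet above already does, since whether their bounds differ is exactly what is unclear.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof */

struct foo {
    char ch;
    int bar;
};

/* Hypothetical helper: returns 1 if both routes to the 'bar' member compare equal. */
int member_pointers_compare_equal(void)
{
    struct foo f = { 'x', 42 };
    char *p = (char *)&f.bar;                          /* derived from the member */
    char *q = (char *)&f + offsetof(struct foo, bar);  /* derived from the parent */

    return p == q;
}
```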
To sum it all up, the C standard isn't entirely clear about this type of behavior, and attempting to apply other parts of the standard (or, at least my interpretations of them) leads to conflicts. So, is it possible to define container_of in a strictly-conforming manner? If so, is the above definition correct?
This was discussed here after comments on my answer to this question.
I think it's strictly conforming or there's a big defect in the standard. Referring to your last example, the section on pointer arithmetic doesn't give the compiler any leeway to treat p and q differently. It isn't conditional on how the pointer value was obtained, only on what object it points to.
Any interpretation that p and q could be treated differently in pointer arithmetic would require an interpretation that p and q do not point to the same object. Since there's no implementation-dependent behaviour in how you obtained p and q, that would mean they don't point to the same object on any implementation. That would in turn require that p == q be false on all implementations, and so would make all actual implementations non-conforming.
I just want to answer this bit.
int arr[5][5] = ...
int *p = &arr[0][0] + 5;
int *q = &arr[1][0];
This is not UB. It is certain that p is a pointer to an element of the array, provided only that it is within bounds. In each case it points to the 6th element of a 25 element array, and can safely be dereferenced. It can also be incremented or decremented to access other elements of the array.
See n3797 S8.3.4 for C++. The wording is different for C, but the meaning is the same. In effect arrays have a standard layout and are well-behaved with respect to pointers.
Let us suppose for a moment that this is not so. What are the implications? We know that the layout of an array int[5][5] is identical to int[25], there can be no padding, alignment or other extraneous information. We also know that once p and q have been formed and given a value, they must be identical in every respect.
The only possibility is that, if the standard says it is UB and the compiler writer implements the standard, then a sufficiently vigilant compiler might either (a) issue a diagnostic based on analysing the data values or (b) apply an optimisation which was dependent on not straying outside the bounds of sub-arrays.
Somewhat reluctantly I have to admit that (b) is at least a possibility. I am led to the rather strange observation that if you can conceal from the compiler your true intentions this code is guaranteed to produce defined behaviour, but if you do it out in the open it may not.
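The layout claim and the pointer comparison can be sketched as follows. The helper names are made up for illustration, and the sketch deliberately stops short of dereferencing p, since whether that is defined is the point under dispute.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if the one-past-the-end pointer of arr[0]
   compares equal to the pointer to the first element of arr[1]. */
int subarray_boundary_pointers_equal(void)
{
    int arr[5][5] = { { 0 } };
    int *p = &arr[0][0] + 5;  /* one past the end of arr[0] */
    int *q = &arr[1][0];      /* first element of arr[1] */

    return p == q;            /* compared only, never dereferenced */
}

/* Hypothetical helper: returns 1 if int[5][5] occupies the same storage size as int[25]. */
int layout_matches_flat_array(void)
{
    return sizeof(int[5][5]) == sizeof(int[25]);
}
```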
Source: https://stackoverflow.com/questions/25296019/can-a-container-of-macro-ever-be-strictly-conforming