As part of answering another question, I came across a piece of code like this, which gcc compiles without complaint.
typedef struct {
struct xyz *z;
} x
You do have to name it. In this:
typedef struct {
struct xyz *z;
} xyz;
will not be able to point to itself as z
refers to some complete other type, not to the unnamed struct you just defined. Try this:
int main()
{
xyz me1;
xyz me2;
me1.z = &me2; // this will not compile
}
You'll get an error about incompatible types.
As the warning says in the second case, struct NOTHING_LIKE_xyz
is an incomplete type, like void
or arrays of unknown size. An incomplete type can only appear in a struct as a type pointed to (C17 6.7.2.1:3), with an exception for arrays of unknown size that are allowed as the last member of a struct, making the struct itself an incomplete type in this case. The code that follows cannot dereference any pointer to an incomplete type (for good reason).
Incomplete types can offer some datatype encapsulation of sorts in C... The corresponding paragraph in http://www.ibm.com/developerworks/library/pa-ctypes1/ seems like a good explanation.
The parts of the C99 standard you are after are 6.7.2.3, paragraph 7:
If a type specifier of the form
struct-or-union identifier
occurs other than as part of one of the above forms, and no other declaration of the identifier as a tag is visible, then it declares an incomplete structure or union type, and declares the identifier as the tag of that type.
...and 6.2.5 paragraph 22:
A structure or union type of unknown content (as described in 6.7.2.3) is an incomplete type. It is completed, for all declarations of that type, by declaring the same structure or union tag with its defining content later in the same scope.
When a variable or field of a structure type is declared, the compiler has to allocate enough bytes to hold that structure. Since the structure may require one byte, or it may require thousands, there's no way for the compiler to know how much space it needs to allocate. Some languages use multi-pass compilers which would be able find out the size of the structure on one pass and allocate the space for it on a later pass; since C was designed to allow for single-pass compilation, however, that isn't possible. Thus, C forbids the declaration of variables or fields of incomplete structure types.
On the other hand, when a variable or field of a pointer-to-structure type is declared, the compiler has to allocate enough bytes to hold a pointer to the structure. Regardless of whether the structure takes one byte or a million, the pointer will always require the same amount of space. Effectively, the compiler can tread the pointer to the incomplete type as a void* until it gets more information about its type, and then treat it as a pointer to the appropriate type once it finds out more about it. The incomplete-type pointer isn't quite analogous to void*, in that one can do things with void* that one can't do with incomplete types (e.g. if p1 is a pointer to struct s1, and p2 is a pointer to struct s2, one cannot assign p1 to p2) but one can't do anything with a pointer to an incomplete type that one could not do to void*. Basically, from the compiler's perspective, a pointer to an incomplete type is a pointer-sized blob of bytes. It can be copied to or from other similar pointer-sized blobs of bytes, but that's it. the compiler can generate code to do that without having to know what anything else is going to do with the pointer-sized blobs of bytes.
The 1st and 2nd cases are well-defined, because the size and alignment of a pointer is known. The C compiler only needs the size and alignment info to define a struct.
The 3rd case is invalid because the size of that actual struct is unknown.
But beware that for the 1st case to be logical, you need to give a name to the struct:
// vvv
typedef struct xyz {
struct xyz *z;
} xyz;
otherwise the outer struct and the *z
will be considered two different structs.
The 2nd case has a popular use case known as "opaque pointer" (pimpl). For example, you could define a wrapper struct as
typedef struct {
struct X_impl* impl;
} X;
// usually just: typedef struct X_impl* X;
int baz(X x);
in the header, and then in one of the .c
,
#include "header.h"
struct X_impl {
int foo;
int bar[123];
...
};
int baz(X x) {
return x.impl->foo;
}
the advantage is out of that .c
, you cannot mess with the internals of the object. It is a kind of encapsulation.
I was wondering about this too. Turns out that the struct NOTHING_LIKE_xyz * z
is forward declaring struct NOTHING_LIKE_xyz
. As a convoluted example,
typedef struct {
struct foo * bar;
int j;
} foo;
struct foo {
int i;
};
void foobar(foo * f)
{
f->bar->i;
f->bar->j;
}
Here f->bar
refers to the type struct foo
, not typedef struct { ... } foo
. The first line will compile fine, but the second will give an error. Not much use for a linked list implementation then.