Question
I don't understand how a machine can have more than one pointer representation. The following from GNU says:
if the target machine has two different pointer representations, the compiler won't know which representation to use for that argument.
How is that possible? How does that statement relate to #define SEM_FAILED ((sem_t*)-1)? What does the latter do? I know it is a null pointer, which has a constant value. But how is it represented in memory, since it points to -1? And what if it pointed to a valid location?
Answer 1:
Some of the very first architectures that C targeted had 36-bit or 18-bit words (the type int). Only whole words were directly addressable, at addresses like 0, 1, 2, using the native pointers. However, one word per character would have wasted too much memory, so a 9-bit char type was added, with 2 or 4 characters packed into one word. Since these could not be addressed by a word pointer, char * was made from two words: one pointing to the word, and another telling which of the bytes within the word should be manipulated.
Of course now the problem is that char * is two words wide, whereas int * is just one, and this matters when calling a function without a prototype or with an ellipsis - while (void*)0 would have a representation compatible with (char *)0, it wouldn't be compatible with (int *)0, hence an explicit cast is required.
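To make the two-word char * above a bit more concrete, here is a rough model of it in portable C. This is only a sketch under my own assumptions: the names word36, word_ptr, byte_ptr, load_byte and next_byte are made up, bytes are assumed to be packed with the first one in the most significant bits, and no real machine's pointer format is being reproduced.

```c
#include <assert.h>
#include <stdint.h>

#define BITS_PER_BYTE  9u
#define BYTES_PER_WORD 4u

/* Hypothetical model of a 36-bit word; only the low 36 bits are used. */
typedef uint64_t word36;

/* A "native" pointer: just the index of a word. */
struct word_ptr { uint32_t word; };

/* A character pointer: the word index plus the byte position inside it. */
struct byte_ptr { uint32_t word; uint32_t byte; };

/* Fetch the 9-bit byte that p designates (byte 0 is the most significant). */
unsigned load_byte(const word36 *memory, struct byte_ptr p)
{
    assert(p.byte < BYTES_PER_WORD);
    unsigned shift = (BYTES_PER_WORD - 1u - p.byte) * BITS_PER_BYTE;
    return (unsigned)((memory[p.word] >> shift) & 0x1FFu);
}

/* The equivalent of ++ on a char *: step to the next byte, carrying into
   the word index when the end of the current word is reached. */
struct byte_ptr next_byte(struct byte_ptr p)
{
    if (++p.byte == BYTES_PER_WORD) {
        p.byte = 0;
        p.word++;
    }
    return p;
}
```

In this model struct byte_ptr is larger than struct word_ptr, which is exactly the kind of size mismatch that makes an uncast argument to a variadic function go wrong.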
There is another problem with NULL. While GCC seems to ensure that NULL is of type void *, the C standard does not guarantee that, so even using NULL in a call to a function like execl, which expects char * values as its variable arguments, is wrong without a cast, because an implementation can define
#define NULL 0
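Here is a small sketch of why the cast matters. count_strings is a made-up function (not part of any library) that uses the same convention as execl: a list of char * arguments ended by a null pointer. It reads every variadic argument back as a char *, so that is the type the caller has to pass.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: counts char * arguments up to a null-pointer
   sentinel, the same calling convention execl uses. */
int count_strings(const char *first, ...)
{
    va_list ap;
    int n = 0;
    const char *s = first;

    va_start(ap, first);
    while (s != NULL) {
        n++;
        /* Each variadic argument is fetched as a char *, so the caller
           must really have passed a char * here. */
        s = va_arg(ap, char *);
    }
    va_end(ap);
    return n;
}

/* Usage sketch: the sentinel is cast so that a char * is what gets passed. */
void count_strings_demo(void)
{
    assert(count_strings("ls", "-l", (char *)NULL) == 2);
}
```

Writing a bare NULL or 0 as the last argument may happen to work on common platforms, but with #define NULL 0 an int would be passed where va_arg expects a char *, which is not something the C standard lets you rely on.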
(sem_t*)-1 is not a null pointer; it is the integer -1 converted to a pointer, with implementation-defined results. On POSIX systems it will (by necessity) result in an address that can never be the location of any sem_t.
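It is actually a really bad convention to use -1 here, since the resulting address most likely isn't correctly aligned for sem_t. The C standard only says that the result of such a conversion is implementation-defined and might be misaligned or a trap representation, so the construct is not portable C in itself.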
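The way such a sentinel is meant to be used can be sketched like this. Everything here is hypothetical and only modelled on the SEM_FAILED macro from the question: struct item, ITEM_FAILED, find_item and item_value_or are my own names, and the code assumes an implementation where converting -1 to a pointer yields an address that no real object has, as is the case on mainstream platforms.

```c
#include <assert.h>
#include <stddef.h>

struct item { int value; };

/* Hypothetical sentinel modelled on SEM_FAILED: the integer -1 converted
   to a pointer. The result is implementation-defined. */
#define ITEM_FAILED ((struct item *)-1)

static struct item table[2] = { { 10 }, { 20 } };

/* Returns a pointer to a real object, or the sentinel on failure. */
struct item *find_item(size_t index)
{
    if (index >= sizeof table / sizeof table[0])
        return ITEM_FAILED;
    return &table[index];
}

/* The sentinel is only ever compared against, never dereferenced. */
int item_value_or(size_t index, int fallback)
{
    struct item *p = find_item(index);
    if (p == ITEM_FAILED)
        return fallback;
    assert(p != NULL);
    return p->value;
}
```

So the sentinel never has to point at anything meaningful: it just has to compare unequal to every pointer the function can legitimately return.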
Answer 2:
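I believe this is alluding to the "near" and "far" pointers found on some 16-bit architectures. From what I understand, a near pointer was just a 16-bit offset within the current segment, while a far pointer also carried a segment value, which was how those machines worked around being stuck with just 64 KB of address space per segment.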
Source: https://stackoverflow.com/questions/55550918/two-pointer-representation-dilemma