Is it always safe to convert an integer value to void* and back again in POSIX?

Submitted by 僤鯓⒐⒋嵵緔 on 2019-11-28 09:40:55

As you say, C99 doesn't guarantee that any integer type may be converted to void* and back again without loss of information. It does make a similar guarantee for intptr_t and uintptr_t defined in <stdint.h>, but those types are optional. (The guarantee is that a void* may be converted to {u,}intptr_t and back without loss of information; there's no such guarantee for arbitrary integer values.)
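To make the guaranteed direction concrete, here is a minimal sketch of the void* -> uintptr_t -> void* round trip. It assumes the implementation provides uintptr_t (the type is optional in C99), and the function name pointer_round_trips is made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p survives the void* -> uintptr_t -> void* round trip.
   This is the only direction C99 guarantees, and only when uintptr_t
   exists on the implementation. */
int pointer_round_trips(void *p)
{
    uintptr_t bits = (uintptr_t)p;  /* pointer to integer */
    void *back = (void *)bits;      /* integer back to pointer */
    return back == p;               /* must compare equal for a valid p */
}
```

Note that nothing here says an arbitrary uintptr_t value may be converted to void* and back; only values that came from a valid pointer are covered.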

POSIX doesn't appear to make any such guarantee either.

The POSIX description of <limits.h> requires int and unsigned int to be at least 32 bits. This exceeds the C99 requirement that they be at least 16 bits. (Actually, the requirements are in terms of ranges, not sizes, but the effect is that int and unsigned int must be at least 32 (under POSIX) or 16 (under C99) bits, since C99 requires a binary representation.)

The POSIX description of <stdint.h> says that intptr_t and uintptr_t must be at least 16 bits, the same requirement imposed by the C standard. Since void* can be converted to intptr_t and back again without loss of information, this implies that void* may be as small as 16 bits. Combine that with the POSIX requirement that int is at least 32 bits (and the POSIX and C requirement that long is at least 32 bits), and it's possible that a void* just isn't big enough to hold an int or long value without loss of information.

The POSIX description of pthread_create() doesn't contradict this. It merely says that arg (the void* 4th argument to pthread_create()) is passed to start_routine(). Presumably the intent is that arg points to some data that start_routine() can use. POSIX has no examples showing the usage of arg.
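If arg is meant to point to data, the portable pattern is probably something like the sketch below. The struct job type and the function names are hypothetical, chosen only for this example; the caller has to keep the object alive until the thread is done with it, which is why the helper joins before returning.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical payload type for this illustration. */
struct job {
    int input;
    int result;
};

/* Thread entry point: arg points to a struct job owned by the caller. */
static void *square_job(void *arg)
{
    struct job *j = arg;
    j->result = j->input * j->input;
    return NULL;
}

/* Runs square_job in a new thread and waits for it to finish.
   Returns 0 on success, otherwise the error number from pthreads. */
int run_square_job(struct job *j)
{
    pthread_t tid;
    int err = pthread_create(&tid, NULL, square_job, j);
    if (err != 0)
        return err;
    return pthread_join(tid, NULL);
}
```

No integer is ever converted to a pointer here, so the question of whether an int fits in a void* never comes up.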

You can see the POSIX standard here; you have to create a free account to access it.

The focus in answers so far seems to be on the width of a pointer, and indeed as @Nico points out (and @Quantumboredom also points out in a comment), there is a possibility that intptr_t may be wider than a pointer. @Kevin's answer hints at the other important issue, but doesn't completely describe it.

Also, though I'm not sure of the exact paragraph in the standard, Harbison & Steele point out that intptr_t and uintptr_t are optional types too and may not even exist in a valid C99 implementation. OpenGroup says that XSI-conformant systems must support both types, which implies that plain POSIX does not require them (at least as of the 2003 edition).

The part that's really been missed here though is that pointers need not always have a simple numerical representation that matches the internal representation of an integer. This has always been so (since K&R 1978), and I'm pretty sure POSIX is careful not to overrule this possibility either.

So, C99 does require that it be possible to convert a pointer to an intptr_t, IFF that type exists, and then back to a pointer again such that the new pointer still points at the same object in memory as the old pointer. If pointers have a non-integer representation, this implies that an algorithm exists which can convert a specific set of integer values into valid pointers. However, it also means that not all integers between INTPTR_MIN and INTPTR_MAX are necessarily valid pointer values, even if the width of intptr_t (and/or uintptr_t) is exactly the same as the width of a pointer.

So, the standards cannot guarantee that any intptr_t or uintptr_t can be converted to a pointer and back to the same integer value, or even which set of integer values can survive such conversion, because they cannot possibly define all of the possible rules and algorithms for converting integer values into pointer values. Doing so even for all known architectures could still prevent the applicability of the standard to novel types of architectures yet to be invented.

intptr_t and uintptr_t are only guaranteed to be large enough to hold a pointer, but they may also be "larger", which is why the C99 standard only guarantees (void*)->(u)intptr_t->(void*). In the other direction, loss of data may occur (and is considered undefined).

Not sure what you mean by "always". It's not written anywhere in the standard that this is okay, but there are no systems it fails on.

If your integers are really small (say limited to 16bit) you can make it strictly conforming by declaring:

static const char dummy_base[65535];

and then passing (void *)(dummy_base + i) as the argument and recovering it as i = (const char *)start_arg - dummy_base;
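Spelled out as code, the trick might look like the sketch below. The helper names encode_small_int and decode_small_int are made up for this illustration, and the caller is assumed to keep i within 0..65535. The resulting pointer is only ever used for pointer arithmetic, never dereferenced, so the one-past-the-end value dummy_base + 65535 is fine too.

```c
#include <stddef.h>

/* Never read or written; it only supplies valid addresses.
   Valid offsets are 0..65535, the last being one past the end. */
static const char dummy_base[65535];

/* Encode a small integer (0..65535) as a pointer into dummy_base. */
void *encode_small_int(unsigned i)
{
    /* The cast drops const; the pointer is never dereferenced. */
    return (void *)(dummy_base + i);
}

/* Recover the integer by subtracting the base address. */
unsigned decode_small_int(void *arg)
{
    ptrdiff_t offset = (const char *)arg - dummy_base;
    return (unsigned)offset;
}
```

With pthread_create() you would pass encode_small_int(i) as arg and call decode_small_int(arg) inside start_routine().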

I think your answer is in the text you quoted:

If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

So, not necessarily. Say you had a 64-bit long and cast it to a void* on a 32-bit machine. The pointer is likely 32 bits, so either you lose the top 32 bits or get INT_MAX back. Or, potentially, something else entirely (undefined, as the standard says).
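If you still want to try the cast, you can at least rule out the truncation case first. The sketch below assumes uintptr_t exists, and the function name fits_in_uintptr is hypothetical. Passing this check only means the value is not too wide for uintptr_t; it does not make the integer -> void* -> integer round trip guaranteed by the standard.

```c
#include <stdint.h>

/* Returns 1 if v is representable in uintptr_t, so that converting it
   would not discard high-order bits. This rules out truncation only;
   it is not a guarantee that the value survives a trip through void*. */
int fits_in_uintptr(unsigned long long v)
{
    return v <= UINTPTR_MAX;
}
```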
