What does “representable” mean in C11?

Submitted by 天涯浪子 on 2019-11-28 01:11:59

If char is signed, this is undefined behavior; otherwise it is well defined, because CHAR_MIN would then have the value 0. It is easier to see the intention and meaning of:

the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF

if we read section 7.4 Character handling <ctype.h> from the Rationale for International Standard—Programming Languages—C which says (emphasis mine going forward):

Since these functions are often used primarily as macros, their domain is restricted to the small positive integers representable in an unsigned char, plus the value of EOF. EOF is traditionally -1, but may be any negative integer, and hence distinguishable from any valid character code. These macros may thus be efficiently implemented by using the argument as an index into a small array of attributes.

So valid values are:

  1. Non-negative integers representable in unsigned char (0 through UCHAR_MAX)
  2. EOF, which is some implementation-defined negative number
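
The array-index implementation the Rationale describes could look roughly like the sketch below. It is only an illustration: my_attr, MY_SPACE and my_isspace are hypothetical names, not taken from any real <ctype.h>, and the sketch assumes that EOF is -1 and that only the "C" locale whitespace characters matter.

```c
#include <limits.h>
#include <stdio.h>   /* EOF */

/* Hypothetical attribute bit; not from any real implementation. */
#define MY_SPACE 0x01

/* Slot 0 is for EOF (assumed to be -1); slots 1..UCHAR_MAX+1 hold
   the attributes of the unsigned char values 0..UCHAR_MAX. */
static const unsigned char my_attr[UCHAR_MAX + 2] = {
    [' '  + 1] = MY_SPACE,
    ['\t' + 1] = MY_SPACE,
    ['\n' + 1] = MY_SPACE,
    ['\v' + 1] = MY_SPACE,
    ['\f' + 1] = MY_SPACE,
    ['\r' + 1] = MY_SPACE,
};

/* Only valid for c == EOF or 0 <= c <= UCHAR_MAX; any other
   argument indexes outside the array. */
static int my_isspace(int c)
{
    return my_attr[c + 1] & MY_SPACE;
}
```

With an 8-bit signed char, my_isspace(CHAR_MIN) would read my_attr[-127], which lies outside the array. That is the kind of implementation the restricted domain is meant to permit.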

Even though this is the C99 rationale, the wording you are referring to did not change from C99 to C11, so the rationale still fits.

We can also see why the interface takes an int argument rather than a char in section 7.1.4 Use of library functions, which says:

All library prototypes are specified in terms of the “widened” types: an argument formerly declared as char is now written as int. This ensures that most library functions can be called with or without a prototype in scope, thus maintaining backwards compatibility with pre-C89 code. Note, however, that since functions like printf and scanf use variable-length argument lists, they must be called in the scope of a prototype.

What does representable in a type mean?

Re-formulated, a type is a convention for what the underlying bit-patterns mean. A value is thus representable in a type, if that type assigns some bit-pattern that meaning.
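
As a rough illustration of "a type is a convention for bit-patterns", the sketch below reads the same byte through two types. The helper names as_unsigned and as_signed are hypothetical, and the comments assume CHAR_BIT == 8 and a two's complement signed char, which the standard does not guarantee.

```c
#include <string.h>

/* Interpret the byte under the unsigned char convention. */
static int as_unsigned(unsigned char byte)
{
    return byte;
}

/* Interpret the very same bits under the signed char convention. */
static int as_signed(unsigned char byte)
{
    signed char s;
    memcpy(&s, &byte, 1);   /* copy the bit-pattern, not the value */
    return s;
}
```

Under those assumptions the pattern 0xFF means 255 as unsigned char and -1 as signed char. So 255 is representable in unsigned char but not in signed char, and -1 the other way round.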

A conversion (which might need a cast) is a mapping from a value represented in one type to a value, possibly a different one, represented in the target type.


Under the given assumption (that char is signed), CHAR_MIN is certainly negative, and the text you quoted leaves no room for interpretation:
Yes, it is undefined behavior, as unsigned char cannot represent any negative numbers.

If that assumption did not hold, your program would be well-defined, because CHAR_MIN would be 0, a valid value for unsigned char.

Thus, we have a case where it is implementation-defined whether the program is undefined or well-defined.
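
A common way to stay on the well-defined side regardless of the signedness of char is to convert to unsigned char before calling the classification function. The sketch below shows that idiom; count_spaces is a hypothetical helper written only for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts the whitespace characters in s. */
static size_t count_spaces(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        /* Cast first, so a negative plain char never reaches isspace(). */
        if (isspace((unsigned char)*s))
            ++n;
    }
    return n;
}
```

The cast maps every char value into the range 0 through UCHAR_MAX, so the argument is always in the domain the standard requires.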


As an aside, there is no guarantee that sizeof(int) > 1 or INT_MAX >= UCHAR_MAX, so int might not be able to represent all values of unsigned char.

Because int can represent every value of signed char, converting a signed char to int always preserves the value.
But if that value is negative, it still cannot be represented as an unsigned char. (Converting it to unsigned char is nonetheless well defined: conversion from any integer type to any unsigned integer type is always defined, and in C it happens implicitly, with no cast required.)

The revealing quote (for me) is §6.3.1.3/1:

if the value can be represented by the new type, it is unchanged.

i.e., if the value has to be changed then the value can't be represented by the new type.

Therefore an unsigned type can't represent a negative value.
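
The "unchanged means representable" reading can be checked with a small sketch. The helper name survives_as_unsigned_char is hypothetical, and the comparison assumes UCHAR_MAX <= INT_MAX, so that unsigned char promotes to int.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if converting v to unsigned char
   leaves the value unchanged, 0 if the conversion had to change it. */
static int survives_as_unsigned_char(int v)
{
    /* Well defined for any v: the result is v reduced modulo UCHAR_MAX + 1. */
    unsigned char u = (unsigned char)v;

    /* Assumes unsigned char promotes to int (UCHAR_MAX <= INT_MAX). */
    return u == v;
}
```

For -1 the conversion yields UCHAR_MAX, a different value, so -1 is not representable as unsigned char even though the conversion itself is defined.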

To answer the question in the title: "representable" refers to "can be represented" from §6.3.1.3 and is unrelated to "object representation" from §6.2.6.1.

It seems trivial in retrospect. I might have been confused by the habit of treating b'\xFF', 0xff, 255, -1 as the same byte in Python:

>>> (255).to_bytes(1, 'big')
b'\xff'
>>> int.from_bytes(b'\xFF', 'big')
255
>>> 255 == 0xff
True
>>> (-1).to_bytes(1, 'big', signed=True)
b'\xff'

and by the disbelief that it is undefined behavior to pass a character to a character classification function, e.g., isspace(CHAR_MIN).
