Question
Since it's not always clear what is undefined behaviour in C and what is not, I'm wondering whether accessing an array element using a char is undefined behaviour. For example:
char c = 'A';
int a[3000];
printf("%i\n", a[c]);
I know that actually chars and ints are somehow interchangeable, but still, I'm not sure.
Answer 1:
Syntactically, a[c] is a valid expression as long as c has an integer type or can be promoted to an integer type.
From the C99 Standard:
6.5.2.1 Array subscripting
1 One of the expressions shall have type ‘‘pointer to object type’’, the other expression shall have integer type, and the result has type ‘‘type’’.
If the value of c, after it is promoted to an int, is within the bounds of the array, then there should be no problem at run time.
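The condition above can be written down as a small check. This is only a sketch: `index_in_bounds` is a hypothetical helper name, not something from the standard library, and it simply makes the promotion to int and the bounds test explicit.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the promoted value of c is a valid
   index into an array of n elements, 0 otherwise. */
int index_in_bounds(char c, size_t n)
{
    int i = c;                       /* integer promotion: char -> int */
    return i >= 0 && (size_t)i < n;  /* reject negative values first */
}
```

With the array from the question, `index_in_bounds('A', 3000)` is expected to return 1, since 'A' is a small positive value in common character sets.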
Answer 2:
Is accessing an array element using a char undefined behaviour?
It is not undefined behavior. It works like any other integer type. Yet the numeric value of a char may surprisingly be negative.
A char has the same range as either signed char or unsigned char. Which one is implementation-defined.
Using c as an index is fine if the promoted index plus the pointer results in a valid memory address. Detail: a char will be promoted to int, or possibly unsigned.
The following would potentially be a problem had c had a negative value. In OP's case, with ASCII encoding, 'A' has the value 65, so there is no problem, as 0 <= 65 < 3000. @Joachim Pileborg
char c = 'A';
int a[3000] = { 0 };
printf("%i\n", a[c]); // OK; note that a[] was not initialized in OP's code.
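To see which case applies on a given implementation, one can inspect CHAR_MIN from <limits.h>. The sketch below uses two hypothetical helper names, `char_is_signed` and `promoted_index`; it only reports what the promotion produces and does not index any array.

```c
#include <limits.h>

/* Returns 1 if plain char is signed on this implementation, 0 otherwise. */
int char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Returns the value that would be used as the subscript in a[c],
   i.e. c after integer promotion. May be negative if char is signed. */
int promoted_index(char c)
{
    return c;
}
```

On an implementation where char is signed, `promoted_index((char)-1)` yields -1, which would be an out-of-bounds subscript for an array like a[3000].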
Answer 3:
It'll mostly work, but be careful about non-ASCII chars, with value > 127.
If the char is signed, it'll get promoted to a negative integer, causing access to memory outside of the array!
This is a common bug in naïve implementations of e.g. tolower().
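A common way to avoid that bug is to convert the char to unsigned char before using it as a subscript. The following is a minimal sketch of a table-driven lowercase function, not how any particular C library implements tolower(); the names `lower_table`, `init_lower_table` and `my_tolower` are made up for illustration, and the table fill assumes an ASCII-like character set where 'A'..'Z' are contiguous.

```c
#include <limits.h>

/* One entry for every possible unsigned char value. */
static unsigned char lower_table[UCHAR_MAX + 1];

/* Fill the table: identity mapping, except 'A'..'Z' map to 'a'..'z'.
   Assumes the uppercase letters are contiguous, as in ASCII. */
void init_lower_table(void)
{
    for (int i = 0; i <= UCHAR_MAX; i++)
        lower_table[i] = (unsigned char)i;
    for (int i = 'A'; i <= 'Z'; i++)
        lower_table[i] = (unsigned char)(i - 'A' + 'a');
}

/* The conversion to unsigned char keeps the subscript in [0, UCHAR_MAX],
   even when plain char is signed and c holds a negative value. */
char my_tolower(char c)
{
    return (char)lower_table[(unsigned char)c];
}
```

After calling `init_lower_table()` once, `my_tolower('A')` returns 'a', and a char with the high bit set is looked up inside the table rather than before its start.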
Answer 4:
This should automatically be converted to int and select that element of the array, so the behavior is not undefined. However, there is never really a reason to do this. Even if you start at ' ' (ASCII decimal value 32), you aren't using the 32 values before it.
I think you are probably trying to make a very basic hash table. This can easily be done with a struct and a few functions; it is usually bad practice to use anything but an integer type (even though a char can be cast to int) as an array subscript.
Answer 5:
From all I know I'd say it's not undefined, but rather well defined. The reason: a char may be promoted to an integer, which is a valid way to index an array (or better said: a pointer, which the array decays into in that expression). Indexing is basically the same as addition:
pointer + index // same as &(pointer[index]) or &(index[pointer])
And, quoting http://en.cppreference.com/w/cpp/language/implicit_cast (under "Numeric promotions"):
[..] Prvalues of small integral types (such as char) may be converted to prvalues of larger integral types (such as int). In particular, arithmetic operators do not accept types smaller than int as arguments, [..]
AFAIK compilers will emit a warning, though, because usually you don't use a char as index, thus the compiler tries to provide an extra net of safety.
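The equivalence between subscripting and pointer addition can be checked directly. This is a small sketch with a hypothetical function name, `same_element`; compiling it with GCC and -Wall is likely to trigger the -Wchar-subscripts warning, which is the kind of safety net mentioned above.

```c
/* Returns 1 if a[c], c[a] and *(a + c) all designate the same element.
   The caller must make sure that c is a valid index into a. */
int same_element(int *a, char c)
{
    return &a[c] == a + c   /* a[c] is defined as *(a + c) */
        && &c[a] == a + c;  /* the operands of [] may be swapped */
}
```

For an array declared as `int a[3000]`, `same_element(a, 'A')` is expected to return 1.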
Answer 6:
The short answer is: the code fragment does not compile.
The intermediary answer is: if part of a function definition, the code has undefined behavior because it accesses an uninitialized object.
The long answer is: with a properly initialized array, it still depends:
c in the expression a[c] will be promoted to int prior to computing the array index, and the C Standard mandates that 'A' have a positive value, regardless of whether type char is signed or unsigned. If the type char has 8 bits, the behavior would not be undefined, but implementation defined, as the actual value of 'A' depends on the target architecture.
If the char type is larger than 11 bits, it would be possible for the value 'A' to exceed 3000 and thus for the expression to attempt an access beyond the end of the array, which has undefined behavior.
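If one wants the compiler to reject the exotic case where 'A' does not fit the array, a C11 static assertion is one option. The sketch below is an assumption-laden illustration: `ARRAY_LEN` and `element_at_A` are names invented here, and the caller is expected to pass an initialized array of ARRAY_LEN elements.

```c
#define ARRAY_LEN 3000

/* Compile-time check (C11): refuse to build if 'A' is not a valid index. */
_Static_assert('A' >= 0 && 'A' < ARRAY_LEN, "'A' must be a valid index");

/* Reads the element selected by the char 'A'. The array must be
   initialized by the caller, otherwise the read is itself a problem. */
int element_at_A(const int *a)
{
    char c = 'A';
    return a[c];   /* c is promoted to int before the subscript is applied */
}
```

A caller could write `int a[ARRAY_LEN] = { 0 };` and then call `element_at_A(a)`, which addresses both the initialization issue and the range issue raised in this answer.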
Source: https://stackoverflow.com/questions/35803605/is-accessing-an-array-element-using-a-char-undefined-behaviour