Question
I recently came across some code doing questionable 2D array indexing. Consider the following sample:
int a[5][5];
a[0][20] = 3;
a[-2][15] = 4;
a[5][-3] = 5;
Are the indexing operations above subject to undefined behavior?
Answer 1:
It's undefined behavior, and here's why.
Multidimensional array access can be broken down into a series of single-dimensional array accesses. In other words, the expression a[i][j] can be thought of as (a[i])[j]. Quoting C11 §6.5.2.1/2:
The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))).
This means the above is identical to *(*(a + i) + j). Following C11 §6.5.6/8 regarding addition of an integer and pointer (emphasis mine):
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
In other words, if a[i] is not a valid index, the behavior is immediately undefined, even if "intuitively" a[i][j] seems in-bounds.
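For in-bounds indices, the rewrite of a[i][j] into *(*(a + i) + j) can be checked directly. This is only a minimal sketch: the helper name same_element and the ROWS/COLS constants are made up for illustration, and the assert encodes the assumption that both indices are valid, so no intermediate pointer ever leaves its array.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 5, COLS = 5 };

/* Returns 1 if &a[i][j] and the pointer-arithmetic form *(a + i) + j
   designate the same element.
   Precondition (asserted): both indices are in bounds, so every
   intermediate pointer stays inside its own array object. */
int same_element(int (*a)[COLS], size_t i, size_t j)
{
    assert(i < ROWS && j < COLS);
    int *by_subscript = &a[i][j];      /* (a[i])[j], address taken */
    int *by_arithmetic = *(a + i) + j; /* row i decays to int *, then + j */
    return by_subscript == by_arithmetic;
}
```

Calling same_element(grid, 2, 3) on an int grid[5][5] should return 1; the interesting part is that the precondition has to hold for each subscript separately, not for the combined offset.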
So, in the first case, a[0] is valid, but the following [20] is not, because the type of a[0] is int[5]. Therefore, index 20 is out of bounds.
In the second case, a[-2] is already out-of-bounds, thus already UB.
In the last case, however, a + 5 (the pointer behind a[5]) points to one past the last element of the array, which is valid as per §6.5.6/8:
... if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object ...
However, later in that same paragraph:
If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
So, while a + 5 is a valid pointer, dereferencing it causes undefined behavior, and a[5][-3], being *(*(a + 5) - 3), does dereference it (the final [-3] index is also out-of-bounds, and therefore UB in its own right).
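To illustrate the difference between forming a one-past-the-end pointer and dereferencing it, here is a small sketch. The function name sum_all and the ROWS/COLS constants are hypothetical; the point is only that a + ROWS and *row + COLS are computed and compared, never read through.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 5, COLS = 5 };

/* Sums every element of a ROWS x COLS array.
   Both loops use a one-past-the-end pointer purely as a sentinel:
   it is formed and compared against, but never dereferenced. */
long sum_all(int (*a)[COLS])
{
    assert(a != NULL);
    long total = 0;
    int (*end)[COLS] = a + ROWS;        /* one past the last row */
    for (int (*row)[COLS] = a; row != end; ++row) {
        int *stop = *row + COLS;        /* one past the last element of this row */
        for (int *p = *row; p != stop; ++p)
            total += *p;
    }
    return total;
}
```

Every pointer that is actually dereferenced here lies strictly inside the array it was derived from, which is what §6.5.6/8 asks for.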
Answer 2:
Yes, this is undefined behaviour.
Answer 3:
Indexing an array object with a negative index is undefined behaviour. Unfortunately, a[-3] is the same as *(a - 3), and most compilers accept it without warning: the C language lets you add a negative integer to a pointer, but only as long as the result still points into the same array object (or one past its end), which is never the case when the subscript is applied to the array itself. Of course, none of this is checked at runtime.
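As a side note, a negative subscript is not a problem in itself when it is applied to a pointer into the middle of an array, because the resulting address is still inside that array. A minimal sketch, with a hypothetical helper name previous_element and the in-range assumption spelled out as an assert:

```c
#include <assert.h>
#include <stddef.h>

/* Returns the element just before position pos in an array of n ints.
   p[-1] means *(p - 1); that is well defined here because p - 1 still
   points into the same array (pos >= 1 is asserted). */
int previous_element(const int *arr, size_t n, size_t pos)
{
    assert(pos >= 1 && pos < n);
    const int *p = arr + pos;   /* points into the middle of the array */
    return p[-1];               /* same object as arr[pos - 1] */
}
```

The same p[-1] with pos == 0 would compute an address before the first element, which is the undefined case discussed above.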
Also, there are some issues to be aware of when declaring arrays as opposed to pointers. You can leave only the first dimension unspecified, and no more, as in:
int a[][3][2]; /* array of unspecified size; as a function parameter this is adjusted to int (*a)[3][2]; */
(indeed, in a parameter list the above declares a pointer, not an array; just print sizeof a)
or
int a[4][3][2]; /* array of 24 integers, size is 24*sizeof(int) */
When you do this, the way the offset is evaluated is different for arrays than for pointers, so be careful. In the case of an array int a[I][J][K]; the element a[i][j][k] is placed at the byte address
(char *)a + i*(sizeof(int)*J*K) + j*(sizeof(int)*K) + k*(sizeof(int))
but when you declare
int ***a;
then a[i][j][k] is the same as:
*(*(*(a+i)+j)+k)
meaning you have to read the pointer a, add (sizeof(int **))*i to its value, dereference the result, add (sizeof(int *))*j to that value, dereference again, and add (sizeof(int))*k to that value to get the exact address of the data.
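The two addressing schemes can be put side by side in a short sketch. Everything named here (NI, NJ, NK standing in for I, J, K, and the helpers array_offset and walk_pointers) is an assumption made for illustration, not something from the original code.

```c
#include <assert.h>
#include <stddef.h>

enum { NI = 4, NJ = 3, NK = 2 };

/* Byte offset of a[i][j][k] from the start of a contiguous
   int a[NI][NJ][NK]: pure arithmetic, no memory is read. */
size_t array_offset(size_t i, size_t j, size_t k)
{
    assert(i < NI && j < NJ && k < NK);
    return i * (sizeof(int) * NJ * NK)
         + j * (sizeof(int) * NK)
         + k * sizeof(int);
}

/* Address of p[i][j][k] for an int ***p: each level is a separate
   load of a stored pointer, followed by an addition. */
int *walk_pointers(int ***p, size_t i, size_t j, size_t k)
{
    int **plane = *(p + i);     /* load p[i]    */
    int *row = *(plane + j);    /* load p[i][j] */
    return row + k;             /* address of p[i][j][k] */
}
```

With the array, the compiler needs the dimensions J and K to compute the offset; with int ***, it needs no dimensions at all, but somebody has to have filled in the intermediate pointer tables beforehand.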
BR
Source: https://stackoverflow.com/questions/25139579/2d-array-indexing-undefined-behavior