Can calloc() allocate more than SIZE_MAX in total?

Question

In a recent code review, it was claimed that

On select systems, calloc() can allocate more than SIZE_MAX total bytes whereas malloc() is limited.

My claim is that that's mistaken, because calloc() creates space for an array of objects - which, being an array, is itself an object. And no object can be larger in size than SIZE_MAX.

So which of us is correct? On a (possibly hypothetical) system with address space larger than the range of size_t, is calloc() allowed to succeed when called with arguments whose product is greater than SIZE_MAX?

To make it more concrete: will the following program ever exit with a non-zero status?

#include <stdint.h>
#include <stdlib.h>

int main(void)
{
    /* Exit status is 1 if the oversized request succeeds, 0 if it fails. */
    return calloc(SIZE_MAX, 2) != NULL;
}

Answer 1:

SIZE_MAX doesn't necessarily specify the maximum size of an object, but rather the maximum value of size_t, which is not necessarily the same thing. See Why is the maximum size of an array "too large"?

But obviously, it isn't well-defined to pass a larger value than SIZE_MAX to a function expecting a size_t parameter. So in theory SIZE_MAX is the limit for each argument, and in theory calloc would allow for SIZE_MAX * SIZE_MAX bytes to be allocated.

The thing with malloc/calloc is that they allocate objects without a type. Objects with a type have restrictions, such as never being larger than a certain limit like SIZE_MAX. But the data pointed-at by the result from these functions does not have a type. It is not (yet) an array.

Formally, the data has no declared type, but as you store something inside the allocated data, it gets the effective type of the data access used for storage (C17 6.5 §6).

This in turn means that it would be possible for calloc to allocate more memory than any type in C can hold, because what's allocated does not (yet) have a type.

Therefore, as far as the C standard is concerned, it is perfectly fine for calloc(SIZE_MAX, 2) to return a value different from NULL. How to actually use that allocated memory in a sensible way, or which systems even support such large chunks of memory on the heap, is another story.
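The effective-type rule cited above (C17 6.5 §6) can be illustrated with a minimal sketch. The function name store_then_load is hypothetical and exists only for this illustration; the example assumes that an ordinary small allocation succeeds.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 42.0 on success, -1.0 if malloc fails. */
double store_then_load(void)
{
    void *raw = malloc(sizeof(double));  /* allocated storage: no declared type */
    if (raw == NULL) {
        return -1.0;
    }

    double *d = raw;
    *d = 42.0;          /* this store gives the storage the effective type double */
    double v = *d;      /* reading it back as double is therefore well-defined */

    free(raw);
    return v;
}
```

Until the store happens, the storage has no effective type, which is the gap the argument above relies on.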

Answer 2:

Can calloc() allocate more than SIZE_MAX in total?

As the assertion "On select systems, calloc() can allocate more than SIZE_MAX total bytes whereas malloc() is limited." came from a comment I posted, I will explain my rationale.

size_t

size_t is some unsigned type of at least 16 bits.

size_t which is the unsigned integer type of the result of the sizeof operator; C11dr §7.19 2

"Its implementation-defined value shall be equal to or greater in magnitude ... than the corresponding value given below" ... limit of size_t SIZE_MAX ... 65535 §7.20.3 2

sizeof

The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. §6.5.3.4 2
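Both quoted guarantees can be checked at compile time. The following is a small sketch, assuming a C11 or later compiler; the function name size_t_guarantees_hold is hypothetical and only gives the checks something callable.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* SIZE_MAX */

/* size_t must be able to represent at least 65535. */
static_assert(SIZE_MAX >= 65535, "SIZE_MAX is at least 65535");

/* The result of the sizeof operator has type size_t. */
static_assert(_Generic(sizeof(int), size_t: 1, default: 0),
              "sizeof yields a size_t");

/* Hypothetical helper: always returns 1. If either guarantee were
   violated, translation would already have failed above. */
int size_t_guarantees_hold(void)
{
    return 1;
}
```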

calloc

void *calloc(size_t nmemb, size_t size);

The calloc function allocates space for an array of nmemb objects, each of whose size is size. §7.22.3.2 2

Consider a situation where nmemb * size well exceeds SIZE_MAX.

size_t alot = SIZE_MAX/2;
double *p = calloc(alot, sizeof *p); // assume `double` is 8 bytes.

If calloc() truly allocated nmemb * size bytes and if p != NULL is true, what spec did this violate?

The size of each element (each object) is representable.

// Nicely reports the size of a pointer and an element.
printf("sizeof p:%zu, sizeof *p:%zu\n", sizeof p, sizeof *p);

Each element can be accessed.

// Nicely reports the value of each element and the address of the element
for (size_t i = 0; i < alot; i++) {
  printf("value p[%zu]:%g, address:%p\n", i, p[i], (void*) &p[i]);
}

calloc() details

"space for an array of nmemb objects": This is certainly a key point of contention. Does the "allocates space for the array" require <= SIZE_MAX? I found nothing in the C spec to require this limit and so conclude:

calloc() may allocate more than SIZE_MAX in total.

It is certainly uncommon for calloc() with large arguments to return non-NULL, compliant or not. Usually such allocations exceed the available memory, so the issue is moot. The only case I've encountered was with the Huge memory model, where size_t was 16-bit and the object pointer was 32-bit.
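The Huge-memory-model situation can be simulated on a modern machine by letting uint16_t stand in for a 16-bit size_t. This is only a sketch: the function name and the idea of modelling size_t with uint16_t are assumptions made for illustration, not a description of any real implementation.

```c
#include <stdint.h>

/* Hypothetical model: uint16_t plays the role of a 16-bit size_t.
   Returns 1 if nmemb * size, computed without wraparound, exceeds
   the largest value that 16-bit type can hold, and 0 otherwise. */
int exceeds_16bit_size_max(uint16_t nmemb, uint16_t size)
{
    /* The true product is at most 65535 * 65535, which fits in 32 bits. */
    uint32_t total = (uint32_t)nmemb * (uint32_t)size;
    return total > UINT16_MAX;
}
```

In this model, exceeds_16bit_size_max(40000, 2) reports that the 80000-byte total cannot be expressed in the 16-bit size type, even though each element size and each index can.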

Answer 3:

From

7.22.3.2 The calloc function

Synopsis
1
 #include <stdlib.h>
 void *calloc(size_t nmemb, size_t size);
Description
2 The calloc function allocates space for an array of nmemb objects, each of whose size is size. The space is initialized to all bits zero.

Returns
3 The calloc function returns either a null pointer or a pointer to the allocated space.

I fail to see why the space allocated should be limited to SIZE_MAX bytes.

Answer 4:

If a program exceeds implementation limits, behavior is undefined. This follows from the definition of an implementation limit as a restriction imposed upon programs by the implementation (3.13 in C11). The standard also says that strictly-conforming programs must adhere to implementation limits (4p5 in C11). But this also applies to programs in general, because the standard does not say what happens when most implementation limits are exceeded (so it is the other kind of undefined behavior, where the standard does not specify what happens).

The standard also does not define what implementation limits may exist, so this is a bit of carte blanche, but I think it is reasonable that the maximum object size is actually relevant to object allocations. (The maximum object size is typically smaller than SIZE_MAX, by the way, because the difference of pointers-to-char within the object must be representable in ptrdiff_t.)

This leads us to the following observation: A call to calloc (SIZE_MAX, 2) exceeds the maximum object size limit, so an implementation could return an arbitrary value while still conforming to the standard.

Some implementations will actually return a pointer which is not null for a call like calloc (SIZE_MAX / 2 + 2, 2) because the implementation does not check whether the multiplication result fits into a size_t value. Whether this is a good idea is a different matter, given that the implementation limit can be checked so easily in this case, and there is a perfectly fine way to report errors. Personally, I consider the lack of overflow checking in calloc an implementation bug, and have reported bugs to implementors when I saw them, but technically, it's merely a quality-of-implementation issue.
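The overflow check described above is cheap to write. Here is a minimal sketch of a wrapper that performs it before delegating to calloc; the name checked_calloc is hypothetical, and this is not how any particular C library implements the check.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: refuses any request whose total byte count
   nmemb * size is not representable in size_t. */
void *checked_calloc(size_t nmemb, size_t size)
{
    /* Divide instead of multiplying so the test itself cannot wrap. */
    if (size != 0 && nmemb > SIZE_MAX / size) {
        return NULL;
    }
    return calloc(nmemb, size);
}
```

With this wrapper, checked_calloc(SIZE_MAX / 2 + 2, 2) returns a null pointer regardless of how the underlying calloc treats the wrapped product.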

For variable-length arrays on the stack, the rule about exceeding implementation limits resulting in undefined behavior is more obvious:

size_t length = SIZE_MAX / 2 + 2;
short object[length];

There is really nothing an implementation can do here, so it has to be undefined.

Answer 5:

Per the text of the standard, maybe, because the standard is (some would say intentionally) vague about this sort of thing.

Per 6.5.3.4 ¶2:

The sizeof operator yields the size (in bytes) of its operand

and per 7.19 ¶2:

size_t

which is the unsigned integer type of the result of the sizeof operator;

The former cannot be satisfied in general if the implementation admits any type (including array types) whose size is not representable in size_t. Note that, regardless of how you interpret the text about the pointer returned by calloc pointing to "an array", there is always an array involved with any object: the overlaid array of type unsigned char[sizeof object] which is its representation.

At best, an implementation that allows the creation of any object larger than SIZE_MAX (or PTRDIFF_MAX, for other reasons) has fatally bad QoI (quality of implementation) problems. The claim on code review that you should account for such bad implementations is bogus unless you are specifically trying to ensure compatibility with a particular broken C implementation (sometimes relevant for embedded, etc.).
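The overlaid unsigned char[sizeof object] array mentioned above can be made visible with a short sketch. The function name representation_roundtrip is hypothetical; the point is only that both the array bound and the index are size_t values, so the whole representation has to be countable in size_t.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the object representation of a double
   byte by byte and checks that the copy matches the original.
   Returns 1 on a match, 0 otherwise. */
int representation_roundtrip(double value)
{
    unsigned char bytes[sizeof value];   /* same type as the overlaid array */
    const unsigned char *rep = (const unsigned char *)&value;

    for (size_t i = 0; i < sizeof value; i++) {   /* the index is a size_t */
        bytes[i] = rep[i];
    }
    return memcmp(bytes, &value, sizeof value) == 0;
}
```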

Answer 6:

Just an addition: With a tiny bit of maths you can show that SIZE_MAX * SIZE_MAX = 1 (when evaluated according to C rules).

However, calloc (SIZE_MAX, SIZE_MAX) is only allowed to do one of two things: Return a pointer to an array of SIZE_MAX elements of SIZE_MAX bytes, OR return NULL. It is NOT allowed to calculate the total size by just multiplying the arguments, getting a result of 1, and allocating one byte, cleared to 0.
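The arithmetic behind that claim: if size_t has N value bits, then SIZE_MAX is 2^N - 1, and (2^N - 1)^2 = 2^(2N) - 2^(N+1) + 1, which is 1 modulo 2^N. A minimal sketch that computes the wrapped product follows. The function name is hypothetical, and the code assumes size_t is at least as wide as unsigned int, so the operands are not promoted to a signed type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiplies two size_t values using ordinary
   unsigned arithmetic, which reduces the result modulo SIZE_MAX + 1. */
size_t wrapped_product(size_t a, size_t b)
{
    return a * b;
}
```

A calloc that sized its allocation from wrapped_product(SIZE_MAX, SIZE_MAX) would therefore reserve a single byte, which is exactly the behavior ruled out above.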

Answer 7:

The Standard says nothing about whether it might be possible for a pointer to somehow be created such that ptr+number1+number2 could be a valid pointer, but number1+number2 would exceed SIZE_MAX. It certainly allows for the possibility of number1+number2 exceeding PTRDIFF_MAX (though for some reason C11 has decided to require that even implementations with a 16-bit address space must use a 32-bit ptrdiff_t).

The Standard does not mandate that implementations provide any means of creating pointers to such large objects. It does, however, define a function, calloc(), whose description suggests that it could be asked to attempt to create such an object, and would suggest that calloc() should return a null pointer if it can't create the object.

The ability to allocate any kind of object usefully, however, is a Quality of Implementation issue. The Standard would never require that any particular allocation request succeed, nor would it forbid an implementation from returning a pointer that might turn out to be unusable (in some Linux environments, a malloc() might yield a pointer to an over-committed region of address space; an attempt to use the pointer when insufficient physical storage is available could cause a fatal trap). It would certainly be better for a non-capricious implementation of calloc(x,y) to return null if the numerical product of x and y exceeds SIZE_MAX than for it to yield a pointer which can't be used to access that number of bytes. The Standard is silent, however, on whether returning a pointer that can be used to access x objects of y bytes each should be considered better or worse than returning null. Each behavior would be advantageous in some situations, and disadvantageous in others.

Source: https://stackoverflow.com/questions/52699574/can-calloc-allocate-more-than-size-max-in-total

Tags

language-lawyer