Question
Context:
I was reviewing some code that receives data from an IO descriptor into a character buffer, performs some checks on it and then uses part of the received buffer to populate a struct, and suddenly wondered whether a strict aliasing rule violation could be involved.
Here is a simplified version
#define BFSZ 1024
struct Elt {
int id;
...
};
unsigned char buffer[BFSZ];
int sz = read(fd, buffer, sizeof(buffer)); // correctness control omitted for brevity
// search the beginning of struct data in the buffer, and process crc control
unsigned char *addr = locate_and_valid(buffer, sz);
struct Elt elt;
memcpy(&elt, addr, sizeof(elt)); // populates the struct
// and use it
int id = elt.id;
...
So far, so good. Provided the buffer did contain a valid representation of the struct (say it was produced on the same platform, so there is no endianness or padding problem), the memcpy call has populated the struct and it can safely be used.
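As a self-contained illustration of the automatic-storage case above, here is a minimal sketch. The second member of struct Elt and the helper name elt_from_bytes are assumptions made for the example (the original struct elides its other members), and the buffer is assumed to hold bytes produced on the same platform.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: the original struct elides its other members. */
struct Elt {
    int id;
    int value; /* assumed member, for illustration only */
};

/* Copy sizeof(struct Elt) bytes from src into *out.
 * Returns 0 on success, -1 if an argument is NULL or fewer than
 * sizeof(struct Elt) bytes are available. */
int elt_from_bytes(const unsigned char *src, size_t len, struct Elt *out)
{
    if (src == NULL || out == NULL || len < sizeof *out)
        return -1;
    /* *out has the declared type struct Elt, so only its bytes change. */
    memcpy(out, src, sizeof *out);
    return 0;
}
```

The length check stands in for the validation that the original locate_and_valid step is described as performing; src does not need any particular alignment because memcpy works byte by byte.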
Problem:
If the struct is dynamically allocated, it has no declared type. Let us replace last lines with:
struct Elt *elt = malloc(sizeof(struct Elt)); // no declared type here
memcpy(elt, addr, sizeof(*elt)); // populates the newly allocated memory and copies the effective type
// and use it
int id = elt->id; // strict aliasing rule violation?
...
Draft n1570 for C language says in 6.5 Expressions §6
The effective type of an object for an access to its stored value is the declared type of the object, if any.87) If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one.
buffer does have an effective type and even a declared type: it is an array of unsigned char. That is the reason why the code uses a memcpy instead of a mere aliasing like:
struct Elt *elt = (struct Elt *) addr;
which would indeed be a strict aliasing rule violation (and could additionally come with alignment problems). But if memcpy has given the effective type of an unsigned char array to the zone pointed to by elt, everything is lost.
Question:
Does memcpy from an array of character type to an object with no declared type give it the effective type of an array of character?
Disclaimer:
I know that it works without a warning with all common compilers. I just want to know whether my understanding of the standard is correct.
In order to better show my problem, let us consider a different structure Elt2 with sizeof(struct Elt2) <= sizeof(struct Elt), and
struct Elt2 actual_elt2 = {...};
For static or automatic storage, I cannot reuse object memory:
struct Elt elt;
struct Elt2 *elt2 = (struct Elt2 *) &elt;
memcpy(elt2, &actual_elt2, sizeof(*elt2));
elt2->member = ...; // strict aliasing violation!
While it is fine for dynamic storage (there is a separate question about it):
struct Elt *elt = malloc(sizeof(*elt));
// use elt
...
struct Elt2 *elt2 = (struct Elt2 *) elt;
memcpy(elt2, &actual_elt2, sizeof(*elt2));
// ok, memory now has struct Elt2 effective type, and using elt would violate strict aliasing rule
elt2->member = ...; // fine
elt->id = ...; // strict aliasing rule violation!
What could make copying from a char array different?
Answer 1:
The code is fine, no strict aliasing violation. The pointed-at data has an effective type, so the bold cited text does not apply. What applies here is the part you left out, last sentence of 6.5/6:
For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.
So the effective type of the pointed-at object becomes struct Elt. The pointer returned by malloc does indeed point to an object with no declared type, but as soon as you access it through the struct pointer, the effective type becomes that of the struct. Otherwise C programs would not be able to use malloc at all.
What makes the code safe is also that you are copying data into that struct. Had you instead just assigned a struct Elt* to point at the same memory location as addr, then you would have a strict aliasing violation and UB.
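A minimal sketch of the dynamically allocated case discussed here, assuming a hypothetical two-member layout for struct Elt (the question elides its members) and hypothetical helper names elt_alloc_from_bytes and elt_id. It copies the bytes into the malloc'ed block and only then reads through an lvalue of type struct Elt, rather than pointing a struct Elt * at the buffer itself.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: the question elides the other members. */
struct Elt {
    int id;
    int value; /* assumed member, for illustration only */
};

/* Allocate a struct Elt and fill it from len bytes at src.
 * Returns NULL if src is NULL, too short, or if malloc fails.
 * The caller owns the result and must free() it. */
struct Elt *elt_alloc_from_bytes(const unsigned char *src, size_t len)
{
    if (src == NULL || len < sizeof(struct Elt))
        return NULL;
    struct Elt *elt = malloc(sizeof *elt); /* no declared type yet */
    if (elt == NULL)
        return NULL;
    memcpy(elt, src, sizeof *elt); /* copy the bytes, do not alias src */
    return elt;
}

/* Read a member through an lvalue of type struct Elt. */
int elt_id(const struct Elt *elt)
{
    return elt->id;
}
```

Keeping the copy and the allocation in one helper means no struct Elt * ever points into the unsigned char buffer, which is the situation this answer identifies as undefined behavior.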
Answer 2:
Lundin's answer is correct; what you are doing is fine (so long as the data is aligned and of the same endianness).
I want to note this is not so much a result of the C language specification as it is a result of how the hardware works. As such, there's not a single authoritative answer. The C language specification defines how the language works, not how the language is compiled or implemented on different systems.
Here is an interesting article about memory alignment and strict aliasing on a SPARC versus Intel processor (notice the exact same C code performs differently, and gives errors on one platform while working on another): https://askldjd.com/2009/12/07/memory-alignment-problems/
Fundamentally, two identical structs, on the same system with the same endianness and memory alignment, must work via memcpy. If they didn't, the computer wouldn't be able to do much of anything.
Finally, the following question explains more about memory alignment on systems, and the answer by joshperry should help explain why this is a hardware issue, not a language issue: Purpose of memory alignment
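To make the alignment point concrete, here is a small sketch in C11. The names is_aligned_for_elt and elt_load and the second struct member are assumptions made for the example. The alignment test relies on the conventional (implementation-defined) conversion of a pointer to uintptr_t, so treat it as a diagnostic aid rather than a guarantee.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: the question elides the other members. */
struct Elt {
    int id;
    int value; /* assumed member, for illustration only */
};

/* Nonzero if p is suitably aligned for a struct Elt on this platform
 * (assumes the usual integer representation of addresses). */
int is_aligned_for_elt(const void *p)
{
    return (uintptr_t)p % _Alignof(struct Elt) == 0;
}

/* Load a struct Elt from an arbitrary, possibly misaligned address.
 * memcpy copies byte by byte, so src needs no particular alignment. */
struct Elt elt_load(const unsigned char *src)
{
    struct Elt out;
    memcpy(&out, src, sizeof out);
    return out;
}
```

A direct cast such as (struct Elt *) src would additionally require src to pass the alignment check; copying through memcpy avoids that requirement, though it does nothing about endianness or layout differences between platforms.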
Source: https://stackoverflow.com/questions/48559902/is-it-safe-to-memcpy-to-a-dynamic-storage-struct