Disclaimer: This is trying to drill down on a larger problem, so please don't get hung up on whether the example makes any sense in practice.
And, yes, if you want to copy objects, please use / provide the copy-constructor. (But note how even the example does not copy a whole object; it tries to blit some memory over a few adjacent(Q.2) integers.)
Given a C++ Standard Layout struct, can I use memcpy to write to multiple (adjacent) sub-objects at once?
Complete example: ( https://ideone.com/1lP2Gd https://ideone.com/YXspBk)
#include <vector>
#include <iostream>
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <string.h> // memcpy
struct MyStandardLayout {
    char mem_a;
    int16_t num_1;
    int32_t num_2;
    int64_t num_3;
    char mem_z;

    MyStandardLayout()
    : mem_a('a')
    , num_1(1 + (1 << 14))
    , num_2(1 + (1 << 30))
    , num_3(1LL + (1LL << 62))
    , mem_z('z')
    { }

    void print() const {
        std::cout <<
            "MySL Obj: " <<
            mem_a << " / " <<
            num_1 << " / " <<
            num_2 << " / " <<
            num_3 << " / " <<
            mem_z << "\n";
    }
};

void ZeroInts(MyStandardLayout* pObj) {
    const size_t first = offsetof(MyStandardLayout, num_1);
    const size_t third = offsetof(MyStandardLayout, num_3);
    std::cout << "ofs(1st) = " << first << "\n";
    std::cout << "ofs(3rd) = " << third << "\n";
    assert(third > first);
    const size_t delta = third - first;
    std::cout << "delta = " << delta << "\n";
    const size_t sizeAll = delta + sizeof(MyStandardLayout::num_3);
    std::cout << "sizeAll = " << sizeAll << "\n";

    std::vector<char> buf( sizeAll, 0 );
    memcpy(&pObj->num_1, &buf[0], sizeAll);
}

int main()
{
    MyStandardLayout obj;
    obj.print();
    ZeroInts(&obj);
    obj.print();
    return 0;
}
Given the wording in the C++ Standard:
9.2 Class Members
...
13 Nonstatic data members of a (non-union) class with the same access control (Clause 11) are allocated so that later members have higher addresses within a class object. (...) Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; (...)
I would conclude that it is guaranteed that num_1 to num_3 have increasing addresses and are adjacent modulo padding.
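The ordering guarantee can be spot-checked at compile time. The following is only a sketch: OrderedLayout and padding_between_1_and_2 are hypothetical names, and the struct is a constructor-free stand-in for MyStandardLayout. The assertions only say that each member starts at or after the end of the previous one, which is my reading of "increasing addresses, adjacent modulo padding".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the struct in the question (constructor omitted).
struct OrderedLayout {
    char mem_a;
    std::int16_t num_1;
    std::int32_t num_2;
    std::int64_t num_3;
    char mem_z;
};

static_assert(std::is_standard_layout<OrderedLayout>::value,
              "offsetof is only unconditionally supported for standard-layout types");

// Each member begins at or after the end of the member declared before it.
static_assert(offsetof(OrderedLayout, num_1) + sizeof(std::int16_t)
                  <= offsetof(OrderedLayout, num_2), "num_2 must follow num_1");
static_assert(offsetof(OrderedLayout, num_2) + sizeof(std::int32_t)
                  <= offsetof(OrderedLayout, num_3), "num_3 must follow num_2");

// Number of padding bytes the implementation placed between num_1 and num_2.
constexpr std::size_t padding_between_1_and_2() {
    return offsetof(OrderedLayout, num_2)
         - (offsetof(OrderedLayout, num_1) + sizeof(std::int16_t));
}
```

The actual amount of padding is implementation-specific, so the sketch computes it rather than assuming a value.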
For the above example to be fully defined, I see these requirements, of which I am not sure they hold:
- memcpy must be allowed to write to multiple "memory objects" in this way at once, i.e. specifically
  - Calling memcpy with the target address of num_1 and a size that is larger than the size of the num_1 "object" is legal. (Given that num_1 is not part of an array.) ("Is memcpy(&a + 1, &b + 1, 0) defined in C11?" seems a good related question, but doesn't quite fit.)
  - The C++ (14) Standard, AFAICT, refers description of memcpy to the C99 Standard, and that one states:
    7.21.2.1 The memcpy function
    2 The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1.
    So for me the question here wrt. this is whether the target range we have here can be considered "an object" according to the C or C++ Standard. Note: A (part of an) array of chars, declared and defined as such, certainly can be assumed to count as "an object" for the purposes of memcpy because I'm pretty sure I'm allowed to copy from one part of a char array to another part of (another) char array.
    So then the question would be if it is legal to reinterpret the memory range of the three members as a "conceptual"(?) char array.
- Calculating sizeAll is legal, that is usage of offsetof is legal as shown.
- Writing to the padding in between the members is legal.
Do these properties hold? Have I missed anything else?
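For comparison, here is a variant that sidesteps all three requirements by never letting a single call span more than one member. It is a sketch, not a claim about what the question "should" do: MemberwiseLayout and zero_ints_memberwise are hypothetical names, and the struct is a constructor-free stand-in for MyStandardLayout.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the struct in the question (constructor omitted).
struct MemberwiseLayout {
    char mem_a;
    std::int16_t num_1;
    std::int32_t num_2;
    std::int64_t num_3;
    char mem_z;
};

// Zero each integer member with its own memset call. Every call targets
// exactly one member object and its size, so no call crosses a member
// boundary or writes to padding.
inline void zero_ints_memberwise(MemberwiseLayout& obj) {
    std::memset(&obj.num_1, 0, sizeof obj.num_1);
    std::memset(&obj.num_2, 0, sizeof obj.num_2);
    std::memset(&obj.num_3, 0, sizeof obj.num_3);
}
```

This costs three calls instead of one, but each call is a plain byte-wise write to a single trivially copyable object.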
§8.5
(6.2) — if T is a (possibly cv-qualified) non-union class type, each non-static data member and each base-class subobject is zero-initialized and padding is initialized to zero bits;
Now the standard does not actually say that these zero-bits will be writeable, but I can't think of an architecture that has this level of granularity on memory access permissions (nor would we want one to).
So I would say that in practice this re-writing zeros will always be safe, even if not specifically declared so by the Powers that Be.
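The zero-initialization rule quoted above can be observed on an object with static storage duration, which is zero-initialized before anything else happens to it. This is a sketch under assumptions: PlainLayout, g_zeroed and all_bytes_zero are hypothetical names, and the struct deliberately has no user-provided constructor, unlike MyStandardLayout in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical struct without a user-provided constructor.
struct PlainLayout {
    char mem_a;
    std::int16_t num_1;
    std::int32_t num_2;
    std::int64_t num_3;
    char mem_z;
};

// Static storage duration and no initializer: the object is zero-initialized.
static PlainLayout g_zeroed;

// Walk the object representation, padding included, one byte at a time.
inline bool all_bytes_zero(const PlainLayout& obj) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&obj);
    for (std::size_t i = 0; i < sizeof(PlainLayout); ++i) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}
```

Inspecting the bytes through unsigned char is allowed for any object; the sketch says nothing about whether writing to the padding through a wider memcpy is blessed by the standard.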
is legal to reinterpret the memory range of the three members as a "conceptual"(?) char array
No, arbitrary subsets of members of objects are not themselves an object of any kind. If you can't take the sizeof something, it's not a thing. Similarly, as suggested by the link you provided, if you can't identify the thing to std::is_standard_layout, it's not a thing.
Analogous would be
size_t n = (char*)&num_3 - (char*)&num_1;
It would compile, but it's UB: pointers that are subtracted must point into the same array object (or one past its end).
That said, I think you're in safe territory even if the standard isn't explicit. If MyStandardLayout is a standard-layout type, it stands to reason that a subset of it also is, even if it has no name and is not an identifiable type of its own.
But I wouldn't do it. Assignment is absolutely safe, and potentially faster than memcpy. If the subset is meaningful and has many members, I would consider making it an explicit struct, and using assignment instead of memcpy, taking advantage of the member-wise copy assignment operator supplied by the compiler.
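A minimal sketch of that suggestion, assuming the three integers really do belong together. Nums, CompositeLayout and zero_ints are hypothetical names, not taken from the question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical named sub-struct holding the members that get reset together.
struct Nums {
    std::int16_t num_1;
    std::int32_t num_2;
    std::int64_t num_3;
};

struct CompositeLayout {
    char mem_a;
    Nums nums;
    char mem_z;
};

// Reset all three integers with one assignment. Nums{} is a temporary whose
// members are all zero; the implicitly defined copy assignment copies it
// member by member.
inline void zero_ints(CompositeLayout& obj) {
    obj.nums = Nums{};
}
```

The neighbouring members mem_a and mem_z are untouched, and no byte counts or offsets have to be computed by hand.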
Putting this as a partial answer wrt. memcpy(&num_1, buf, sizeAll):
Note: James' answer is much more concise and definitive.
I asked:
memcpy must be allowed to write to multiple "memory objects" in this way at once, i.e. specifically
- Calling memcpy with the target address of num_1 and a size that is larger than the size of the num_1 "object" is legal.
- The C++ (14) Standard, AFAICT, refers description of memcpy to the C99 Standard, and that one states:
  7.21.2.1 The memcpy function
  2 The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1.
  So for me the question here wrt. this is whether the target range we have here can be considered "an object" according to the C or C++ Standard.
Thinking and searching a bit more, I found in the C Standard:
§ 6.2.6 Representations of types
§ 6.2.6.1 General
2 Except for bit-fields, objects are composed of contiguous sequences of one or more bytes, the number, order, and encoding of which are either explicitly specified or implementation-defined.
So at least it is implied that "an object" => "contiguous sequence of bytes".
I'm not so bold as to claim that the inverse -- "contiguous sequence of bytes" => "an object" -- holds, but at least "an object" doesn't seem to be defined more strictly here.
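The uncontroversial direction, "an object" => "contiguous sequence of bytes", can be illustrated by copying a single scalar out into a byte buffer and back. This is only a sketch; round_trip is a hypothetical helper, and it deliberately covers one complete object, not a range of adjacent members.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copy the bytes of one int32_t into an unsigned char buffer of exactly
// sizeof(value) bytes, then copy them back into a fresh int32_t.
inline std::int32_t round_trip(std::int32_t value) {
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);

    std::int32_t copy;
    std::memcpy(&copy, bytes, sizeof copy);
    return copy;
}
```

Both memcpy calls stay within the bounds of one object on each side, which is the case nobody disputes.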
Then, as quoted in Q, §9.2/13 of the C++ Standard (and § 1.8/5) seem to guarantee that we do have a contiguous sequence of bytes (including padding).
Then, §3.9/3 says:
3 For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a base-class subobject, if the underlying bytes (1.7) making up obj1 are copied into obj2, obj2 shall subsequently hold the same value as obj1. [ Example:
T* t1p;
T* t2p;
// provided that t2p points to an initialized object ...
std::memcpy(t1p, t2p, sizeof(T));
// at this point, every subobject of trivially copyable type in *t1p contains
// the same value as the corresponding subobject in *t2p
—end example ]
So this explicitly allows the application of memcpy to whole objects of Trivially Copyable types.
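Applied to a whole struct, the case covered by §3.9/3 might look like the sketch below. CopyableLayout and copy_whole are hypothetical names, and the struct is a constructor-free stand-in for MyStandardLayout; the static_assert makes the "trivially copyable" precondition explicit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the struct in the question (constructor omitted).
struct CopyableLayout {
    char mem_a;
    std::int16_t num_1;
    std::int32_t num_2;
    std::int64_t num_3;
    char mem_z;
};

static_assert(std::is_trivially_copyable<CopyableLayout>::value,
              "memcpy of the whole object relies on trivial copyability");

// Copy all bytes of one complete object into another, distinct object.
// dst and src must not be the same object, since memcpy forbids overlap.
inline void copy_whole(CopyableLayout& dst, const CopyableLayout& src) {
    std::memcpy(&dst, &src, sizeof(CopyableLayout));
}
```

The size passed to memcpy is sizeof of the complete type, so the copied range is exactly one object, padding included.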
In the example, the three members comprise a "trivially copyable sub-object", and indeed I think wrapping them in an actual subobject of distinct type would still mandate exactly the same memory layout for the explicit object as for the three members:
struct MyStandardLayout_Flat {
    char mem_a;
    int16_t num_1;
    int32_t num_2;
    int64_t num_3;
    char mem_z;
};

struct MyStandardLayout_Sub {
    int16_t num_1;
    int32_t num_2;
    int64_t num_3;
};

struct MyStandardLayout_Composite {
    char mem_a;
    // Note that the padding here is different from the padding in MyStandardLayout_Flat, but that doesn't change how num_* are laid out.
    MyStandardLayout_Sub nums;
    char mem_z;
};
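The nums member in _Composite and the three members in _Flat should be laid out in exactly the same way relative to each other, because the same basic rules apply. (Strictly speaking this is an assumption rather than a guarantee: the padding between members depends on each member's offset within its enclosing struct, so it is worth checking with offsetof.)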
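Whether the two layouts really match can be checked with offsetof rather than assumed. The sketch below reuses the three struct definitions from above; flat_span, sub_span and same_relative_layout are hypothetical helper names. On implementations with natural alignment I would expect same_relative_layout() to return false, because num_1 sits at offset 2 in _Flat (directly followed by num_2 at 4) but at offset 0 in _Sub (with num_2 still at 4), so treat the "same layout" claim with some caution.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct MyStandardLayout_Flat {
    char mem_a;
    std::int16_t num_1;
    std::int32_t num_2;
    std::int64_t num_3;
    char mem_z;
};

struct MyStandardLayout_Sub {
    std::int16_t num_1;
    std::int32_t num_2;
    std::int64_t num_3;
};

struct MyStandardLayout_Composite {
    char mem_a;
    MyStandardLayout_Sub nums;
    char mem_z;
};

// Distance in bytes from the start of num_1 to the start of num_3.
constexpr std::size_t flat_span() {
    return offsetof(MyStandardLayout_Flat, num_3) - offsetof(MyStandardLayout_Flat, num_1);
}
constexpr std::size_t sub_span() {
    return offsetof(MyStandardLayout_Sub, num_3) - offsetof(MyStandardLayout_Sub, num_1);
}

// True only if num_2 and num_3 sit at the same distance from num_1 in both structs.
constexpr bool same_relative_layout() {
    return offsetof(MyStandardLayout_Flat, num_2) - offsetof(MyStandardLayout_Flat, num_1)
               == offsetof(MyStandardLayout_Sub, num_2) - offsetof(MyStandardLayout_Sub, num_1)
        && flat_span() == sub_span();
}
```

A static_assert(same_relative_layout(), "...") placed next to any code that relies on the equivalence would turn a silent layout mismatch into a compile error.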
So in conclusion, given that the "sub object" num_1 to num_3 will be represented by an equivalent contiguous sequence of bytes as a full Trivially Copyable sub-object, I:
- have a very, very hard time imagining an implementation or optimizer that breaks this
- would say it either can be:
  - read as Undefined Behavior, iff we conclude that C++ §3.9/3 implies that only (full) objects of Trivially Copyable Type are allowed to be treated thusly by memcpy, or conclude from C99 §6.2.6.1/2 and the general spec of memcpy (7.21.2.1) that the contiguous sequence of the num_* bytes does not comprise a "valid object" for the purposes of memcpy.
  - read as Defined Behavior, iff we conclude that C++ §3.9/3 does not normatively limit the applicability of memcpy to other types or memory ranges, and conclude that the definition of memcpy (and the "object term") in the C99 Standard allows treating adjacent variables as a single target object of contiguous bytes.
Source: https://stackoverflow.com/questions/39026871/can-i-use-memcpy-to-write-to-multiple-adjacent-standard-layout-sub-objects