memory-alignment | 易学教程

MPI derived datatype works for floats, but not for doubles. Is it an alignment issue?

Question: I have a weird issue related to a C structure that is communicated with the help of an MPI derived datatype. The example below works; it simply sends a message consisting of one integer plus 4 float values. Minimum working example:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        MPI_Init(&argc, &argv);
        int i, rank, tag = 1;
        MPI_Status status;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        // Array of doubles plus element count
        typedef struct {
            int row;
            float elements[4];
        } My
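Since the excerpt is cut off, the actual cause cannot be read from it. One thing worth checking when a layout works for float but not for double is whether the compiler inserted padding between the int and the array. The sketch below uses two hypothetical structs (`FloatRow`, `DoubleRow`, not the asker's names) and `offsetof` to show where the array really starts. Displacements passed to `MPI_Type_create_struct` would normally come from `offsetof` or `MPI_Get_address` rather than from summing `sizeof` values.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical structs mirroring the question: one int followed by four values.
struct FloatRow  { int row; float  elements[4]; };
struct DoubleRow { int row; double elements[4]; };

// Byte offset of the array inside each struct, as laid out by the compiler.
constexpr std::size_t float_offset()  { return offsetof(FloatRow, elements); }
constexpr std::size_t double_offset() { return offsetof(DoubleRow, elements); }

// Padding bytes the compiler inserted between `row` and `elements`.
constexpr std::size_t float_padding()  { return float_offset()  - sizeof(int); }
constexpr std::size_t double_padding() { return double_offset() - sizeof(int); }
```

If `double_padding()` is non-zero on the target platform, a derived datatype that assumes the array starts at `sizeof(int)` describes the wrong bytes, which would match the symptom in the title.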

Is there a reason why NOT to force 8-byte alignment for complex float type?

Question: This is a follow-up to this question. We have an implementation of GCC for our embedded architecture. As such, we have control over some aspects of the compiler and optimizer. One such aspect might be forcing 8-byte-aligned allocation of complex float objects. Generally speaking, on our architecture we can optimize access to these objects if they are properly aligned, by requiring a single double-load instruction instead of two regular loads. Just before a round of enhancements and

Is the member field order of a class “stable”?

Question: Consider C++ (or C++11), where I have an array of data with 2*N integers that represent N pairs. For each even i = 0, 2, 4, 6, ..., 2*N-2, (data[i], data[i+1]) forms such a pair. Now I want a simple way to access these pairs without having to write loops like:

    for (int i = 0; i < 2*N; i += 2) { ... data[i] ... data[i+1] ... }

So I wrote this:

    #include <iostream>

    struct Pair { int first; int second; };

    int main()
    {
        int N = 5;
        int data[10] = {1, 2, 4, 5, 7, 8, 10, 11, 13, 14};
        Pair *pairs =
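A minimal sketch of the layout checks that a view of the flat array as `Pair` objects relies on, assuming that is the intent of the truncated code. The `static_assert`s make the assumptions explicit: member order, no padding between the members, and no trailing padding. The helper `pair_at` is hypothetical and shows an alternative that copies the two integers into a `Pair` by value, which avoids pointer reinterpretation altogether.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct Pair { int first; int second; };

// Standard-layout types keep their members in declaration order,
// with the first member at offset 0.
static_assert(std::is_standard_layout<Pair>::value, "Pair must be standard-layout");
static_assert(offsetof(Pair, first) == 0, "first must start the object");
// These two are not guaranteed by the language; they are checked, not assumed.
static_assert(offsetof(Pair, second) == sizeof(int), "no padding between members");
static_assert(sizeof(Pair) == 2 * sizeof(int), "no trailing padding");

// Hypothetical helper: build the k-th pair by value from the flat array.
inline Pair pair_at(const int* data, std::size_t k) {
    return Pair{data[2 * k], data[2 * k + 1]};
}
```

Even when all four assertions hold, accessing `int` storage through a `Pair*` is a separate aliasing question that these checks do not settle.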

Why is the default alignment for `int64_t` 8 byte on 32 bit x86 architecture?

Question: Why is the default alignment 8 bytes for int64_t (e.g. long long) in 32-bit x86 ABIs? 4-byte alignment would appear to be fine, because it can only be accessed as two 4-byte halves.

Answer 1: Interesting point: if you only ever load it as two halves into 32-bit GP registers, then 4-byte alignment means those operations will happen with their natural alignment. However, it's probably best if both halves of the variable are in the same cache line, since almost all accesses will read / write both halves.
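The cache-line argument in the answer can be checked with a little address arithmetic. The sketch below assumes a 64-byte cache line (the constant `kAssumedCacheLine` is an assumption, not something the answer states) and tests whether an object crosses a line boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption for illustration: 64-byte cache lines.
constexpr std::size_t kAssumedCacheLine = 64;

// True if an object of `size` bytes starting at `addr` spans two cache lines.
constexpr bool straddles_line(std::uintptr_t addr, std::size_t size) {
    return (addr / kAssumedCacheLine) != ((addr + size - 1) / kAssumedCacheLine);
}
```

With these assumptions, an 8-byte object at address 60 (4-byte aligned only) straddles two lines, while one at address 56 or 64 (8-byte aligned) does not. An 8-byte object at any 8-byte-aligned address stays within one 64-byte line.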

Store arbitrary elements in contiguous memory

Question: I am trying to create a data structure that holds N elements of each of several different types in contiguous memory. So at compile time I can say I want to store 4 elements of 3 different types, and in memory it will look like 111122223333. I've been going with a variadic template approach, which I think will do what I want; however, I am not sure how to add the elements to each array in the add method.

    template<std::size_t N, typename... Args>
    class Batch
    {
    private:
        std::tuple<std::array<Args, N>...>
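One possible shape for the add method, as a sketch only: the member names `data_`, `count_`, `add_impl`, and `column` are hypothetical and not taken from the truncated class. It recurses over the argument pack and writes each value into the array at the matching tuple index. Note that the standard does not specify how `std::tuple` orders its elements in memory, so the 111122223333 layout is not guaranteed by this storage.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <tuple>

template <std::size_t N, typename... Args>
class Batch {
    using Storage = std::tuple<std::array<Args, N>...>;

public:
    // Stores one value per type in the next free slot; returns false when full.
    bool add(const Args&... values) {
        if (count_ == N) return false;
        add_impl<0>(values...);
        ++count_;
        return true;
    }

    // Read-only access to the I-th array.
    template <std::size_t I>
    const typename std::tuple_element<I, Storage>::type& column() const {
        return std::get<I>(data_);
    }

    std::size_t size() const { return count_; }

private:
    // End of recursion: no values left to store.
    template <std::size_t I>
    void add_impl() {}

    // Store `head` in array I at the current slot, then handle the rest.
    template <std::size_t I, typename Head, typename... Tail>
    void add_impl(const Head& head, const Tail&... tail) {
        std::get<I>(data_)[count_] = head;
        add_impl<I + 1>(tail...);
    }

    Storage data_{};
    std::size_t count_ = 0;
};
```

For example, `Batch<4, int, double, char> b; b.add(1, 2.5, 'a');` writes 1, 2.5 and 'a' into slot 0 of the three arrays.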

Adding a new attribute on source code that propagates until MC level in LLVM?

Question: I am interested in how the following is propagated: void foo(int __attribute__((aligned(16)))* p) { ... } In this case the “alignedness” of the pointer is available at the MC level, but it is evidently not using the LLVM-IR metadata approach to achieve this. The alignment information is very important to some targets, which will change code generation depending on this value, and I think that what I need is more like this attribute. How difficult would it be to add a new attribute such that it

Dynamically allocate properly-aligned memory: is the new-expression on char arrays suitable?

Question: I am following Stefanus Du Toit's hourglass pattern, that is, implementing a C API in C++ and then wrapping it in C++ again. This is very similar to the pimpl idiom, and it is also transparent to the user, but it prevents more ABI-related issues and allows for a wider range of foreign language bindings. As in the pointer-to-implementation approach, the underlying object's size and layout are not known to outsiders at compile time, so the memory in which it resides has to be dynamically
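As a sketch of the dynamic-allocation step, the code below obtains raw storage from `::operator new`, constructs an object in it with placement new, and checks the resulting address against the type's alignment. `Impl`, `create_impl`, `destroy_impl`, and `is_aligned_for` are hypothetical names for illustration. The sketch assumes `Impl` has no extended (over-aligned) alignment requirement; over-aligned types need a different allocation path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical implementation type hidden behind the C API.
struct Impl { double value; int id; };

// True if `p` is a multiple of `alignment`.
inline bool is_aligned_for(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Allocate raw storage, then construct the object in place.
inline Impl* create_impl(double value, int id) {
    void* raw = ::operator new(sizeof(Impl));
    return new (raw) Impl{value, id};
}

// Destroy the object, then release the raw storage.
inline void destroy_impl(Impl* p) {
    p->~Impl();
    ::operator delete(p);
}
```

Keeping construction and destruction in a matched pair of functions fits the C-API boundary the hourglass pattern describes, since callers only ever see an opaque pointer.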

Why is it not possible to read an unaligned word in one step?

Question: The word size of a CPU allows it to address every single byte in memory, and via PAE a CPU can even use more bits than its word size for addressing. Given that, why can't a CPU read an unaligned word in one step? For example, on a 32-bit machine you can read the 4-byte chunk starting at position 0, but you cannot read the one starting at position 1 (you can, but it needs several steps). Why can CPUs not do that?

Answer 1: The problem is not with the ability of the
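The "several steps" in the question can be made concrete by counting how many aligned words a read touches. The sketch below is an illustration only; both function names are hypothetical. `aligned_words_touched` counts the aligned words an access overlaps, and `load_u32_unaligned` shows the portable way to express such a read in source code, using `memcpy` and leaving the choice of instructions to the compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Number of aligned `word`-byte units overlapped by `size` bytes at `addr`.
constexpr std::size_t aligned_words_touched(std::uintptr_t addr,
                                            std::size_t size,
                                            std::size_t word) {
    return (addr + size - 1) / word - addr / word + 1;
}

// Read a 32-bit value from an address with no alignment guarantee.
inline std::uint32_t load_u32_unaligned(const unsigned char* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
```

A 4-byte read at position 0 overlaps one aligned 4-byte word, while the same read at position 1 overlaps two, which matches the example in the question.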

No Memory Alignment with GCC

Question: I am working with some packet data. I have created structs to hold the packet data. These structs have been generated by Python for a specific networking protocol. The issue is that, because the compiler aligns the structures, the message ends up being longer than I would like when I send the data via the networking protocol. This causes the other device to not recognize the command. Does anyone know a possible workaround so that my packets are exactly the size the struct
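A minimal sketch of the effect described, using hypothetical field names rather than the asker's generated structs. `#pragma pack(push, 1)` is accepted by GCC (as well as Clang and MSVC) and removes the padding between members; GCC's `__attribute__((packed))` on the struct is an alternative spelling of the same idea.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Default layout: the compiler may pad so that each member is aligned.
struct PaddedHeader {
    std::uint8_t  command;
    std::uint32_t length;
    std::uint16_t checksum;
};

// Packed layout: members are placed back to back with no padding.
#pragma pack(push, 1)
struct PackedHeader {
    std::uint8_t  command;
    std::uint32_t length;
    std::uint16_t checksum;
};
#pragma pack(pop)

static_assert(sizeof(PackedHeader) == 7, "1 + 4 + 2 bytes, no padding");
```

Packing only fixes the size and offsets. Byte order on the wire is a separate concern, and members of a packed struct may be misaligned, so taking pointers or references to them should be done with care.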

Misaligned address using virtual inheritance

Question: The following apparently valid code produces a misaligned address runtime error under UndefinedBehaviorSanitizer.

    #include <memory>
    #include <functional>

    struct A {
        std::function<void()> data; // seems to occur only if data is a std::function
    };

    struct B {
        char data; // occurs only if B contains a member variable
    };

    struct C : public virtual A, public B {};

    struct D : public virtual C {};

    void test() { std::make_shared<D>(); }

    int main() { test(); return 0; }

Compiling and executing on