memory-alignment

MPI derived datatype works for floats, but not for doubles. Is it an alignment issue?

元气小坏坏 submitted on 2019-12-13 10:33:59
Question: I have a weird issue related to a C structure that is communicated with the help of an MPI derived datatype. The example below works; it simply sends a message consisting of one integer plus 4 float values. Minimum working example: #include <mpi.h> #include <stdio.h> int main(int argc, char *argv[]) { MPI_Init(&argc, &argv); int i, rank, tag = 1; MPI_Status status; MPI_Comm_rank(MPI_COMM_WORLD, &rank); // Array of doubles plus element count typedef struct { int row; float elements[4]; } My
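The excerpt is cut off before the derived datatype is built, so the actual cause is not visible here. One common source of this symptom is padding: a compiler may insert bytes between an int and a following double array that it does not insert before a float array. The sketch below only measures that gap; FloatRow and DoubleRow are hypothetical stand-ins for the truncated struct, and no MPI call is made.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins: the struct name in the question is cut off.
struct FloatRow  { int row; float  elements[4]; };
struct DoubleRow { int row; double elements[4]; };

// Padding bytes the compiler placed between `row` and `elements`.
constexpr std::size_t float_gap()  { return offsetof(FloatRow, elements)  - sizeof(int); }
constexpr std::size_t double_gap() { return offsetof(DoubleRow, elements) - sizeof(int); }

// Byte displacement of `elements`, taken from the compiler rather than
// computed by hand as sizeof(int).
constexpr std::size_t double_displacement() { return offsetof(DoubleRow, elements); }

// Sanity checks that hold regardless of how much padding was inserted.
void check_layout() {
    assert(double_displacement() == sizeof(int) + double_gap());
    assert(sizeof(DoubleRow) >= double_displacement() + 4 * sizeof(double));
}
```

On common 64-bit ABIs double_gap() is nonzero while float_gap() is zero. If that is what is happening, taking the displacements passed to MPI_Type_create_struct from offsetof (or MPI_Get_address) instead of hand-computed sizes would account for the padding.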

Is there a reason why NOT to force 8-byte alignment for complex float type?

[亡魂溺海] submitted on 2019-12-13 03:38:23
Question: This is a follow-up for this question. We have an implementation of GCC for our embedded architecture. As such we have control over some aspects of the compiler and optimizer. One such aspect is potentially forcing 8-byte-aligned allocation of complex float objects. Generally speaking, on our architecture we can optimize access to these objects if they are properly aligned, by requiring a single double-load instruction instead of two regular loads. Just before a round of enhancements and

Is the member field order of a class “stable”?

喜你入骨 submitted on 2019-12-12 20:51:48
Question: Consider C++ (or C++11), where I have an array of data with 2*N integers that represent N pairs. For each even i=0,2,4,6,...,2*N-2 it holds that (data[i],data[i+1]) forms such a pair. Now I want to have a simple way to access these pairs without the need to write loops like: for(int i=0; i<2*N; i+=2) { ... data[i] ... data[i+1] ... } So I wrote this: #include <iostream> struct Pair { int first; int second; }; int main() { int N=5; int data[10]= {1,2,4,5,7,8,10,11,13,14}; Pair *pairs =
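The excerpt stops before showing how pairs is initialized. If the goal is to view the int array as Pair objects, whether that works depends on Pair having no padding and its members staying in declaration order. The sketch below turns those assumptions into compile-time checks and, as a swapped-in alternative to casting the pointer, copies each pair out with memcpy. The helper name pair_at is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

struct Pair { int first; int second; };

// Layout assumptions stated explicitly; compilation fails if they do not hold.
static_assert(std::is_standard_layout<Pair>::value, "Pair must be standard-layout");
static_assert(sizeof(Pair) == 2 * sizeof(int), "Pair must not contain padding");
static_assert(offsetof(Pair, second) == sizeof(int), "second must follow first directly");

// Hypothetical helper: copies the k-th pair out of a flat int array.
Pair pair_at(const int* data, std::size_t k) {
    Pair p;
    std::memcpy(&p, data + 2 * k, sizeof p);
    return p;
}

// Usage with the data from the question.
void check_pairs() {
    int data[10] = {1, 2, 4, 5, 7, 8, 10, 11, 13, 14};
    assert(pair_at(data, 0).first == 1);
    assert(pair_at(data, 4).second == 14);
}
```

Copying costs a few bytes per access but sidesteps the aliasing questions that a pointer cast raises.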

Why is the default alignment for `int64_t` 8 byte on 32 bit x86 architecture?

和自甴很熟 submitted on 2019-12-12 19:40:20
Question: Why is the default alignment 8 bytes for int64_t (e.g. long long) in 32-bit x86 ABIs? 4-byte alignment would appear to be fine, because it can only be accessed as two 4-byte halves. Answer 1: Interesting point: if you only ever load it as two halves into 32-bit GP registers, then 4-byte alignment means those operations will happen with their natural alignment. However, it's probably best if both halves of the variable are in the same cache line, since almost all accesses will read / write both halves.
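The cache-line argument in the answer can be checked with plain address arithmetic. The sketch below assumes a 64-byte cache line (an assumption, not something stated in the answer) and counts how many 8-byte objects would straddle two lines at a given alignment. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLineSize = 64;  // assumed cache-line size in bytes

// True if the byte range [addr, addr + size) lies within a single cache line.
constexpr bool in_one_line(std::uintptr_t addr, std::size_t size) {
    return addr / kLineSize == (addr + size - 1) / kLineSize;
}

// Counts 8-byte objects starting in [0, limit), placed at every multiple of
// `alignment`, that span two cache lines.
std::size_t count_straddlers(std::size_t alignment, std::uintptr_t limit) {
    std::size_t n = 0;
    for (std::uintptr_t a = 0; a < limit; a += alignment) {
        if (!in_one_line(a, 8)) ++n;
    }
    return n;
}

void check_straddlers() {
    assert(count_straddlers(8, 256) == 0);  // 8-byte alignment never splits
    assert(count_straddlers(4, 256) == 4);  // offsets 60, 124, 188, 252 split
}
```

With 8-byte alignment no object is split across lines; with 4-byte alignment one placement in every sixteen is.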

Store arbitrary elements in contiguous memory

六月ゝ 毕业季﹏ submitted on 2019-12-12 17:12:01
Question: I am trying to create a data structure that holds N elements of each of several different types in contiguous memory. So at compile time I can say I want to store 4 elements of 3 different types, and in memory it will look like 111122223333. I've been going with a variadic template approach, which I think will do what I want; however, I am not sure how to add the elements to each array in the add method. template<std::size_t N, typename... Args> class Batch { private: std::tuple<std::array<Args, N>...>

Adding a new attribute on source code that propagates until MC level in LLVM?

放肆的年华 submitted on 2019-12-12 16:28:10
Question: I am interested in how the following is propagated: void foo(int __attribute__((aligned(16)))* p) { ... } In this case the “alignedness” of the pointer is available at the MC level, but it is evidently not using the LLVM-IR metadata approach to achieve this. The alignment information is very important to some targets, which will change code generation depending on this value, and I think that what I need is more like this attribute. How difficult would it be to add a new attribute such that it

Dynamically allocate properly-aligned memory: is the new-expression on char arrays suitable?

人盡茶涼 submitted on 2019-12-12 15:07:53
Question: I am following Stefanus Du Toit's hourglass pattern, that is, implementing a C API in C++ and then wrapping it in C++ again. This is very similar to the pimpl idiom, and it is also transparent to the user, but prevents more ABI-related issues and allows for a wider range of foreign language bindings. As in the pointer-to-implementation approach, the underlying object's size and layout are not known by outsiders at compile time, so the memory in which it resides has to be dynamically
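The excerpt ends before the allocation code, so the following is only a sketch of one way to get suitably aligned dynamic storage for a hidden implementation type. It uses ::operator new rather than a new-expression on a char array, which is a swapped-in technique and not necessarily what the question does. Impl, create and destroy are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical implementation type hidden behind an opaque handle.
struct Impl { double x; int y; };

// Allocates raw storage and constructs an Impl in it with placement new.
// ::operator new returns memory aligned for any type with fundamental alignment.
Impl* create() {
    void* raw = ::operator new(sizeof(Impl));
    return new (raw) Impl{1.5, 2};
}

// Runs the destructor explicitly, then releases the raw storage.
void destroy(Impl* p) {
    p->~Impl();
    ::operator delete(p);
}

void check_alignment() {
    Impl* p = create();
    assert(reinterpret_cast<std::uintptr_t>(p) % alignof(Impl) == 0);
    destroy(p);
}
```

This covers only types whose alignment does not exceed alignof(std::max_align_t); over-aligned types need an aligned allocation facility instead.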

Why is it not possible to read an unaligned word in one step?

六眼飞鱼酱① submitted on 2019-12-12 14:34:22
Question: Given that the word size of a CPU allows it to address every single byte in memory, and given that via PAE CPUs can even use more bits than their word size for addressing, what is the reason that a CPU cannot read an unaligned word in one step? For example, on a 32-bit machine you can read the 4-byte chunk starting at position 0, but you cannot read the one starting at position 1 (you can, but it needs several steps). Why can CPUs not do that? Answer 1: The problem is not with the ability of the
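Independently of what the hardware does, portable code that needs the 4-byte chunk starting at position 1 usually reads it with memcpy rather than by dereferencing a misaligned pointer. A minimal sketch follows; load_u32 and store_u32 are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads 4 bytes from any address, aligned or not, in native byte order.
std::uint32_t load_u32(const unsigned char* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Writes 4 bytes to any address, aligned or not, in native byte order.
void store_u32(unsigned char* p, std::uint32_t v) {
    std::memcpy(p, &v, sizeof v);
}

// Round trip through offset 1 of a buffer, i.e. a position that is not
// a multiple of 4 relative to the buffer start.
void check_round_trip() {
    unsigned char buf[8] = {0};
    store_u32(buf + 1, 0xDEADBEEFu);
    assert(load_u32(buf + 1) == 0xDEADBEEFu);
}
```

Compilers typically lower the memcpy to whatever load sequence the target supports, so the several steps mentioned in the question become the compiler's concern rather than the programmer's.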

No Memory Alignment with GCC

て烟熏妆下的殇ゞ submitted on 2019-12-12 12:19:52
Question: I am working with some packet data. I have created structs to hold the packet data. These structs have been generated by Python for a specific networking protocol. The issue is that because the compiler aligns the structures, when I send the data via the networking protocol, the message ends up being longer than I would like. This causes the other device to not recognize the command. Does anyone know of a possible workaround for this so that my packets are exactly the size the struct
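One approach that is often used with GCC for this is the packed type attribute, which tells the compiler not to insert padding between members. The sketch below compares a padded and a packed version of a made-up header. The field names and sizes are hypothetical and are not taken from the question's protocol.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical packet header with default (padded) layout.
struct PaddedHeader {
    std::uint8_t  command;
    std::uint32_t length;
    std::uint16_t checksum;
};

// Same fields with the GCC/Clang packed attribute: no padding is inserted.
struct __attribute__((packed)) PackedHeader {
    std::uint8_t  command;
    std::uint32_t length;
    std::uint16_t checksum;
};

// 1 + 4 + 2 bytes, exactly the sum of the member sizes.
static_assert(sizeof(PackedHeader) == 7, "packed header should have no padding");

void check_sizes() {
    assert(sizeof(PaddedHeader) >= sizeof(PackedHeader));
    assert(offsetof(PackedHeader, length) == 1);
}
```

Packing removes padding but does not change byte order, and members of a packed struct may be misaligned, so taking their addresses or binding references to them needs care.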

Misaligned address using virtual inheritance

女生的网名这么多〃 submitted on 2019-12-12 07:15:42
Question: The following apparently valid code produces a misaligned address runtime error under UndefinedBehaviorSanitizer. #include <memory> #include <functional> struct A{ std::function<void()> data; // seems to occur only if data is a std::function } ; struct B{ char data; // occurs only if B contains a member variable }; struct C:public virtual A,public B{ }; struct D:public virtual C{ }; void test(){ std::make_shared<D>(); } int main(){ test(); return 0; } Compiling and executing on