Is it legal to index into a struct?

悲哀的现实 2020-11-30 01:12

Regardless of how 'bad' the code is, and assuming that alignment etc. are not an issue on the compiler/platform, is this undefined or broken behavior?

If I have a s

10 answers
  •  悲哀的现实
    2020-11-30 02:12

    In ISO C99/C11, union-based type-punning is legal, so you can use that instead of indexing pointers to non-arrays (see various other answers).

    ISO C++ doesn't allow union-based type-punning. GNU C++ does, as an extension, and I think some other compilers that don't support GNU extensions in general do support union type-punning. But that doesn't help you write strictly portable code.
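
    As a rough sketch of what union-based punning looks like when compiled as GNU C++, consider the block below. The definition of type_t is not shown in this answer, so the layout here (a named_t struct overlaid on an int[3]) and the helper name get are assumptions made for illustration. Reading arr after writing named is defined only by the GNU extension, not by ISO C++.

```cpp
#include <cassert>

// Hypothetical layout: three named ints overlaid on a three-element array.
struct named_t { int a, b, c; };

union type_t {
    named_t named;  // access by name
    int arr[3];     // access by index
};

// The overlay only makes sense if the struct has no padding between members.
static_assert(sizeof(named_t) == sizeof(int[3]),
              "named_t is expected to have the same size as int[3]");

// Reads the inactive union member. GNU C++ documents this as an extension
// when the access goes through the union type; ISO C++ leaves it undefined.
inline int get(type_t *t, int idx) {
    assert(idx >= 0 && idx <= 2);
    return t->arr[idx];
}
```

    With gcc or clang, writing t.named.b and then calling get(&t, 1) should return the value just written, but strictly portable C++ code cannot rely on that.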

    With current versions of gcc and clang, a C++ member function that uses switch(idx) to select a member optimizes away for compile-time-constant indices, but produces terrible branchy asm for runtime indices. There's nothing inherently wrong with switch() for this; it is simply a missed-optimization bug in current compilers. They could compile Slava's switch() function efficiently.
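
    For reference, a switch-based accessor of the kind described here might look like the sketch below. This is not copied from Slava's answer: the struct name data, the member names, and the choice to throw std::out_of_range in the default case are assumptions for illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical struct with named members, indexed through a switch.
struct data {
    int a, b, c;

    int &operator[](unsigned idx) {
        switch (idx) {
            case 0: return a;
            case 1: return b;
            case 2: return c;
            default: throw std::out_of_range("data index must be 0..2");
        }
    }
};

// Constant index: after inlining, the switch can fold to one member access.
inline int get_b(data &d) { return d[1]; }

// Runtime index: this is the case where compilers tend to emit branches.
inline int get_at(data &d, unsigned idx) { return d[idx]; }

// Small usage sketch: writes through the index are visible through the name.
inline void demo() {
    data d{1, 2, 3};
    d[2] = 30;
    assert(d.c == 30);
}
```

    This is fully portable ISO C++, so the only cost is whatever code the compiler generates for the runtime-index case.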


    The solution/workaround to this is to do it the other way: give your class/struct an array member, and write accessor functions to attach names to specific elements.

    struct array_data
    {
      int arr[3];
    
      int &operator[]( unsigned idx ) {
          // assert(idx <= 2);
          //idx = (idx > 2) ? 2 : idx;
          return arr[idx];
      }
      int &a(){ return arr[0]; } // TODO: const versions
      int &b(){ return arr[1]; }
      int &c(){ return arr[2]; }
    };
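
    The TODO about const versions could be filled in roughly as follows. This is a sketch, not the original author's code: the const overloads, the enabled assert, and the helper sum are additions for illustration.

```cpp
#include <cassert>

struct array_data {
    int arr[3];

    // Mutable and const indexed access; the assert is a debug-only bounds check.
    int &operator[](unsigned idx) {
        assert(idx <= 2);
        return arr[idx];
    }
    const int &operator[](unsigned idx) const {
        assert(idx <= 2);
        return arr[idx];
    }

    // Named accessors attach names to specific array elements.
    int &a() { return arr[0]; }
    int &b() { return arr[1]; }
    int &c() { return arr[2]; }
    const int &a() const { return arr[0]; }
    const int &b() const { return arr[1]; }
    const int &c() const { return arr[2]; }
};

// Hypothetical helper: loops over the elements by index through a const reference.
inline int sum(const array_data &d) {
    int total = 0;
    for (unsigned i = 0; i < 3; ++i) {
        total += d[i];
    }
    return total;
}
```

    Because the storage really is an array, indexing it is ordinary array access, and the named accessors are trivial to inline.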
    

    We can have a look at the asm output for different use-cases, on the Godbolt compiler explorer. These are complete x86-64 System V functions, with the trailing RET instruction omitted to better show what you'd get when they inline. ARM/MIPS/whatever would be similar.

    # asm from g++6.2 -O3
    int getb(array_data &d) { return d.b(); }
        mov     eax, DWORD PTR [rdi+4]
    
    void setc(array_data &d, int val) { d.c() = val; }
        mov     DWORD PTR [rdi+8], esi
    
    int getidx(array_data &d, int idx) { return d[idx]; }
        mov     esi, esi                   # zero-extend to 64-bit
        mov     eax, DWORD PTR [rdi+rsi*4]
    

    By comparison, @Slava's answer using a switch() for C++ makes asm like this for a runtime-variable index. (Code in the previous Godbolt link).

    int cpp(data *d, int idx) {
        return (*d)[idx];
    }
    
        # gcc6.2 -O3, using `default: __builtin_unreachable()` to promise the compiler that idx=0..2,
        # avoiding an extra cmov for idx=min(idx,2), or an extra branch to a throw, or whatever
        cmp     esi, 1
        je      .L6
        cmp     esi, 2
        je      .L7
        mov     eax, DWORD PTR [rdi]
        ret
    .L6:
        mov     eax, DWORD PTR [rdi+4]
        ret
    .L7:
        mov     eax, DWORD PTR [rdi+8]
        ret
    

    This is obviously terrible, compared to the C (or GNU C++) union-based type punning version:

    c(type_t*, int):
        movsx   rsi, esi                   # sign-extend this time, since I didn't change idx to unsigned here
        mov     eax, DWORD PTR [rdi+rsi*4]
    
