C++ STL: Array vs Vector: Raw element accessing performance

青春惊慌失措 2020-12-02 13:51

I'm building an interpreter and as I'm aiming for raw speed this time, every clock cycle matters for me in this (raw) case.

Do you have any experience or informati

5 answers
  •  一生所求
    2020-12-02 14:18

    For decent results, use std::vector as the backing storage and take a pointer to its first element before your main loop or whatever:

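    std::vector<uint8_t> mem_buf;
    // stuff (fill mem_buf; don't resize it once the pointer below is taken)
    uint8_t *mem = mem_buf.data();  // same as &mem_buf[0] for a non-empty vector
    for(;;) {
        switch(mem[pc]) {
        // stuff
        }
    }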
    

    This avoids any issues with over-helpful implementations that perform bounds checking in operator[], and makes single-stepping easier when stepping into expressions such as mem_buf[pc] later in the code.
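    To make the fragment above concrete, here is a minimal sketch of a complete dispatch loop built the same way. The opcodes (OP_HALT, OP_INC, OP_ADD), the accumulator and the run function are invented for illustration and are not from the question; the sketch also assumes the program is well-formed and ends with OP_HALT, so it does no bounds checking of its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes, invented for this sketch only.
enum Op : std::uint8_t { OP_HALT = 0, OP_INC = 1, OP_ADD = 2 };

// Runs a bytecode program and returns the accumulator.
// Assumes the program is well-formed and terminated by OP_HALT.
int run(const std::vector<std::uint8_t>& mem_buf) {
    assert(!mem_buf.empty());
    const std::uint8_t* mem = mem_buf.data();  // taken once, before the loop
    std::size_t pc = 0;
    int acc = 0;
    for (;;) {
        switch (mem[pc]) {
        case OP_INC:              // acc += 1
            acc += 1;
            pc += 1;
            break;
        case OP_ADD:              // acc += one-byte immediate operand
            acc += mem[pc + 1];
            pc += 2;
            break;
        case OP_HALT:
        default:                  // unknown opcodes also stop this sketch
            return acc;
        }
    }
}
```

    The vector still owns the memory; the raw pointer is only a view, so it stays valid only as long as the vector is not resized or destroyed while the loop runs.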

    If each instruction does enough work, and the code is varied enough, this should be quicker than using a global array by some negligible amount. (If the difference is noticeable, the opcodes need to be made more complicated.)

    Compared to using a global array, on x86 the instructions for this sort of dispatch should be more concise (no 32-bit displacement fields anywhere), and for more RISC-like targets there should be fewer instructions generated (no TOC lookups or awkward 32-bit constants), as the commonly-used values are all in the stack frame.

    I'm not really convinced that optimizing an interpreter's dispatch loop in this way will produce a good return on time invested -- the instructions should really be made to do more, if it's an issue -- but I suppose it shouldn't take long to try out a few different approaches and measure the difference. As always, in the event of unexpected behaviour, the generated assembly language (and, on x86, the machine code, as instruction length can be a factor) should be consulted to check for obvious inefficiencies.
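    One rough way to try out and measure two approaches is a small timing harness like the sketch below. Everything in it is an assumption made for illustration: the buffer size kSize, the global g_mem, and the summing loops, which are only a stand-in workload and not a real dispatch loop. Substitute your actual interpreter loop for a meaningful comparison, build with optimizations on, and repeat the runs several times.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical global buffer standing in for the "global array" variant.
constexpr std::size_t kSize = 1 << 16;
std::uint8_t g_mem[kSize];

// Variant A: index the global array directly.
std::uint64_t sum_global(std::size_t passes) {
    std::uint64_t total = 0;
    for (std::size_t p = 0; p < passes; ++p)
        for (std::size_t i = 0; i < kSize; ++i)
            total += g_mem[i];
    return total;
}

// Variant B: std::vector storage, raw pointer taken once before the loops.
std::uint64_t sum_vector(const std::vector<std::uint8_t>& buf, std::size_t passes) {
    assert(!buf.empty());
    const std::uint8_t* mem = buf.data();
    const std::size_t n = buf.size();
    std::uint64_t total = 0;
    for (std::size_t p = 0; p < passes; ++p)
        for (std::size_t i = 0; i < n; ++i)
            total += mem[i];
    return total;
}

// Times a single call of f, in microseconds, using a monotonic clock.
template <typename F>
long long time_us(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

    To use it, fill both buffers with the same data, call each variant through time_us with a lambda that stores the returned sum, and compare the timings; keeping and checking the sums stops the compiler from discarding the loops entirely.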
