C++ Vector vs Array (Time)

前端 未结 4 1974

I have got here two programs with me, both are doing exactly the same task. They are just setting an boolean array / vector to the value true. The program using vector takes

相关标签:
4条回答
  • 2020-12-14 12:40

    Other answers are very good, but you could easily answer it yourself by this method.

    ADDED: In response to comments, let me show you what I mean. I'm running VC On Windows, but this works on any language/OS. I took your first program and increased size to 20000 so it would run long enough. Then while it was running, I took several stackshots. They all look like this:

    std::vector<bool,std::allocator<bool> >::begin() line 93 + 25 bytes
    std::vector<bool,std::allocator<bool> >::operator[]() line 132 + 37 bytes
    main() line 24 + 12 bytes
    mainCRTStartup() line 206 + 25 bytes
    KERNEL32! 7c817077()
    

    So what that says is that it is spending essentially all of it's time in the indexing operation on line 24, and the reason it's spending that time is that the [] operator is calling the begin operator. In more detail:

    main() line 24 + 12 bytes
    

    is this code:

    for(int j = 0; j < size; j++){
    ==> v[i] = true;
    }
    

    that calls:

    std::vector<bool,std::allocator<bool> >::operator[]() line 132 + 37 bytes
    

    which is this code (that I reformatted a little bit):

    reference operator[](size_type _P){
    ==> return (*(begin() + _P));
    }
    

    that calls:

    std::vector<bool,std::allocator<bool> >::begin() line 93 + 25 bytes
    

    which is doing this (in more detail):

    92:       iterator begin()
    93:           {return (_First); }
    00402890   push        ebp
    00402891   mov         ebp,esp
    00402893   sub         esp,44h
    00402896   push        ebx
    00402897   push        esi
    00402898   push        edi
    00402899   push        ecx
    0040289A   lea         edi,[ebp-44h]
    0040289D   mov         ecx,11h
    004028A2   mov         eax,0CCCCCCCCh
    004028A7   rep stos    dword ptr [edi]
    004028A9   pop         ecx    <===============
    004028AA   mov         dword ptr [ebp-4],ecx
    004028AD   mov         eax,dword ptr [ebp-4]
    004028B0   mov         eax,dword ptr [eax+4]
    004028B3   pop         edi
    004028B4   pop         esi
    004028B5   pop         ebx
    004028B6   mov         esp,ebp
    004028B8   pop         ebp
    004028B9   ret
    

    What it is doing is writing 68 bytes of 0xCC on the stack (for some debug reason) as part of getting the begin address of the vector, as part of computing the address of v[i], prior to doing the assignment.

    The fraction of time it spends doing this is near 100%, because it was doing it on every one of the several samples that were taken. Could you have guessed that that's what it was spending nearly all of its time doing? I couldn't have.

    This is, of course, a Debug build. If you switch to a Release build, but turn on debugging information, all these functions get inlined and optimized away, so it goes like 30 times faster, and again the stackshots tell exactly what it's doing.

    So - people can tell you what it might be doing, but this shows how to find out for yourself what it is really doing.

    On your environment it will undoubtedly be different.

    0 讨论(0)
  • 2020-12-14 12:47

    You're using std::vector of bool and that's not what you think it is!

    vector of bool is a bastard child template specialization that should never have existed and actually stores 1 bool in each bit. Accesses into it are more complicated due to masking and shifting logic, so will definitely be somewhat slower.

    Click here for some info on vector of bool.

    Also, you may be running an unoptimized build (almost certainly given the times you listed, 27 seconds is outrageous for 4 million iterations). The Standard Template Library relies very heavily on the optimizer to do things like inline function calls and elide temporaries. Lack of this optimization causes an especially heavy performance degradation for vector of bool because it has to return a proxy object when you index into it, because you can't take the address of a bit, so operator [] can't return a reference.

    Click here for more info on proxied containers (The last half is about vector of bool)

    In addition many STL implementations have helpful debugging bits that are not part of the standard which help you catch bugs, but really drag performance down. You'll want to make sure those are disabled in your optimized build.

    Once you turn on the optimizer, have the right settings (i.e. no STL debugging turned on), and are actually testing the same thing in both loops, you will see almost no difference.

    I had to make your loop much larger to test on my machine, but here's two builds of your vector of bool loop on my machine, showing the impact of optimizer flags on STL code

    $ g++ main.cpp 
    $ ./a.out 
    17 seconds.
    $ g++ -O2 main.cpp 
    $ ./a.out 
    1 seconds.
    
    0 讨论(0)
  • 2020-12-14 12:55

    std::vector<bool> is optimized for memory consumption rather than performance.

    You may trick it by using std::vector<int>. You should not have performance drawbacks then.

    0 讨论(0)
  • 2020-12-14 13:06

    Have a look at this

    Using arrays or std::vectors in C++, what’s the performance gap?

    0 讨论(0)
提交回复
热议问题