Using an Iterator to Divide an Array into Parts with Unequal Size

没有蜡笔的小新 2020-12-02 02:49

I have an array which I need to divide up into 3-element sub-arrays. I wanted to do this with iterators, but I end up iterating past the end of the array and segfaulting.

3 Answers
  •  我在风中等你
    2020-12-02 03:35

    There is some disagreement about the most effective way to accomplish this iteration through array partitions.

    First, the one-time integer modulo method. In addition to the changes in my answer, this must define auto size itself (via distance), because gcc does not yet support size:

    auto foo = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };  
    auto size = distance(cbegin(foo), cend(foo));
    auto bar = cbegin(foo);
    auto finish = prev(cend(foo), size % 3);
    
    for(auto it = size <= 3 ? cend(foo) : next(bar, 3); it != finish; bar = it, it = next(bar, 3)) {
        for_each(bar, it, [](const auto& i) { cout << i << '\t'; });
        cout << endl;
    }
    
    for_each(bar, finish, [](const auto& i) { cout << i << '\t'; });
    cout << endl;
    for_each(finish, cend(foo), [](const auto& i) { cout << i << '\t'; });
    cout << endl;
    

    This compiles to 112 lines of assembly. Most notably, the conditional it != finish generates these instructions:

    cmpq    %r12, %r13
    je      .L19
    movq    %r12, %rbx
    jmp     .L10
    

    Second, the repeated iterator subtraction method, using Ben Voigt's try_advance. Only the random access specialization is used, because there is a compiler conflict for random access iterators:

    auto foo = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };  
    auto bar = cbegin(foo);
    
    for (auto it = cbegin(foo), end = cend(foo); try_advance(it, 3, end); bar = it) {
        for_each(bar, it, [](const auto& i) { cout << i << '\t'; });
        cout << endl;
    }
    
    for_each(bar, cend(foo), [](const auto& i) { cout << i << '\t'; });
    cout << endl;
    

    This compiles to 119 lines of assembly. Most notably, the conditional in try_advance, if (end - it < stride) return false;, is evaluated on every iteration and generates this code:

    movq    %r12, %rax
    subq    %rbp, %rax
    cmpq    $11, %rax
    ja      .L3
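
    The try_advance helper itself is not shown in this answer. A minimal sketch of what the random access version might look like, assuming only the conditional quoted above (the signature and the count_full_groups helper are hypothetical, not Ben Voigt's actual code):

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical sketch: advance `it` by `stride` only if at least `stride`
// elements remain before `end`. Requires random access iterators, because it
// relies on `end - it` and `it += stride`.
template <typename It>
bool try_advance(It& it, typename std::iterator_traits<It>::difference_type stride, It end) {
    assert(stride > 0);
    if (end - it < stride) return false;  // the per-iteration subtract and compare
    it += stride;
    return true;
}

// Illustration only: counts how many full groups of `stride` fit in `values`.
inline std::size_t count_full_groups(const std::vector<int>& values, std::ptrdiff_t stride) {
    std::size_t groups = 0;
    for (auto it = values.cbegin(), end = values.cend(); try_advance(it, stride, end);) {
        ++groups;
    }
    return groups;
}
```

    Because the check happens before the iterator moves, the iterator is never advanced past end, which is what avoids the segfault described in the question.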
    

    Upon learning that cmpq is really just a subtract-and-compare operation, I wrote some benchmarking code: http://coliru.stacked-crooked.com/a/ad869f69c8dbd96f I needed to use Coliru to be able to turn on optimization, but it keeps giving me bogus increments of my test count for times; I'm not sure what's going on there. What I can say is that locally, the repeated iterator subtraction is always faster, sometimes significantly so. Upon learning this, I believe that Ben Voigt's answer should be marked as the correct one.

    EDIT:

    I've made an interesting discovery: the algorithm that goes first always loses. I've rewritten the code to swap the first algorithm on each pass. When this is done, the integer modulo method always beats the iterator subtraction method, as would be expected from looking at the assembly. Again, something fishy is going on with Coliru, but you can take this code and run it locally: http://coliru.stacked-crooked.com/a/eb3e0c70cc138ecf
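
    The order swapping can be sketched roughly as follows. This is not the linked Coliru code; the names alternate_benchmark, BenchTotals and BenchClock are made up for illustration, and it assumes each algorithm is wrapped in a callable:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>

using BenchClock = std::chrono::steady_clock;

// Accumulated running time of each candidate across all passes.
struct BenchTotals {
    BenchClock::duration first{};
    BenchClock::duration second{};
};

// Times `a` and `b` over `passes` passes, swapping which one runs first on
// every pass so that neither candidate always pays the "goes first" penalty.
inline BenchTotals alternate_benchmark(const std::function<void()>& a,
                                       const std::function<void()>& b,
                                       std::size_t passes) {
    assert(passes % 2 == 0);  // an even count lets each candidate go first equally often
    const auto timed = [](const std::function<void()>& f) {
        const auto start = BenchClock::now();
        f();
        return BenchClock::now() - start;
    };
    BenchTotals totals;
    for (std::size_t pass = 0; pass < passes; ++pass) {
        if (pass % 2 == 0) {
            totals.first += timed(a);
            totals.second += timed(b);
        } else {
            totals.second += timed(b);
            totals.first += timed(a);
        }
    }
    return totals;
}
```

    Comparing totals.first with totals.second after a large, even number of passes should remove the bias toward whichever algorithm runs second.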


    The next issue is that both of these algorithms are lazy: in the event that size(foo) is a multiple of 3, they allocate an empty vector at the end of the vector of partitions (in the printing versions above, an empty final line). Remedying that requires significant branching for the integer modulo algorithm, but only the simplest of changes for the repeated iterator subtraction algorithm. The resulting algorithms exhibit effectively equal benchmark numbers, but the edge goes to the repeated iterator subtraction for simplicity:

    Integer modulo algorithm:

    auto bar = cbegin(foo);
    const auto size = distance(bar, cend(foo));
    
    if (size <= 3) {
        for_each(bar, cend(foo), [](const auto& i) { cout << i << '\t'; });
        cout << endl;
    }
    else {
        auto finish = prev(cend(foo), (size - 1) % 3 + 1);
    
        for (auto it = next(bar, 3); it != finish; bar = it, advance(it, 3)) {
            for_each(bar, it, [](const auto& i) { cout << i << '\t'; });
            cout << endl;
        }
    
        for_each(bar, finish, [](const auto& i) { cout << i << '\t'; });
        cout << endl;
        for_each(finish, cend(foo), [](const auto& i) { cout << i << '\t'; });
        cout << endl;
    }
    

    Repeated iterator subtraction algorithm:

    auto bar = cbegin(foo);
    
    for (auto it = cbegin(foo); distance(it, cend(foo)) > 3; bar = it) {
        advance(it, 3);
        for_each(bar, it, [](const auto& i) { cout << i << '\t'; });
        cout << endl;
    }
    
    for_each(bar, cend(foo), [](const auto& i) { cout << i << '\t'; });
    cout << endl;
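
    To check the claim that no empty trailing partition is produced, the same loop can collect the partitions instead of printing them. A sketch under that assumption (split_by_subtraction is a made-up name, and unlike the printing version it skips the final group when the input is empty):

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Splits `values` into groups of `step` elements using the repeated iterator
// subtraction loop. The final group holds the remaining 1..step elements, so
// a non-empty input never yields an empty group.
inline std::vector<std::vector<int>> split_by_subtraction(const std::vector<int>& values,
                                                          std::ptrdiff_t step) {
    assert(step > 0);
    std::vector<std::vector<int>> groups;
    auto bar = values.cbegin();
    for (auto it = values.cbegin(); std::distance(it, values.cend()) > step; bar = it) {
        std::advance(it, step);
        groups.emplace_back(bar, it);
    }
    // Assumption for this sketch: an empty input produces no groups at all.
    if (bar != values.cend()) groups.emplace_back(bar, values.cend());
    return groups;
}
```

    With 9 elements and a step of 3 this yields three groups of three; with 10 elements it yields a fourth group holding the single leftover element.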
    

    EDIT: Throwing the Remaining Size Algorithm into the hat

    Both the Integer Modulo and Repeated Subtraction Algorithms above iterate over the input sequence more than once. Apart from being slower, this isn't that serious, because currently we're using a Bidirectional Iterator; but should our input iterator fail to qualify as a Bidirectional Iterator, this would be excessively expensive. Independent of iterator type, the Remaining Size Algorithm beats all challengers every time at 10,000,000+ testbench iterations:

    auto bar = cbegin(foo);
    
    for (auto i = size(foo); i > STEP; i -= STEP) {
        for(auto j = 0; j < STEP; ++j, ++bar) cout << *bar << '\t';
        cout << endl;
    }
    
    // Print whatever is left over (1 to STEP elements when foo is non-empty)
    for(; bar != cend(foo); ++bar) cout << *bar << '\t';
    cout << endl;
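
    To illustrate the single-pass property, here is a sketch of the same idea written against a plain input iterator and an explicit element count. The function names and the istream_iterator usage are assumptions for illustration, not part of the benchmarked code:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>

// Writes `count` elements starting at `first`, `step` per line, each followed
// by a tab. Every element is read exactly once, so a single-pass input
// iterator is enough.
template <typename InputIt>
void print_in_groups(InputIt first, std::size_t count, std::size_t step, std::ostream& out) {
    assert(step > 0);
    for (; count > step; count -= step) {
        for (std::size_t j = 0; j < step; ++j, ++first) out << *first << '\t';
        out << '\n';
    }
    // The last line holds the remaining 1..step elements (none if count was 0).
    for (; count > 0; --count, ++first) out << *first << '\t';
    out << '\n';
}

// Usage sketch: group integers parsed from text. `count` must not exceed the
// number of integers actually present in `text`.
inline std::string group_text(const std::string& text, std::size_t count, std::size_t step) {
    std::istringstream in(text);
    std::ostringstream out;
    print_in_groups(std::istream_iterator<int>(in), count, step, out);
    return out.str();
}
```

    The trade-off is that the element count has to be known up front, which is what size(foo) provides in the loop above.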
    

    I've again copied my local testing to Coliru, which gives weird results but you can verify locally: http://coliru.stacked-crooked.com/a/361f238216cdbace
