In-place C++ set intersection

Submitted by 坚强是说给别人听的谎言 on 2019-12-30 00:52:29

Question


The standard way of intersecting two sets in C++ is to do the following:

std::set<int> set_1;  // With some elements
std::set<int> set_2;  // With some other elements
std::set<int> the_intersection;  // Destination of intersect
std::set_intersection(set_1.begin(), set_1.end(), set_2.begin(), set_2.end(), std::inserter(the_intersection, the_intersection.end()));

How would I go about doing an in-place set intersection? That is, I want set_1 to have the results of the call to set_intersection. Obviously, I can just do a set_1.swap(the_intersection), but this is a lot less efficient than intersecting in-place.


Answer 1:


I think I've got it:

std::set<int>::iterator it1 = set_1.begin();
std::set<int>::iterator it2 = set_2.begin();
while ( (it1 != set_1.end()) && (it2 != set_2.end()) ) {
    if (*it1 < *it2) {
        set_1.erase(it1++);
    } else if (*it2 < *it1) {
        ++it2;
    } else { // *it1 == *it2
        ++it1;
        ++it2;
    }
}
// Anything left in set_1 from here on did not appear in set_2,
// so we remove it.
set_1.erase(it1, set_1.end());

Does anyone see any problems? It seems to be O(n) in the combined size of the two sets. According to cplusplus.com, std::set erase(position) is amortized constant, while erase(first, last) is logarithmic in the container size plus linear in the number of elements removed.
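For reference, the loop above can be wrapped in a function. This is only a sketch: the name intersect_in_place is made up for illustration, and it assumes C++11 or later, where std::set::erase(iterator) returns the iterator following the erased element (the erase(it1++) form above also works in older code).

```cpp
#include <set>

// Hypothetical helper (name chosen for illustration): keeps in `a` only the
// elements that also appear in `b`, walking both sets once in sorted order.
void intersect_in_place(std::set<int>& a, const std::set<int>& b) {
    auto it_a = a.begin();
    auto it_b = b.begin();
    while (it_a != a.end() && it_b != b.end()) {
        if (*it_a < *it_b) {
            // *it_a cannot be in b: erase it and continue with the next element.
            it_a = a.erase(it_a);
        } else if (*it_b < *it_a) {
            ++it_b;
        } else {
            // Equal elements: keep this one and advance both iterators.
            ++it_a;
            ++it_b;
        }
    }
    // b is exhausted, so nothing left in a can be in the intersection.
    a.erase(it_a, a.end());
}
```

Calling intersect_in_place(set_1, set_2) leaves set_2 untouched and modifies only set_1.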




Answer 2:


You can easily go through set_1, check each element to see if it exists in set_2, and erase it if it doesn't. Since sets are sorted, you can compare them in linear time, and erasing an element using an iterator is amortized constant time. I wouldn't count on it being more efficient than what you started with, though; benchmarking would be wise if it matters to you.
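A minimal sketch of the "check each element" idea, under some assumptions: the function name retain_common is hypothetical, and it uses std::set::count for the membership test, which costs a logarithmic lookup per element rather than the linear merge walk mentioned above. It assumes C++11 or later for the iterator returned by erase.

```cpp
#include <set>

// Hypothetical helper (name chosen for illustration): erases from `a` every
// element that is not found in `b`, using one lookup in `b` per element.
void retain_common(std::set<int>& a, const std::set<int>& b) {
    for (auto it = a.begin(); it != a.end(); ) {
        if (b.count(*it) == 0) {
            it = a.erase(it);  // not in b: remove and move to the next element
        } else {
            ++it;              // in b: keep it
        }
    }
}
```

The lookup-based form is simpler to read, and the same loop works if the second container is something other than a sorted set, as long as it offers a membership test.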




Answer 3:


This doesn't directly answer the question, but maybe someone will find it helpful.

With std::vector, it is not safe to use the standard algorithm with set_1.begin() as the output iterator (see the update below), even though the clang, gcc, and Microsoft implementations happen to work. Note that set_2 could be any sorted range, not just a std::vector.

std::vector<int> set_1;  // With some elements
std::vector<int> set_2;  // With some other elements
auto end = std::set_intersection(
                     set_1.begin(), set_1.end(), 
                     set_2.begin(), set_2.end(), 
                     set_1.begin() // intersection is written in set_1
                    );
set_1.erase(end, set_1.end()); // erase redundant elements

Update:

Thanks to @Keith, who found that the C++ Standard (25.4.5.3) requires the following:

The resulting range shall not overlap with either of the original ranges

So what I initially proposed is wrong according to the standard, although it works in the major STL implementations. If you want to be on the safe side and don't want extra allocations, copy an implementation of your choice into your code base and use it instead of std::set_intersection. I don't really understand the reason for this restriction; please comment if you know the answer.
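As a rough illustration of the "own implementation" route, here is a hand-written loop for sorted vectors. It is a sketch, not any library's code: the name intersect_sorted_in_place is made up, and both inputs are assumed to be sorted in ascending order. The write index never passes the read index, so writing back into the first vector does not clobber unread elements.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (name chosen for illustration). Both vectors are
// assumed to be sorted ascending. The result is written back into `a`.
void intersect_sorted_in_place(std::vector<int>& a, const std::vector<int>& b) {
    std::size_t out = 0;  // next write position in a; always <= i
    std::size_t i = 0;    // read position in a
    std::size_t j = 0;    // read position in b
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) {
            ++i;
        } else if (b[j] < a[i]) {
            ++j;
        } else {
            a[out++] = a[i];  // common element: keep it
            ++i;
            ++j;
        }
    }
    a.resize(out);  // drop the leftover tail
}
```

For duplicates, this keeps an element as many times as it appears in both inputs, which is the same counting rule std::set_intersection uses.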



Source: https://stackoverflow.com/questions/1773526/in-place-c-set-intersection
