The standard way of intersecting two sets in C++ is to do the following:
    std::set<int> set_1; // With some elements
    std::set<int> set_2; // With some other elements
You can easily go through set_1, check each element to see if it exists in set_2, and erase it if it doesn't. Since sets are sorted, you can compare them in linear time, and erasing an element using an iterator is amortized constant time. I wouldn't count on it being more efficient than what you started with, though; benchmarking would be wise if it matters to you.
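For comparison, here is a minimal sketch of the most literal reading of that description: look each element of the first set up in the second and erase it on a miss. The helper name `keep_common_by_lookup` is made up for illustration. Note that this variant does one logarithmic lookup per element, so it is O(n log m) rather than linear; the linear version needs two iterators walked in step.

```cpp
#include <set>

// Hypothetical helper: keep in `a` only the elements that also occur in `b`.
// Each b.count() lookup is O(log |b|), so the pass is O(|a| log |b|).
template <typename T>
void keep_common_by_lookup(std::set<T>& a, const std::set<T>& b) {
    for (auto it = a.begin(); it != a.end(); ) {
        if (b.count(*it) == 0) {
            it = a.erase(it);  // C++11: erase returns the next valid iterator
        } else {
            ++it;              // element is in both sets, keep it
        }
    }
}
```

It assumes C++11 or later, where `std::set::erase(iterator)` returns the iterator following the erased element.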
I think I've got it:
    std::set<int>::iterator it1 = set_1.begin();
    std::set<int>::iterator it2 = set_2.begin();
    while ( (it1 != set_1.end()) && (it2 != set_2.end()) ) {
        if (*it1 < *it2) {
            set_1.erase(it1++);
        } else if (*it2 < *it1) {
            ++it2;
        } else { // *it1 == *it2
            ++it1;
            ++it2;
        }
    }
    // Anything left in set_1 from here on did not appear in set_2,
    // so we remove it.
    set_1.erase(it1, set_1.end());
Anyone see any problems? It seems to be O(n) in the combined size of the two sets. According to cplusplus.com, std::set::erase(position) is amortized constant, while erase(first, last) is linear in the number of elements erased.
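One way to check for problems is to wrap the loop in a function and compare its result against the usual three-set `std::set_intersection`. The sketch below does that; the names `intersect_in_place` and `reference_intersection` are made up for illustration, and it assumes C++11 so that `erase` can return the next iterator instead of relying on `it1++`.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

// Hypothetical name: the two-iterator walk, using the C++11 form of erase,
// which returns the iterator following the removed element.
template <typename T>
void intersect_in_place(std::set<T>& a, const std::set<T>& b) {
    auto it1 = a.begin();
    auto it2 = b.begin();
    while (it1 != a.end() && it2 != b.end()) {
        if (*it1 < *it2) {
            it1 = a.erase(it1);  // only in a: drop it
        } else if (*it2 < *it1) {
            ++it2;               // only in b: skip it
        } else {
            ++it1;               // in both: keep it
            ++it2;
        }
    }
    a.erase(it1, a.end());       // the tail of a has no match in b
}

// Reference result built the usual way, into a third set.
template <typename T>
std::set<T> reference_intersection(const std::set<T>& a, const std::set<T>& b) {
    std::set<T> out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::inserter(out, out.end()));
    return out;
}
```

Computing `reference_intersection(set_1, set_2)` first and then calling `intersect_in_place(set_1, set_2)` should leave `set_1` equal to the reference result.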
This doesn't directly answer the question, but maybe someone will find it helpful.
In the case of std::vector, it is not safe to use the standard algorithm with set_1.begin() as the output iterator (see below), although the clang/gcc/Microsoft implementations happen to work. Note that set_2 could be anything, not just a std::vector.
    std::vector<int> set_1; // With some elements
    std::vector<int> set_2; // With some other elements
    auto end = std::set_intersection(
        set_1.begin(), set_1.end(),
        set_2.begin(), set_2.end(),
        set_1.begin() // intersection is written in set_1
    );
    set_1.erase(end, set_1.end()); // erase redundant elements
Update:
Thanks to @Keith, who found that the C++ Standard (25.4.5.3) requires the following:
The resulting range shall not overlap with either of the original ranges
So what I initially proposed is wrong, even though it works in the major STL implementations. If you want to be on the safe side and don't want extra allocations, copy the implementation of your choice into your code base and use it instead of std::set_intersection. I don't really understand the reason for this restriction; please comment if you know the answer.
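As a rough idea of what such a local copy could look like, here is a minimal hand-written sketch for sorted vectors. The name `intersect_sorted_in_place` is made up, and this is not taken from any particular STL implementation. It assumes both inputs are sorted ascending and that `b` is a different object from `a`. Because it is ordinary user code, the standard's precondition on `std::set_intersection` does not apply to it, and the write position never gets ahead of the read position in `a`.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper, not part of the standard library.
// Both vectors must be sorted ascending; duplicates are matched pairwise,
// the same way std::set_intersection treats them.
template <typename T>
void intersect_sorted_in_place(std::vector<T>& a, const std::vector<T>& b) {
    auto out = a.begin();  // next slot for a kept element (out <= it1 always)
    auto it1 = a.begin();
    auto it2 = b.begin();
    while (it1 != a.end() && it2 != b.end()) {
        if (*it1 < *it2) {
            ++it1;         // only in a: skip it
        } else if (*it2 < *it1) {
            ++it2;         // only in b: skip it
        } else {
            if (out != it1) {
                *out = std::move(*it1);  // compact the match towards the front
            }
            ++out;
            ++it1;
            ++it2;
        }
    }
    a.erase(out, a.end());  // drop everything after the last kept element
}
```

No extra allocation happens here; the only container operation is the final `erase` on the tail of `a`.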