How to Find the Intersection of Two STL Sets

How to find the intersection of two STL sets?

You haven't provided an output iterator for set_intersection:

template <class InputIterator1, class InputIterator2, class OutputIterator>
OutputIterator set_intersection ( InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2 last2,
OutputIterator result );

Fix this by doing something like

...;
set<int> intersect;
set_intersection(s1.begin(), s1.end(), s2.begin(), s2.end(),
std::inserter(intersect, intersect.begin()));

You need std::inserter because the set is currently empty. std::back_inserter and std::front_inserter cannot be used, since std::set has no push_back or push_front.
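
A minimal, self-contained sketch of the fix above. The helper name intersect is illustrative and not part of the original question.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

// Hypothetical helper: returns the elements present in both s1 and s2.
std::set<int> intersect(const std::set<int>& s1, const std::set<int>& s2)
{
    std::set<int> result;
    // std::inserter calls result.insert(hint, value), which works on an
    // empty set; the hint is only a performance suggestion.
    std::set_intersection(s1.begin(), s1.end(), s2.begin(), s2.end(),
                          std::inserter(result, result.begin()));
    return result;
}
```

For example, intersect({1, 2, 3, 4}, {3, 4, 5}) yields a set holding 3 and 4.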

How do I find the intersection of 2 sets?

You can use std::set_intersection. Here is an example of how to use it:

#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
int main()
{
    std::vector<int> v1{2, 3, 5, 7, 11};
    std::vector<int> v2{1, 3, 5, 7, 9, 11};
    // std::set_intersection requires both input ranges to be sorted
    std::sort(v1.begin(), v1.end());
    std::sort(v2.begin(), v2.end());

    std::vector<int> v_intersection;

    std::set_intersection(v1.begin(), v1.end(),
                          v2.begin(), v2.end(),
                          std::back_inserter(v_intersection));
    for(int n : v_intersection)
        std::cout << n << ' ';
}

Results will be:

3 5 7 11

How to compute the size of an intersection of two STL sets in C++

It's not difficult to write a loop that moves through the two sets looking for matching elements, or you could do this, which is much simpler than a custom iterator:

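#include <algorithm>
#include <cstddef>
#include <iterator>

// Minimal "container" that only counts what std::back_inserter pushes into it.
struct Counter
{
    // Accepts any element type and discards the value.
    struct value_type { template<typename T> value_type(const T&) { } };
    void push_back(const value_type&) { ++count; }
    std::size_t count = 0;
};

template<typename T1, typename T2>
std::size_t intersection_size(const T1& s1, const T2& s2)
{
    Counter c;
    std::set_intersection(s1.begin(), s1.end(), s2.begin(), s2.end(), std::back_inserter(c));
    return c.count;
}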
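
The plain loop mentioned above could look roughly like this. It is a sketch that assumes both containers iterate in ascending order (as std::set does); the name intersection_size_loop is made up for illustration.

```cpp
#include <cstddef>
#include <set>

// Counts elements common to two sorted containers using the same
// two-iterator walk that std::set_intersection performs.
template <typename T1, typename T2>
std::size_t intersection_size_loop(const T1& s1, const T2& s2)
{
    std::size_t count = 0;
    auto it1 = s1.begin();
    auto it2 = s2.begin();
    while (it1 != s1.end() && it2 != s2.end()) {
        if (*it1 < *it2) {
            ++it1;              // *it1 cannot be in s2 any more
        } else if (*it2 < *it1) {
            ++it2;              // *it2 cannot be in s1 any more
        } else {
            ++count;            // equal: one common element
            ++it1;
            ++it2;
        }
    }
    return count;
}
```

No output container is built, so nothing is allocated beyond the counter.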

Attempting to find the intersection of two sets of Points, error No viable overloaded '='

You are trying to assign to a const Point.

Elements of a set cannot be changed through iterators. Sets (and maps) are typically implemented as red-black trees, and the position of an element in the tree depends on the value of the key (sets have only keys). If you could modify the key, the tree would have to detect this and rearrange itself, or it would break.

Even though std::set<Point> declares both an iterator and a const_iterator, the type returned by std::set<Point>::iterator::operator*() (the result of *output) is const Point&.

This has been true since 1998 (https://cplusplus.github.io/LWG/issue103).

If you want an out parameter of type Point, use a reference, not a set iterator.
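
If the goal is to collect the intersection into another std::set<Point>, one option is std::inserter, which inserts instead of assigning through a set iterator. This is a sketch only: the Point definition, its ordering, and the name intersect_points are assumptions, since the original question's code is not shown here.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <tuple>

// Assumed Point type; the original definition is not available.
struct Point {
    int x;
    int y;
};

// std::set needs a strict weak ordering; compare (x, y) lexicographically.
inline bool operator<(const Point& a, const Point& b)
{
    return std::tie(a.x, a.y) < std::tie(b.x, b.y);
}

std::set<Point> intersect_points(const std::set<Point>& a,
                                 const std::set<Point>& b)
{
    std::set<Point> out;
    // Writing through out.begin() would assign to a const Point and fail to
    // compile; std::inserter calls out.insert() instead.
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::inserter(out, out.begin()));
    return out;
}
```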

In-place C++ set intersection

I think I've got it:

std::set<int>::iterator it1 = set_1.begin();
std::set<int>::iterator it2 = set_2.begin();
while ( (it1 != set_1.end()) && (it2 != set_2.end()) ) {
    if (*it1 < *it2) {
        // it1++ advances before erase invalidates the old position
        set_1.erase(it1++);
    } else if (*it2 < *it1) {
        ++it2;
    } else { // *it1 == *it2
        ++it1;
        ++it2;
    }
}
// Anything left in set_1 from here on did not appear in set_2,
// so we remove it.
set_1.erase(it1, set_1.end());

Anyone see any problems? It seems to be O(n) in the combined size of the two sets. According to cplusplus.com, std::set erase(position) is amortized constant. The range form erase(first, last) costs O(log n) plus time linear in the number of elements removed, which still keeps the total linear.

Efficient intersection of two sets

For sets that are implemented as binary trees, there actually is an algorithm that combines the benefits of both the procedures you mention. Essentially, you do a merge like std::set_intersection, but while iterating in one tree, you skip any branches that are all less than the current value in the other.

The resulting intersection takes O(min(n1 log n2, n2 log n1, n1 + n2)) time, which is just what you want.

Unfortunately, I'm pretty sure std::set doesn't provide interfaces that could support this operation.

I've done it a few times in the past though, when working on joining inverted indexes and similar things. Usually I make iterators with a skipTo(x) operation that will advance to the next element >= x. To meet my promised complexity it has to be able to skip N elements in log(N) amortized time. Then an intersection looks like this:

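// Pseudocode: skipTo(x) is a custom iterator operation that advances to the
// first element >= x. std::set iterators do not provide it.
template <typename T>
void get_intersection(vector<T> *dest, const set<T> &set1, const set<T> &set2)
{
    auto end1 = set1.end();
    auto end2 = set2.end();
    auto it1 = set1.begin();
    if (it1 == end1)
        return;
    auto it2 = set2.begin();
    if (it2 == end2)
        return;
    for (;;)
    {
        it1.skipTo(*it2);
        if (it1 == end1)
            break;
        if (*it1 == *it2)
        {
            dest->push_back(*it1);
            ++it1;
            if (it1 == end1) // do not dereference the end iterator below
                break;
        }
        it2.skipTo(*it1);
        if (it2 == end2)
            break;
        if (*it2 == *it1)
        {
            dest->push_back(*it2);
            ++it2;
            if (it2 == end2) // do not dereference the end iterator above
                break;
        }
    }
}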

It easily extends to an arbitrary number of sets using a vector of iterators, and pretty much any ordered collection can be extended to provide the iterators required -- sorted arrays, binary trees, b-trees, skip lists, etc.
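
For sorted arrays, the same leapfrogging idea can be sketched with std::lower_bound standing in for skipTo. The names skip_to and leapfrog_intersection are hypothetical. Note the assumption: each skip here is a binary search over the remaining range, so matching the amortized bound quoted above would need an exponential (galloping) search instead.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical skipTo for a sorted range: returns the first position in
// [it, end) whose element is >= x.
template <typename It, typename T>
It skip_to(It it, It end, const T& x)
{
    return std::lower_bound(it, end, x);
}

// Intersects two sorted, duplicate-free vectors by leapfrogging.
template <typename T>
std::vector<T> leapfrog_intersection(const std::vector<T>& a,
                                     const std::vector<T>& b)
{
    std::vector<T> out;
    auto it1 = a.begin();
    auto it2 = b.begin();
    while (it1 != a.end() && it2 != b.end()) {
        it1 = skip_to(it1, a.end(), *it2);   // now *it1 >= *it2
        if (it1 == a.end())
            break;
        it2 = skip_to(it2, b.end(), *it1);   // now *it2 >= *it1
        if (it2 == b.end())
            break;
        if (!(*it1 < *it2)) {                // neither is smaller: equal
            out.push_back(*it1);
            ++it1;
            ++it2;
        }
    }
    return out;
}
```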

Is it possible to find the intersection of 3 sets in c++ in a single line?

You can, but it'd be an extremely long line that would piss off anyone who would try to read it.

As for a more "elegant" approach, you should split up what you're trying to do into functions:

vector <int> getIntersection(vector < vector <int> > &sets)
{
    vector <int> result; // To store the resultant set
    int smallSetInd = 0; // Initialize index of smallest set
    int minSize = sets[0].size(); // Initialize size of smallest set

    // sort all the sets (including sets[0], which binary_search below
    // relies on), and also find the smallest set
    for (int i = 0 ; i < sets.size() ; i++)
    {
        // sort this set
        sort(sets[i].begin(), sets[i].end());

        // update minSize, if needed
        if (minSize > sets[i].size())
        {
            minSize = sets[i].size();
            smallSetInd = i;
        }
    }

    map<int,int> elementsMap;

    // Add all the elements of smallest set to a map, if already present,
    // update the frequency
    for (int i = 0; i < sets[smallSetInd].size(); i++)
    {
        if (elementsMap.find( sets[smallSetInd][i] ) == elementsMap.end())
            elementsMap[ sets[smallSetInd][i] ] = 1;
        else
            elementsMap[ sets[smallSetInd][i] ]++;
    }

    // iterate through the map elements to see if they are present in
    // remaining sets
    map<int,int>::iterator it;
    for (it = elementsMap.begin(); it != elementsMap.end(); ++it)
    {
        int elem = it->first;
        int freq = it->second;

        bool bFound = true;

        // Iterate through all sets
        for (int j = 0 ; j < sets.size() ; j++)
        {
            // If this set is not the smallest set, then do binary search in it
            if (j != smallSetInd)
            {
                // If the element is found in this set, then find its frequency
                if (binary_search( sets[j].begin(), sets[j].end(), elem ))
                {
                    int lInd = lower_bound(sets[j].begin(), sets[j].end(), elem)
                               - sets[j].begin();
                    int rInd = upper_bound(sets[j].begin(), sets[j].end(), elem)
                               - sets[j].begin();

                    // Update the minimum frequency, if needed
                    if ((rInd - lInd) < freq)
                        freq = rInd - lInd;
                }
                // If the element is not present in this set, then no need
                // to proceed for this element.
                else
                {
                    bFound = false;
                    break;
                }
            }
        }

        // If element was found in all sets, then add it to result 'freq' times
        if (bFound)
        {
            for (int k = 0; k < freq; k++)
                result.push_back(elem);
        }
    }
    return result;
}

and then in your main (or wherever you need to get the intersection) do something like:

vector < vector <int> > sets; 
vector <int> set1;
set1.push_back(1);
set1.push_back(1);
set1.push_back(2);
set1.push_back(2);
set1.push_back(5);

sets.push_back(set1);

vector <int> set2;
set2.push_back(1);
set2.push_back(1);
set2.push_back(4);
set2.push_back(3);
set2.push_back(5);
set2.push_back(9);

sets.push_back(set2);

vector <int> set3;
set3.push_back(1);
set3.push_back(1);
set3.push_back(2);
set3.push_back(3);
set3.push_back(5);
set3.push_back(6);

sets.push_back(set3);

vector <int> r = getIntersection(sets);

And now you have a vector of the intersection between however many sets you put in. It can be 3, it can be 30,000, whatever.

C++ library method for intersection of two unordered_set

In fact, a loop-based solution is the best thing you can use with std::unordered_set.

There is an algorithm called std::set_intersection which finds the intersection of two sorted ranges:

Constructs a sorted range beginning at d_first consisting of elements
that are found in both sorted ranges [first1, last1) and [first2,
last2).

As you deal with std::unordered_set, you cannot apply this algorithm because there is no guaranteed order for the elements in std::unordered_set.

My advice is to stick with loops: a loop says explicitly what you want to achieve and has linear complexity on average (O(N), where N is the number of elements in the unordered set you traverse with the for loop), which is the best complexity you can achieve.
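
A loop-based version might look like this. It is a sketch; the helper name intersect_unordered is made up, and iterating over the smaller set is a design choice rather than something the answer above prescribes.

```cpp
#include <unordered_set>

// Hypothetical helper: intersection of two unordered sets via a plain loop.
template <typename T>
std::unordered_set<T> intersect_unordered(const std::unordered_set<T>& a,
                                          const std::unordered_set<T>& b)
{
    // Walk the smaller set and probe the larger one, so the loop body runs
    // min(|a|, |b|) times with average constant-time lookups.
    const auto& smaller = a.size() <= b.size() ? a : b;
    const auto& larger  = a.size() <= b.size() ? b : a;

    std::unordered_set<T> out;
    for (const auto& x : smaller) {
        if (larger.count(x) != 0)
            out.insert(x);
    }
    return out;
}
```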

Efficient & safe intersection of more than two sets

As you can read on cppreference,

[...] The resulting range cannot overlap with either of the input ranges.

so you're in undefined behavior land.

To check this empirically, I copied your code, compiled it, and ran it, and for me it prints 23, so your correct result is just a coincidence.

Therefore, it looks like you have to rely on another temporary.

The STL doesn't seem to contain a solution for intersecting more than two sets, and you can't even use std::set_intersection in a nested fashion (e.g. result = my_set_intersection(set_1, my_set_intersection(set_2, set_3))). The reason is pretty simple: the algorithm's interface is "tainted" by iterators, i.e. it takes begin and end iterators to the sets, rather than the sets themselves, as inputs, and it also returns an iterator.

Boost probably has something useful, but I haven't found it yet.
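
One way to rely on a temporary, sketched under the assumption that the inputs are std::set<int> objects collected in a vector (the helper name intersect_all is hypothetical):

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <set>
#include <utility>
#include <vector>

// Hypothetical helper: intersects any number of sets, writing each step
// into a fresh temporary so the output never overlaps an input range.
std::set<int> intersect_all(const std::vector<std::set<int>>& sets)
{
    if (sets.empty())
        return {};

    std::set<int> result = sets.front();
    // Stop early once the running intersection is empty.
    for (std::size_t i = 1; i < sets.size() && !result.empty(); ++i) {
        std::set<int> tmp;
        std::set_intersection(result.begin(), result.end(),
                              sets[i].begin(), sets[i].end(),
                              std::inserter(tmp, tmp.begin()));
        result = std::move(tmp);   // the temporary becomes the new result
    }
    return result;
}
```

Moving the temporary into result avoids copying the elements a second time at each step.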


