Std::Set with User Defined Type, How to Ensure No Duplicates

I want to use a set to remove duplicate elements and keep the order in which they were inserted

std::set relies on its comparator, which must define a strict weak ordering, both to order its elements and to keep each value unique. It always iterates in comparator order, so it cannot keep elements in the order they were inserted.

One possible solution is to have two containers: a std::set to hold the unique elements and a std::vector index to record the order in which they were inserted. The vector could contain iterators into the set, which is safe because inserting into a std::set does not invalidate existing iterators.

It might be convenient to encapsulate the two containers in your own class with its own iterator. Here is a bare-bones implementation:

#include <iostream>
#include <set>
#include <vector>

// Walks the insertion-order index and dereferences through to the set element.
class MySetIterator {
    std::vector<std::set<int>::iterator>::iterator pos;
public:
    MySetIterator(std::vector<std::set<int>::iterator>::iterator pos) : pos(pos) {}
    int operator*() const { return **pos; }
    MySetIterator& operator++() { ++pos; return *this; }
    bool operator!=(const MySetIterator& rhs) const { return pos != rhs.pos; }
};

class MySet {
    std::set<int> vals;                          // unique values
    std::vector<std::set<int>::iterator> order;  // insertion order
public:
    void insert(int val) {
        auto ret = vals.insert(val);
        if (ret.second)  // record the position only for a new value
            order.push_back(ret.first);
    }
    MySetIterator begin() { return {order.begin()}; }
    MySetIterator end() { return {order.end()}; }
};

int main() {
    MySet my_set;

    my_set.insert(5);
    my_set.insert(3);
    my_set.insert(9);
    my_set.insert(1);
    my_set.insert(5);
    my_set.insert(5);
    for (int val : my_set)
        std::cout << val << " ";  // prints: 5 3 9 1
}

C++ feature, like std::set, which allows duplicates

You can use std::multiset. That should work for what you are describing.

How to use insert in the set in c++ for user defined data type?

std::vector already has an ordering - lexicographical order - so you normally don't need to do anything with that.

If you rely on the default vector ordering, you always need to define an ordering for your own element class, because the vector comparison compares elements with their operator< (see the example below for a case where you don't need to). The most common way is to overload operator<.

Note that the ordering relation must be a strict weak ordering; otherwise the behavior of the set is undefined.

If you want a special sense of "equality" for the set, you need to define your own.

For example, this code would make a set where vectors of equal length are considered equal (so only the first one encountered of each length is added to the set):

#include <set>
#include <vector>

// Orders vectors by length only, so vectors of equal length are equivalent.
template<typename T>
struct shorter_vector
{
    bool operator() (const std::vector<T>& left, const std::vector<T>& right) const
    {
        return left.size() < right.size();
    }
};

// ...
struct A { int x; };
std::set<std::vector<A>, shorter_vector<A>> samelengths;
samelengths.insert({A{1}});
samelengths.insert({A{2}});
samelengths.insert({A{3},A{4}});
samelengths.insert({A{5},A{67}});
// set now contains {A{1}} and {A{3},A{4}}

Note that this set doesn't need an ordering for the vector's elements, since the comparator looks only at the vectors' sizes and never compares the elements themselves.

set with custom struct contains duplicates

Your comparison function should return whether some element is smaller than another, not whether or not they are equal. (More formally, it must define a "Strict weak ordering" on the elements of your set.)

Use something like

#include <tuple>  // std::tie

struct compare {
    // Orders by a first, then by b.
    bool operator() (const number& lhs, const number& rhs) const {
        return std::tie(lhs.a, lhs.b) < std::tie(rhs.a, rhs.b);
    }
};

If you don't care about ordering, you may want to define a suitable hash function for your type and use std::unordered_set.

To avoid future problems like this, make sure to read the docs. They clearly explain what your comparison function is supposed to do.

For reference: std::tie as used above constructs tuples of references to its arguments which can then be compared lexicographically with <. This is an easy, generic and fast way to build some ordering for collections of less-than-comparable stuff.

Which operator needs to be overridden in order to use std::set in the C++ code?

You don't have to overload any operator: the std::set class template lets you provide a comparison function object as a template parameter. If you do provide an operator, the one needed is operator<, returning bool, and it must implement a strict weak ordering. See this std::set documentation.

Strict weak ordering is required because set is an ordered container, typically implemented as a self-balancing binary search tree. It is not enough to know whether two elements are the same; the set must be able to order them. The less-than operator or comparator functor is also used to test for equivalence: two elements a and b are treated as the same when neither a < b nor b < a holds.

Can I use std::next on user defined class

// How to implement a forward iterator for your collection class

#include <cstddef>   // std::ptrdiff_t
#include <iterator>  // std::forward_iterator_tag

template <typename T>
class MyCollection
{
public:

    struct MyCollectionIterator
    {
        // These five typedefs tell other code about your iterator

        using iterator_category = std::forward_iterator_tag;
        using value_type = T;
        using difference_type = std::ptrdiff_t;
        using pointer = T*;
        using reference = T&;

        explicit MyCollectionIterator( ... ) ... {}

        // These five methods implement the minimum required behavior of a forward iterator

        reference operator * () const {...}
        MyCollectionIterator & operator ++ () {...}
        MyCollectionIterator operator ++ (int) {...}

        bool operator == ( const MyCollectionIterator & that ) const {...}
        bool operator != ( const MyCollectionIterator & that ) const {...}
    };

    MyCollectionIterator begin() { return MyCollectionIterator(...); }
    MyCollectionIterator end() { return MyCollectionIterator(...); }
};

There are other iterator types beyond a forward iterator. If possible, you should implement the most capable iterator type you can: if not random access, then bidirectional, and if not bidirectional, then forward.

Iterators have been an increasingly frightening thing to look at in C++ (see docs here), but the basic idea is simply a class that knows how to pretend to be a pointer sufficient to access your collection’s data. To do that it must provide certain kinds of information and capabilities.

That little table of iterator types in the linked docs will help you when adding the required functionality to your iterator class.


