How to Use Unordered_Set with Custom Types

How to use unordered_set with custom types?

The standard library contains specialisations of std::hash<T> for the fundamental types, for pointers and for std::string (or rather, for all specializations of std::basic_string).

Unfortunately the library does not contain the following vital new-from-old combination function, which is however part of Boost, and which you should copy into your code:

template <class T>
inline void hash_combine(std::size_t & seed, const T & v)
{
std::hash<T> hasher;
seed ^= hasher(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

With this function, you can hash pairs, tuples, arrays, and any sort of range of elements that are themselves hashable. Browse the Boost sources for many examples and useful implementations. And obviously you can use this function to create a hash function for your own types. For example, here's hashing a pair:

template<typename S, typename T> struct pair_hash<std::pair<S, T>>
{
inline std::size_t operator()(const std::pair<S, T> & v) const
{
std::size_t seed = 0;
hash_combine(seed, v.first);
hash_combine(seed, v.second);
return seed;
}
};

Please be aware, though, that hash-combining does not produce good hash values. The results have very poor statistic qualities (e.g. it is very easy to create hash collisions). Good hashing needs to be able to see all the raw input bits, and cannot be factored through partial hashes. (That's why there isn't a better solution in the current standard library; nobody has been able to come up with a satisfactory design.)

How can I use an unordered_set with a custom struct?

The second template parameter to std::unordered_set is the type to use for hashing. and will default to std::hash<Point> in your case, which doesn't exist. So you can use std::unordered_set<Point,Point> if the hasher is the same type.

Alternatively if you do not want to specify the hasher, define a specialization of std::hash for Point and either get rid of the member function and implement the hashing in the body of your specialization's operator(), or call the member function from the std::hash specialization.

#include <unordered_set>

struct Point {
int X;
int Y;

Point() : X(0), Y(0) {};
Point(const int& x, const int& y) : X(x), Y(y) {};
Point(const Point& other){
X = other.X;
Y = other.Y;
};

Point& operator=(const Point& other) {
X = other.X;
Y = other.Y;
return *this;
};

bool operator==(const Point& other) const {
if (X == other.X && Y == other.Y)
return true;
return false;
};

bool operator<(const Point& other) {
if (X < other.X )
return true;
else if (X == other.X && Y == other.Y)
return true;

return false;
};

// this could be moved in to std::hash<Point>::operator()
size_t operator()(const Point& pointToHash) const noexcept {
size_t hash = pointToHash.X + 10 * pointToHash.Y;
return hash;
};

};

namespace std {
template<> struct hash<Point>
{
std::size_t operator()(const Point& p) const noexcept
{
return p(p);
}
};
}

int main()
{
// no need to specify the hasher if std::hash<Point> exists
std::unordered_set<Point> p;
return 0;
}

Demo

How to using boost::unordered_set with custom class?

You are trying to use boost::unordered_set<myclass>, which will internally use boost::hash<myclass>, which will look for a hash_value(myclass) function in the same namespace as myclass via Argument-Dependent Lookup. You made your hash_value() be a non-static member of myclass, so boost::hash will not be able to find it. But even if it could, it expects your hash_value() to take a single myclass object as a parameter, not a vector.

See Extending boost::hash for a custom data type in Boost's documentation.

Also, a class's operator== compares *this to another object. Inside of myclass, your operator== should take a single myclass object as a parameter, not a vector.

Try this instead:

struct E {
string name;
int scope;
};

size_t hash_value(const E& obj) {
std::size_t seed = 0;
boost::hash_combine(seed, obj.name);
boost::hash_combine(seed, obj.scope);
return seed;
}

class myclass {
private:
vector<E> vec;
public:
myclass(const vector<E>& v) : vec(v) {}

bool operator==(const myclass& rhs) const {
// vector has its own operator== for comparing elements in its array...
return vec == rhs.vec;
}

friend size_t hash_value(const myclass& obj) {
return boost::hash_range(obj.vec.begin(), obj.vec.end());
}
};
void test(std::vector<std::vector<E>>& A)
{
boost::unordered_set<myclass> input_records(A.size());

for (BOOST_AUTO(it, A.begin()); it != (A.end());) {
auto k = input_records.insert(*it);
...
}
}

Why can't I store my objects in an unordered_set?

To use type in unordered_set or unordered_map you need hashing function for your type. For common types, like int or std::string - hashing function is provided by standard library. For your type, you can overload standard std::hash, like this:

namespace std {
template <> struct hash<someType> {
size_t operator()(const someType & x) const {
std::hash<std::string> h;
return h(x.name);
// or simply return x.code
// or do something more interesting,
// like xor'ing hashes from both members of struct
}
};
}

Another way is to provide your own type with overloaded operator() and put it as hash template argument in unordered_set, like this:

struct someTypeHasher {
size_t operator()(const someType& x) const {
return x.code;
}
};
std::unordered_set<someType, someTypeHasher> myset;

Good reading for theory about hash based containers is here

Also, do not forget, that you need to overload operator== for someType, without it - it will also not work.

Why does a unordered_set with a custom hash function and custom class need an initial number of buckets?

That's simply because there is no such constructor.

The only unordered_set constructor that takes one parameter is the one that takes an instance of a custom allocator, and not a custom hash function.

P.S. You fail to initialize hash to 0, in your custom hash function. This carries an elevated risk of nasal demons. You should fix that.

Unable to find a user-defined type in a c++ unordered set with custom operator==()

You are trying to compare pointers, not values.
You need to specify hashing function for class Block.

For example, if you want to use mName as key the code will be the following:

class Block {
private:
string mName;
string mLib;
public:
Block(string const& name, string const& lib)
{
mName = name;
mLib = lib;
}
string getName() const {
return mName;
};
string getLib() const {
return mLib;
}
bool operator==(const Block & block) const;
};

template<> struct std::hash<Block> {
std::size_t operator()(const Block & block) const noexcept {
return std::hash<std::string>{}(block.getName());
}
};

bool Block::operator==(const Block & block) const {
return block.getName() == mName;
}

int main() {
std::vector<Block> vertices;

vertices.emplace_back(Block("mod1", "work"));
vertices.emplace_back(Block("mod2", "work"));
vertices.emplace_back(Block("mod3", "work"));

std::unordered_set<Block> undefs;
undefs.emplace(Block("mod1", "work"));
undefs.emplace(Block("mod2", "work"));

for (auto& vertex : vertices) {
auto search = undefs.find(vertex);
if (search != undefs.end()) {
std::cout << "Block: " << vertex.getName() << "\n";
}
}
}

Access to custom objects in unordered_set

A pointer doesn't project its constness to the object it points to. Meaning, if you have a constant reference to a std::shared_ptr (as in a set) you can still modify the object via this pointer. Whether or not that is something you should do a is a different question and it doesn't solve your lookup problem.

OF course, if you want to lookup a value by a key, then this is what std::unordered_map was designed for so I'd have a closer look there. The main problem I see with this approach is not so much the memory overhead (unordered_set and unordered_map as well as shared_ptr have noticeable memory overhead anyway), but that you have to maintain redundant information (id in the object and id as a key).

If you have not many insertions and you don't absolutely need the (on average) constant lookup time and memory overhead is really important to you, you could consider a third solution (besides using a third-party or self written data structure of courses): namely to write a thin wrapper around a sorted std::vector<std::shared_ptr<MyClass>> or - if appropriate - even better std::vector<std::unique_ptr<MyClass>> that uses std::upper_bound for lookups.



Related Topics



Leave a reply



Submit