Is There a Default Hash Function for an Unordered_Set of a Custom Class

Is there a default hash function for an unordered_set of a custom class?

If you don't specify your own hash functor as template argument, it will default to std::hash<MyClass>, which does not exist unless you define it.

Best define your own specialization of std::hash inside namespace std:

namespace std {
template <>
struct hash<MyClass>
{
typedef MyClass argument_type;
typedef std::size_t result_type;

result_type operator()(const MyClass & t) const
{
/* ..calculate hash value for t */
}
};
}

And make sure you include this code before the declaration of your hash. This way you can declare the hash simply as std::unordered_set<MyClass> with no need for further template arguments.

You didn't specify what MyClass looks like inside, but a typical situation is that your user-defined type simply consists of several simple-type members, for which a default hash function exists. In this case, you will probably want to combine the hash values for the individual types to a hash value for the entire combination. The Boost library provides a function called hash_combine for this purpose. Of course, there is no guarantee that it will work well in your particular case (it depends on the distribution of data values and the likelihood of collisions), but it provides a good and easy-to-use starting point.

Here is an example of how to use it, assuming MyClass consists of two string members:

#include <unordered_set>
#include <boost/functional/hash.hpp>

struct MyClass
{
std::string _s1;
std::string _s2;
};

namespace std {
template <>
struct hash<MyClass>
{
typedef MyClass argument_type;
typedef std::size_t result_type;

result_type operator()(const MyClass & t) const
{
std::size_t val { 0 };
boost::hash_combine(val,t._s1);
boost::hash_combine(val,t._s2);
return val;
}
};
}

int main()
{
std::unordered_set<MyClass> s;
/* ... */
return 0;
}

Why does a unordered_set with a custom hash function and custom class need an initial number of buckets?

That's simply because there is no such constructor.

The only unordered_set constructor that takes one parameter is the one that takes an instance of a custom allocator, and not a custom hash function.

P.S. You fail to initialize hash to 0, in your custom hash function. This carries an elevated risk of nasal demons. You should fix that.

Inserting into an unordered_set with custom hash function

First problem:

You are passing string as the second template argument for your instantiation of the unordered_set<> class template. The second argument should be the type of your hasher functor, and std::string is not a callable object.

Perhaps meant to write:

unordered_set<Interval, /* string */ Hash> test;
// ^^^^^^^^^^^^
// Why this?

Also, I would suggest using names other than begin and end for your (member) variables, since those are names of algorithms of the C++ Standard Library.

Second problem:

You should keep in mind, that the hasher function should be qualified as const, so your functor should be:

struct Hash {
size_t operator() (const Interval &interval) const {
// ^^^^^
// Don't forget this!
string temp = to_string(interval.b) +
to_string(interval.e) +
to_string(interval.proteinIndex);
return (temp.length());
}
};

Third problem:

Finally, if you want std::unordered_set to be able to work with objects of type Interval, you need to define an equality operator consistent with your hash function. By default, if you do not specify any type argument as the third parameter of the std::unordered_set class template, operator == will be used.

You currently do not have any overload of operator == for your class Interval, so you should provide one. For example:

inline bool operator == (Interval const& lhs, Interval const& rhs)
{
return (lhs.b == rhs.b) &&
(lhs.e == rhs.e) &&
(lhs.proteinIndex == rhs.proteinIndex);
}

Conclusion:

After all the above modifications, your code becomes:

#include <string>
#include <unordered_set>
#include <list>

using namespace std;

struct Interval {
unsigned int b;
unsigned int e;
bool updated; //true if concat. initially false
int patternIndex; //pattern index. valid for single pattern
int proteinIndex; //protein index. for retrieving the pattern
};

bool operator == (Interval const& lhs, Interval const& rhs)
{
return (lhs.b == rhs.b) && (lhs.e == rhs.e) && (lhs.proteinIndex == rhs.proteinIndex);
}

struct Hash {
size_t operator()(const Interval &interval) const {
string temp = to_string(interval.b) + to_string(interval.e) + to_string(interval.proteinIndex);
return (temp.length());
}
};

int main()
{
unordered_set<Interval, Hash> test;

list<Interval> concat;
for(list<Interval>::iterator i = concat.begin(); i != concat.end(); ++i){
test.insert(*i);
}

}

How can I use an unordered_set with a custom struct?

The second template parameter to std::unordered_set is the type to use for hashing. and will default to std::hash<Point> in your case, which doesn't exist. So you can use std::unordered_set<Point,Point> if the hasher is the same type.

Alternatively if you do not want to specify the hasher, define a specialization of std::hash for Point and either get rid of the member function and implement the hashing in the body of your specialization's operator(), or call the member function from the std::hash specialization.

#include <unordered_set>

struct Point {
int X;
int Y;

Point() : X(0), Y(0) {};
Point(const int& x, const int& y) : X(x), Y(y) {};
Point(const Point& other){
X = other.X;
Y = other.Y;
};

Point& operator=(const Point& other) {
X = other.X;
Y = other.Y;
return *this;
};

bool operator==(const Point& other) const {
if (X == other.X && Y == other.Y)
return true;
return false;
};

bool operator<(const Point& other) {
if (X < other.X )
return true;
else if (X == other.X && Y == other.Y)
return true;

return false;
};

// this could be moved in to std::hash<Point>::operator()
size_t operator()(const Point& pointToHash) const noexcept {
size_t hash = pointToHash.X + 10 * pointToHash.Y;
return hash;
};

};

namespace std {
template<> struct hash<Point>
{
std::size_t operator()(const Point& p) const noexcept
{
return p(p);
}
};
}

int main()
{
// no need to specify the hasher if std::hash<Point> exists
std::unordered_set<Point> p;
return 0;
}

Demo



Related Topics



Leave a reply



Submit