Inserting into an Unordered_Set with Custom Hash Function

Inserting into an unordered_set with custom hash function

First problem:

You are passing string as the second template argument for your instantiation of the unordered_set<> class template. The second argument should be the type of your hasher functor, and std::string is not a callable object.

Perhaps meant to write:

unordered_set<Interval, /* string */ Hash> test;
// ^^^^^^^^^^^^
// Why this?

Also, I would suggest using names other than begin and end for your (member) variables, since those are names of algorithms of the C++ Standard Library.

Second problem:

You should keep in mind, that the hasher function should be qualified as const, so your functor should be:

struct Hash {
size_t operator() (const Interval &interval) const {
// ^^^^^
// Don't forget this!
string temp = to_string(interval.b) +
to_string(interval.e) +
to_string(interval.proteinIndex);
return (temp.length());
}
};

Third problem:

Finally, if you want std::unordered_set to be able to work with objects of type Interval, you need to define an equality operator consistent with your hash function. By default, if you do not specify any type argument as the third parameter of the std::unordered_set class template, operator == will be used.

You currently do not have any overload of operator == for your class Interval, so you should provide one. For example:

inline bool operator == (Interval const& lhs, Interval const& rhs)
{
return (lhs.b == rhs.b) &&
(lhs.e == rhs.e) &&
(lhs.proteinIndex == rhs.proteinIndex);
}

Conclusion:

After all the above modifications, your code becomes:

#include <string>
#include <unordered_set>
#include <list>

using namespace std;

struct Interval {
unsigned int b;
unsigned int e;
bool updated; //true if concat. initially false
int patternIndex; //pattern index. valid for single pattern
int proteinIndex; //protein index. for retrieving the pattern
};

bool operator == (Interval const& lhs, Interval const& rhs)
{
return (lhs.b == rhs.b) && (lhs.e == rhs.e) && (lhs.proteinIndex == rhs.proteinIndex);
}

struct Hash {
size_t operator()(const Interval &interval) const {
string temp = to_string(interval.b) + to_string(interval.e) + to_string(interval.proteinIndex);
return (temp.length());
}
};

int main()
{
unordered_set<Interval, Hash> test;

list<Interval> concat;
for(list<Interval>::iterator i = concat.begin(); i != concat.end(); ++i){
test.insert(*i);
}

}

Why does a unordered_set with a custom hash function and custom class need an initial number of buckets?

That's simply because there is no such constructor.

The only unordered_set constructor that takes one parameter is the one that takes an instance of a custom allocator, and not a custom hash function.

P.S. You fail to initialize hash to 0, in your custom hash function. This carries an elevated risk of nasal demons. You should fix that.

How to use unordered_set::find with a custom hash function

You forgot the template parameter pair_hash in

void TimeStep(std::vector<Atom*>& atoms, const std::unordered_set<std::pair<Atom*, Atom*>> bonds)

it probably should be

void TimeStep(std::vector<Atom*>& atoms, const std::unordered_set<std::pair<Atom*, Atom*>, pair_hash> bonds)

Without it std::hash will be used, which is not specialized for std::pair, i.e. std::hash<std::pair<X, Y>> does not exist, thus causing the error.

Is there a default hash function for an unordered_set of a custom class?

If you don't specify your own hash functor as template argument, it will default to std::hash<MyClass>, which does not exist unless you define it.

Best define your own specialization of std::hash inside namespace std:

namespace std {
template <>
struct hash<MyClass>
{
typedef MyClass argument_type;
typedef std::size_t result_type;

result_type operator()(const MyClass & t) const
{
/* ..calculate hash value for t */
}
};
}

And make sure you include this code before the declaration of your hash. This way you can declare the hash simply as std::unordered_set<MyClass> with no need for further template arguments.

You didn't specify what MyClass looks like inside, but a typical situation is that your user-defined type simply consists of several simple-type members, for which a default hash function exists. In this case, you will probably want to combine the hash values for the individual types to a hash value for the entire combination. The Boost library provides a function called hash_combine for this purpose. Of course, there is no guarantee that it will work well in your particular case (it depends on the distribution of data values and the likelihood of collisions), but it provides a good and easy-to-use starting point.

Here is an example of how to use it, assuming MyClass consists of two string members:

#include <unordered_set>
#include <boost/functional/hash.hpp>

struct MyClass
{
std::string _s1;
std::string _s2;
};

namespace std {
template <>
struct hash<MyClass>
{
typedef MyClass argument_type;
typedef std::size_t result_type;

result_type operator()(const MyClass & t) const
{
std::size_t val { 0 };
boost::hash_combine(val,t._s1);
boost::hash_combine(val,t._s2);
return val;
}
};
}

int main()
{
std::unordered_set<MyClass> s;
/* ... */
return 0;
}

How do I specify a custom hash function explicitly for unordered_set by passing a named function?

You need to supply the type of the hasher, which is in your case a function pointer. And your FooBar type must be equality comparable. Or equivalently you could supply a equality predicate in the same manner as supplying the hasher.

#include <unordered_set>
#include <stdint.h>

struct FooBar {
int i;
};

bool operator==(const FooBar& x, const FooBar& y)
{
return x.i == y.i;
}

size_t hashFooBar(const FooBar& foo) {
return foo.i;
}

int main(){
std::unordered_set<FooBar, size_t(*)(const FooBar&)> foo(0, hashFooBar);
}

I should also note that it is more popular to supply a "functor" instead of a function, as the former can be inlined, while the latter is likely to not be inlined.

#include <unordered_set>
#include <stdint.h>

struct FooBar {
int i;
};

bool operator==(const FooBar& x, const FooBar& y)
{
return x.i == y.i;
}

struct hashFooBar
{
size_t operator()(const FooBar& foo) const {
return foo.i;
}
};

int main(){
std::unordered_set<FooBar, hashFooBar> foo(0);
}


Related Topics



Leave a reply



Submit