Generic Hash For Tuples in Unordered_Map/Unordered_Set

Generic hash for tuples in unordered_map / unordered_set

This works on gcc 4.5 allowing all c++0x tuples containing standard hashable types to be members of
unordered_map and unordered_set without further ado.
(I put the code in a header file and just include it.)

The function has to live in the std namespace so that it is picked up by
argument-dependent name lookup (ADL).

Is there a simpler solution?

#include <tuple>
namespace std{
namespace
{

// Code from boost
// Reciprocal of the golden ratio helps spread entropy
// and handles duplicates.
// See Mike Seymour in magic-numbers-in-boosthash-combine:
// http://stackoverflow.com/questions/4948780

template <class T>
inline void hash_combine(std::size_t& seed, T const& v)
{
seed ^= std::hash<T>()(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
}

// Recursive template code derived from Matthieu M.
template <class Tuple, size_t Index = std::tuple_size<Tuple>::value - 1>
struct HashValueImpl
{
static void apply(size_t& seed, Tuple const& tuple)
{
HashValueImpl<Tuple, Index-1>::apply(seed, tuple);
hash_combine(seed, std::get<Index>(tuple));
}
};

template <class Tuple>
struct HashValueImpl<Tuple,0>
{
static void apply(size_t& seed, Tuple const& tuple)
{
hash_combine(seed, std::get<0>(tuple));
}
};
}

template <typename ... TT>
struct hash<std::tuple<TT...>>
{
size_t
operator()(std::tuple<TT...> const& tt) const
{
size_t seed = 0;
HashValueImpl<std::tuple<TT...> >::apply(seed, tt);
return seed;
}

};
}

Standard Conformant code

Yakk points out that specialising things in the std namespace is actually undefined behaviour. If you wish to have a standards conforming solution, then you need to move all of this code into your own namespace and give up any idea of ADL finding the right hash implementation automatically. Instead of :

unordered_set<tuple<double, int> > test_set;

You need:

unordered_set<tuple<double, int>, hash_tuple::hash<tuple<double, int>>> test2;

where hash_tuple is your own namespace rather than std::.

To do this, you first have to declare a hash implementation inside the hash_tuple namespace. This will forward all non tuple types to the std::hash:

namespace hash_tuple{

template <typename TT>
struct hash
{
size_t
operator()(TT const& tt) const
{
return std::hash<TT>()(tt);
}
};
}

Make sure that hash_combine calls hash_tuple::hash and not std::hash

namespace hash_tuple{

namespace
{
template <class T>
inline void hash_combine(std::size_t& seed, T const& v)
{
seed ^= hash_tuple::hash<T>()(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
}
}

Then include all the other previous code but put it inside namespace hash_tuple and not std::

namespace hash_tuple{

namespace
{
// Recursive template code derived from Matthieu M.
template <class Tuple, size_t Index = std::tuple_size<Tuple>::value - 1>
struct HashValueImpl
{
static void apply(size_t& seed, Tuple const& tuple)
{
HashValueImpl<Tuple, Index-1>::apply(seed, tuple);
hash_combine(seed, std::get<Index>(tuple));
}
};

template <class Tuple>
struct HashValueImpl<Tuple,0>
{
static void apply(size_t& seed, Tuple const& tuple)
{
hash_combine(seed, std::get<0>(tuple));
}
};
}

template <typename ... TT>
struct hash<std::tuple<TT...>>
{
size_t
operator()(std::tuple<TT...> const& tt) const
{
size_t seed = 0;
HashValueImpl<std::tuple<TT...> >::apply(seed, tt);
return seed;
}
};

}

Using tuple in unordered_map

The template arguments for an unordered_map looks like this:

template<

class Key,
class T,
class Hash = std::hash<Key>,
class KeyEqual = std::equal_to<Key>,
class Allocator = std::allocator< std::pair<const Key, T> >
> class unordered_map;

std::hash is not specialized for tuples (scroll down to Standard specializations for library types). Therefore you need to provide your own, something like this:

typedef std::tuple<int, char, char> key_t;

struct key_hash : public std::unary_function<key_t, std::size_t>
{
std::size_t operator()(const key_t& k) const
{
return std::get<0>(k) ^ std::get<1>(k) ^ std::get<2>(k);
}
};
// ..snip..
typedef std::unordered_map<const key_t,data,key_hash,key_equal> map_t;
// ^ this is our custom hash

And finally, as Benjamin Lindley answer already addresses, you need to use std::make_tuple:

// d is data
m[std::make_tuple(1, 'a', 'b')] = d;
auto itr = m.find(std::make_tuple(1, 'a', 'b'));

The code was grabbed from Using a std::tuple as key for std::unordered_map and here is the Live Example.

Building an unordered map with tuples as keys

You need a bit of front matter. Because of the underlying implementation of boost::tuples::tuple, make Edge a structure to allow the overloads to resolve correctly. Otherwise, you'll get no matches for

  • boost::hash_value(const Edge &)
  • operator==(const Edge &, const Edge &)

Code below:

struct Edge {
Edge(double x1, double x2, double x3, double x4)
: tuple(x1,x2,x3,x4) {}
boost::tuples::tuple<double, double, double, double> tuple;
};

// XXX: less than ideal implementation!
bool operator==(const Edge &a, const Edge &b)
{
return a.tuple.get<0>() == b.tuple.get<0>() &&
a.tuple.get<1>() == b.tuple.get<1>() &&
a.tuple.get<2>() == b.tuple.get<2>() &&
a.tuple.get<3>() == b.tuple.get<3>();
}

// XXX: me too!
std::size_t hash_value(const Edge &e)
{
std::size_t seed = 0;
boost::hash_combine(seed, e.tuple.get<0>());
boost::hash_combine(seed, e.tuple.get<1>());
boost::hash_combine(seed, e.tuple.get<2>());
boost::hash_combine(seed, e.tuple.get<3>());
return seed;
}

typedef boost::unordered_map< Edge, int > EdgeMap;

How to unordered_set tuple int,int ?

Put the hash function in namespace boost.

#include <boost/unordered_set.hpp>
#include <boost/tuple/tuple.hpp>
#include <boost/tuple/tuple_comparison.hpp>

using namespace std;
using namespace boost;

namespace boost {
size_t hash_value(tuple<int, int> const & t) {
return get<0>(t) * 10 + get<1>(t) ;
}
}

int main () {
unordered_set< tuple<int, int> > s ;
tuple<int, int> t ;
s.insert(t) ;
}

Unordered Map with three unsigned chars as key

So your basic problem is that you don't have a hasher for std::tuple -- the standard library does not currently come with one by default. I think this is an oversight.

You cannot add a default hasher for std::tuple -- either for all tuples or for your particular list -- because the standard says you aren't allowed to:

17.6.4.2.1 Namespace std [namespace.std]/1 The behavior of a C++ program is undefined if it adds declarations or definitions to
namespace std or to a namespace within namespace std unless otherwise
specified. A program may add a template specialization for any
standard library template to namespace std only if the declaration
depends on a user-defined type and the specialization meets the
standard library requirements for the original template and is not
explicitly prohibited.

So this leaves you with either writing a custom hasher for your tuple type, and passing it to your unordered_map, or using a type that isn't merely std::tuple for your key, or writing a custom hasher that supports every tuple (including recursive tuples) while also supporting hash<T> lookup for non-tuples. A 4th option would be to write your own ADL-based hash system and set it up so that types with valid std::hash specializations use that, with various other fallbacks.

I will illustrate the 3rd option above.

First, you want to be able to combine two hash values in a way that doesn't lead to many spurious collisions.

  inline std::size_t hash_result_combine(std::size_t lhs, std::size_t rhs)
{
return lhs ^( rhs + 0x9e3779b9 + (lhs<<6) + (lhs>>2));
}

this takes two hash outputs, and combines them. We can then chain this any number of times. (constants borrowed from https://stackoverflow.com/a/7111460/1774667 )

Next, some indexing boilerplate. Most generic work with std::tuple requires this stuff:

  template<unsigned...>struct indexes{};
template<unsigned Max, unsigned... Is>struct make_indexes:make_indexes<Max-1,Max-1,Is...> {};
template<unsigned...Is>struct make_indexes<0,Is...>:indexes<Is...>{};

Finally, a hasher class that works with nearly every type:

  struct my_hasher {
template<typename... Args>
std::size_t operator()(indexes<>, std::tuple<Args...> const& tup) const {
return 0;
}
template<unsigned I, unsigned... Is,typename... Args>
std::size_t operator()(indexes<I, Is...>,std::tuple<Args...> const& tup) const {
return hash_result_combine( (*this)(std::get<I>(tup)), (*this)(indexes<Is...>, tup) );
}
template<typename... Args>
std::size_t operator()(std::tuple<Args...> const& tup) const {
return (*this)(make_indexes<sizeof...(Args)>(), tup);
}
template<typename T>
std::size_t operator()(T const& t) const { return std::hash<T>()(t); }
};

you can pass my_hasher to an unordered_set in place of hash<T> for tuples, tuples containing tuples, or even for scalars -- it is very generous.

typedef std::unordered_map<const key_t,IndexGuide, my_hasher> GuideDouble;

and now we properly hash key_t.

Storing elements in an unordered_set vs storing them in an unordered_map

I would choose unordered_map, because I can get a user, given a userid at any time, without any extra work, while with unordered_set I don't have this facility.

As for the mentioned operations, the speed will be almost same.



Related Topics



Leave a reply



Submit