Hash an arbitrary precision value (boost::multiprecision::cpp_int)
You can (ab)use the serialization support:
Support for serialization comes in two forms:
Classesnumber
,debug_adaptor
,logged_adaptor
andrational_adaptor
have "pass through" serialization support which requires the underlying backend to be serializable.Backends
cpp_int
,cpp_bin_float
,cpp_dec_float
andfloat128
have full support for Boost.Serialization.
So, let me cobble something together that works with boost and std unordered containers:
template <typename Map>
void test(Map const& map) {
std::cout << "\n" << __PRETTY_FUNCTION__ << "\n";
for(auto& p : map)
std::cout << p.second << "\t" << p.first << "\n";
}
int main() {
using boost::multiprecision::cpp_int;
test(std::unordered_map<cpp_int, std::string> {
{ cpp_int(1) << 111, "one" },
{ cpp_int(2) << 222, "two" },
{ cpp_int(3) << 333, "three" },
});
test(boost::unordered_map<cpp_int, std::string> {
{ cpp_int(1) << 111, "one" },
{ cpp_int(2) << 222, "two" },
{ cpp_int(3) << 333, "three" },
});
}
Let's forward the relevant hash<>
implementations to our own hash_impl
specialization that uses Multiprecision and Serialization:
namespace std {
template <typename backend>
struct hash<boost::multiprecision::number<backend> >
: mp_hashing::hash_impl<boost::multiprecision::number<backend> >
{};
}
namespace boost {
template <typename backend>
struct hash<multiprecision::number<backend> >
: mp_hashing::hash_impl<multiprecision::number<backend> >
{};
}
Now, of course, this begs the question, how is hash_impl
implemented?
template <typename T> struct hash_impl {
size_t operator()(T const& v) const {
using namespace boost;
size_t seed = 0;
{
iostreams::stream<hash_sink> os(seed);
archive::binary_oarchive oa(os, archive::no_header | archive::no_codecvt);
oa << v;
}
return seed;
}
};
This looks pretty simple. That's because Boost is awesome, and writing a hash_sink
device for use with Boost Iostreams is just the following straightforward exercise:
namespace io = boost::iostreams;
struct hash_sink {
hash_sink(size_t& seed_ref) : _ptr(&seed_ref) {}
typedef char char_type;
typedef io::sink_tag category;
std::streamsize write(const char* s, std::streamsize n) {
boost::hash_combine(*_ptr, boost::hash_range(s, s+n));
return n;
}
private:
size_t* _ptr;
};
Full Demo:
Live On Coliru
#include <iostream>
#include <iomanip>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/multiprecision/cpp_int.hpp>
#include <boost/multiprecision/cpp_int/serialize.hpp>
#include <boost/iostreams/device/back_inserter.hpp>
#include <boost/iostreams/stream_buffer.hpp>
#include <boost/iostreams/stream.hpp>
#include <boost/functional/hash.hpp>
namespace mp_hashing {
namespace io = boost::iostreams;
struct hash_sink {
hash_sink(size_t& seed_ref) : _ptr(&seed_ref) {}
typedef char char_type;
typedef io::sink_tag category;
std::streamsize write(const char* s, std::streamsize n) {
boost::hash_combine(*_ptr, boost::hash_range(s, s+n));
return n;
}
private:
size_t* _ptr;
};
template <typename T> struct hash_impl {
size_t operator()(T const& v) const {
using namespace boost;
size_t seed = 0;
{
iostreams::stream<hash_sink> os(seed);
archive::binary_oarchive oa(os, archive::no_header | archive::no_codecvt);
oa << v;
}
return seed;
}
};
}
#include <unordered_map>
#include <boost/unordered_map.hpp>
namespace std {
template <typename backend>
struct hash<boost::multiprecision::number<backend> >
: mp_hashing::hash_impl<boost::multiprecision::number<backend> >
{};
}
namespace boost {
template <typename backend>
struct hash<multiprecision::number<backend> >
: mp_hashing::hash_impl<multiprecision::number<backend> >
{};
}
template <typename Map>
void test(Map const& map) {
std::cout << "\n" << __PRETTY_FUNCTION__ << "\n";
for(auto& p : map)
std::cout << p.second << "\t" << p.first << "\n";
}
int main() {
using boost::multiprecision::cpp_int;
test(std::unordered_map<cpp_int, std::string> {
{ cpp_int(1) << 111, "one" },
{ cpp_int(2) << 222, "two" },
{ cpp_int(3) << 333, "three" },
});
test(boost::unordered_map<cpp_int, std::string> {
{ cpp_int(1) << 111, "one" },
{ cpp_int(2) << 222, "two" },
{ cpp_int(3) << 333, "three" },
});
}
Prints
void test(const Map&) [with Map = std::unordered_map<boost::multiprecision::number<boost::multiprecision::backends::cpp_int_backend<> >, std::basic_string<char> >]
one 2596148429267413814265248164610048
three 52494017394792286184940053450822912768476066341437098474218494553838871980785022157364316248553291776
two 13479973333575319897333507543509815336818572211270286240551805124608
void test(const Map&) [with Map = boost::unordered::unordered_map<boost::multiprecision::number<boost::multiprecision::backends::cpp_int_backend<> >, std::basic_string<char> >]
three 52494017394792286184940053450822912768476066341437098474218494553838871980785022157364316248553291776
two 13479973333575319897333507543509815336818572211270286240551805124608
one 2596148429267413814265248164610048
As you can see, the difference in implementation between Boost's and the standard library's unordered_map
show up in the different orderings for identical hashes.
Boost, converting between arbitrary precision floating point and integral types
The usual culprit is that the result of BMP expressions are not their target types, but still "lazy" expression templates, and they don't have all the implicit conversions. So, you need to evaluate these "manually" (explicitly):
return Dec(ceil(five*base*mult)).convert_to<Int>();
See it Live On Coliru
#include <iostream>
#include <boost/multiprecision/detail/default_ops.hpp>
#include <boost/multiprecision/cpp_int.hpp>
#include <boost/multiprecision/cpp_dec_float.hpp>
#include <boost/math/special_functions/pow.hpp>
#include <boost/multiprecision/number.hpp>
namespace mp = boost::multiprecision;
using Int = mp::cpp_int;
Int calcReward(Int baseReward, int percent);
int main()
{
Int const immBase = 400;
int const percent = 12;
std::cout << calcReward(immBase, percent);
}
Int calcReward(Int baseReward, int percent)
{
using Dec = mp::cpp_dec_float_50;
Dec mult = 0.1+0.01*percent;
Dec five = 5;
Dec base = baseReward.convert_to<Dec>();
return Dec(ceil(five*base*mult)).convert_to<Int>();
}
Notes:
alternatively, disable expression templates
using Int = mp::number<mp::cpp_int::backend_type, mp::et_off>;
using Dec = mp::number<mp::cpp_dec_float<50>, mp::et_off>;the chrono include(s) were irrelevant
ceil
can be used unqualified due to ADL (What is "Argument-Dependent Lookup" (aka ADL, or "Koenig Lookup")?)
C++: Using pair of (cpp_int, int) integers as key in unordered_map (where cpp_int are boost multiprecision integers)
Yes, you took the Multiprecision hash I contributed earlier and added the hash for std::pair
. I'm not a fan of handrolling the hash combination (good general hash combination is not trivial).
So I'd do the same with boost::hash_combine
:
template <typename K, typename V>
struct hash<std::pair<K, V> >
{
size_t operator()(std::pair<K, V> const& pair) const {
size_t seed = std::hash<K>{}(pair.first);
boost::hash_combine(seed, pair.second);
return seed;
}
};
Live On MSVC RexTester
#include <iostream>
#include <iomanip>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/multiprecision/cpp_int.hpp>
#include <boost/multiprecision/cpp_int/serialize.hpp>
#include <boost/iostreams/device/back_inserter.hpp>
#include <boost/iostreams/stream_buffer.hpp>
#include <boost/iostreams/stream.hpp>
#include <boost/functional/hash.hpp>
namespace mp_hashing {
namespace io = boost::iostreams;
struct hash_sink {
hash_sink(size_t& seed_ref) : _ptr(&seed_ref) {}
typedef char char_type;
typedef io::sink_tag category;
std::streamsize write(const char* s, std::streamsize n) {
boost::hash_combine(*_ptr, boost::hash_range(s, s+n));
return n;
}
private:
size_t* _ptr;
};
template <typename T> struct hash_impl {
size_t operator()(T const& v) const {
using namespace boost;
size_t seed = 0;
{
iostreams::stream<hash_sink> os(seed);
archive::binary_oarchive oa(os, archive::no_header | archive::no_codecvt);
oa << v;
}
return seed;
}
};
}
#include <unordered_map>
#include <boost/unordered_map.hpp>
namespace std {
template <typename backend>
struct hash<boost::multiprecision::number<backend> >
: mp_hashing::hash_impl<boost::multiprecision::number<backend> >
{};
template <typename K, typename V>
struct hash<std::pair<K, V> >
{
size_t operator()(std::pair<K, V> const& pair) const {
size_t seed = std::hash<K>{}(pair.first);
boost::hash_combine(seed, pair.second);
return seed;
}
};
}
int main() {
using boost::multiprecision::cpp_int;
std::unordered_map<std::pair<cpp_int, int>, int> m {
{ { cpp_int(1) << 111, -1 }, 1 },
{ { cpp_int(2) << 222, -2 }, 2 },
{ { cpp_int(3) << 333, -3 }, 3 },
};
for (auto& p : m)
std::cout << p.first.first << " -> " << p.second << "\n";
}
Why I cannot use neither std::unordered_map nor boost::unordered_map with boost::multiprecision types?
Your "question of which this might be a duplicate" documents a working technique in the question itself - hashing the string representation:
#include <boost/multiprecision/cpp_int.hpp>
#include <boost/unordered_map.hpp>
using namespace boost::multiprecision;
template <typename T>
struct hash_str
{
size_t operator()(const T& t) const { return std::hash<std::string>()(t.str()); }
};
int main()
{
cpp_int z(123123123);
cpp_int x(123123123);
boost::unordered_map<cpp_int, cpp_int, hash_str<cpp_int>> data;
data.insert(std::make_pair(z,x));
}
Notes:
I don't know if
cpp_int::str()
outputs the full precision the type stores, but if it doesn't then distinct values yielding the samestr()
and hence hash will collide in the same bucket in the hash table, which won't break the functionality but moves away from O(1) towards O(N) performance. So, if by defaultstr()
doesn't display full precision but there's a way to force it, that would be a good idea if you deal in lots of values differing very slightly.As with all uses of floating point types as keys, be careful as tiny rounding differences can lead to existing map entries not being found/matched, and hence unintended "duplicates" etc..
It your program is too slow and the profile proves the hashing is the cause, then worry about alternatives or upgrading boost....
Generate hash for boost::dynamic_bitset and convert hash back to boost::dynamic_bitset
Ok, I'm detecting much confusion about the essence of "hashing", so a few friendly pointers to get started:
Q. 2. How to convert the hash back to orignial bitset.
That's impossible. A hash is lossy digest. You can only do this if the hash is a Perfect Hash which, due to the laws of entropy, cannot happen if the bitset capacity exceeds the size of size_t
on your platform (typically 32 or 64 bits).
Q. I also tried creating a hash by ...
std::bitset<142> seq (str);
....
I hope you realize that std::bitset<>
is an entirely different type, so it's not really related to the task. And, since it's not dynamic, it's rather unhelpful for the task, even as a workaround.
But Most Importantly:
Hashes are used by hash-tables (like unordered_*<>
) but they are not stored. Hashes are lossy digests, only used to get a good distribution over the internal buckets¹. For actual element equality, std::equal<T>
is still used.
In other words:
typedef boost::bimap<bimaps::unordered_set_of<unsigned long long int>,
bimaps::unordered_multiset_of<size_t> > bimap_reference;
is unsuited for creating a map of anything other than size_t
or unsigned long long
². If you store hashes of things there:
reference_index_vector.insert(position(10000000000, hash1));
, you lose the original information. There's no way to get the bitset from hash1
.
The compiler error
Your hash_value
implementation mistakenly uses private members of dynamic_bitset<>
. You can't because it's not accessible.
Here's a simple implementation of std::hash<>
using the public interface:
Live On Coliru
#include <boost/dynamic_bitset.hpp>
#include <boost/functional/hash.hpp>
#include <unordered_map>
#include <sstream>
namespace std {
template <typename Block, typename Alloc> struct hash<boost::dynamic_bitset<Block, Alloc> > {
size_t operator()(boost::dynamic_bitset<Block, Alloc> const& bs) const {
size_t seed = boost::hash_value(bs.size());
std::vector<Block> blocks(bs.num_blocks());
boost::hash_range(seed, blocks.begin(), blocks.end());
return seed;
}
};
}
int main() {
boost::dynamic_bitset<> x, y;
x.resize(rand()%100, 1);
y.resize(rand()%100, 0);
std::unordered_map<boost::dynamic_bitset<>, std::string> m;
m[x] = "x";
m[y] = "y";
}
You can use this std::hash<>
specialization instead and use boost::bimap
with it.
NOTE, that using the public interface is not optimal, because it copies the Block
s (you also did that with the std::bitset<>
hack). You might be interested in the Boost Serialization implementation I did for boost::dynamic_bitset<>
before:
- How to serialize boost::dynamic_bitset?
- And here's code to show how to use the Serialization implementation to implement the hash efficiently Hash an arbitrary precision value (boost::multiprecision::cpp_int)
¹ Assuming, for simplicity, buckets instead of "open addressing" style. The same logic applies there, but somewhat more complicated
² (by the way, please just say uintmax_t
or uint64_t
)
C++ - Hash value of object graph similar to boost::serialization
I did exactly that once:
- Hash an arbitrary precision value (boost::multiprecision::cpp_int)
Also required reading: Types Don't Know # with the talk on YT: https://www.youtube.com/watch?v=Njjp_MJsgt8
Related Topics
What Is Dynamic Initialization of Object in C++
Reinterpret_Cast VS. C-Style Cast
Why Don't the Std::Fstream Classes Take a Std::String
Inheritance or Composition: Rely on "Is-A" and "Has-A"
How to Read Directly from Physical Memory on Windows
C/C++ Changing the Value of a Const
Error: Variable "Cannot Be Implicitly Captured Because No Default Capture Mode Has Been Specified"
Why Do I Need Double Layer of Indirection for MACros
How Portable Is End Iterator Decrement
How to Deploy a Qt Application on Windows
Is Reading an Indeterminate Value Undefined Behavior
Random Output Different Between Implementations
How to Use Doxygen to Create Uml Class Diagrams from C++ Source