Is There a More Efficient Implementation for a Bidirectional Map

Is there a more efficient implementation for a bidirectional map?

There is a certain problem with double-storing your data in all simple implementations of a bimap. If you can break it down to a bimap of pointers from outside, then you can readily ignore this and simply keep both maps of the form std::map<A*,B*> like Arkaitz Jimenez already suggested (though contrary to his answer you have to care about the storage from outside to avoid a A->A* lookup). But if you have the pointers anyway, why not simply store a std::pair<A,B> at the point where you would otherwise store A and B separately?

It would be nice to have std::map<A,B*> instead of std::map<A*,B*> as this would allow for example the lookup of an element associated to an string by a newly created string with the same content instead of the pointer to the original string that created the pair. But it is customary to store a full copy of the key with every entry and only rely on the hash to find the right bucket. This way the returned item will be the correct one even in the case of a hash-collision...

If you want to have it quick and dirty though, there is this

hackish solution:

Create two maps std::map<size_t, A> mapA and std::map<size_t, B> mapB. Upon insertion hash both elements that are to be inserted to get the keys to the respective maps.

void insert(const A &a, const B &b) {
size_t hashA = std::hash<A>(a);
size_t hashB = std::hash<B>(b);

mapA.insert({hashB, a});
mapB.insert({hashA, b});
}

Lookup is implemented analogously.

Using a multimap instead of a map and verifying every element you get with a lookup in the respectively other map (get candidate b from mapA, hash b and look in mapB if it matches the wanted key, iterate to the next candidate b otherwise) this is a valid implementation - but still hackish in my opinion...

You can get a much nicer solution by using the copies of the elements that are used to compare the entries (see above) as only storage. It is a bit harder to get your head around that though. To elaborate:

a nicer solution:

Create two sets of pairs as std::set<pair<A, B*>> and std::set<pair<B, A*>> and overload the operator< and operator== to only take the first element of the pairs into account (or provide an corresponding comparion class). It is necessary to create sets of pairs instead of maps (which internally look similarly) because we need a guarantee that A and B will be at constant positions in memory. Upon insertion of an pair<A, B> we split it into two elements that fit into the above sets.

  std::set<pair<B, A*>> mapA;
std::set<pair<A, B*>> mapB;

void insert(const A &a, const B &b) {
auto aitr = mapA.insert({b, nullptr}).first; // creates first pair
B *bp = &(aitr->first); // get pointer of our stored copy of b
auto bitr = mapB.insert({a, bp}).first;
// insert second pair {a, pointer_to_b}
A *ap = &(bitr->first); // update pointer in mapA to point to a
aitr->second = ap;
}

Lookup can now simply be done by a simple std::set lookup and a pointer dereference.

This nicer solution is similar to the solution that boost uses - even though they use some annonymized pointers as second elements of the pairs and thus have to use reinterpret_casts.

Note that the .second part of the pairs need to be mutable (so I'm not sure std::pair can be used), or you have to add another layer of abstraction (std::set<pair<B, A**>> mapA) even for this simple insertion. In both solutions you need temporary elements to return non-const references to elements.

Efficient implementation of bidirectional map between 64bit and 32bit unsigned integers

According to table 2.2 in the latest core OpenGL spec a uint in OpenGL should always be 32 bits in width (the spec is about the same for ES). All OpenGL names/handles are as far as I know (and you also say in your question) uint's. So, it should be 32 bit on both your host and target.

Note that it is exactly because the actual bit width of unsigned int might differ between platforms that OpenGL has its own types that should conform to the spec.

Update

If the remaining handles are really only contexts and other window-system calls I'd keep it simple because we're not talking frequent operations, nor a huge amount of handles. These kind of operations are usually not done more than once per OpenGL application per GPU, which is likely 1 on any mobile phone. I think the easiest solution of all would be to use an array. Pseudocode

class context_creator
{
std::array<EGLContext, 1000> context_map; //8KB
public:
context_creator() : context_map{} {}
uint32_t allocate(...) {
for(unsigned i = 0; i < context_map.size(); i++) {
if(!context_map[i]) {
context_map[i] = eglCreateContext(...);
return i;
}
}
}
void deallocate(uint32_t handle) {
eglDeleteContext(context_map[handle]);
context_map[handle] = 0;
}
//Has to be called in every function where a context is a parameter.
EGLContext translate(uint32_t handle) const {
return context_map[handle];
}
}

One note, this won't work if 0 is a valid name for a context. I really don't know for WGL, but probably it's not. The advantage of this is that while the allocate isn't the fastest algorithm ever, the translate is O(1) and that's what most likely will be called most often.

Of course, variations exists:

  • You can use a more dynamic container (e.g. vector) instead of fixed size.
  • You can use a hashtable (like std::map) and just generate a unique index per call. This consumes more memory as you have to store the index also (it's imlicit in an array), but it solves the problem if 0 is a valid context name.

Bi-directional Map in Java?

You can use the Google Collections API for that, recently renamed to Guava, specifically a BiMap

A bimap (or "bidirectional map") is a map that preserves the
uniqueness of its values as well as that of its keys. This constraint
enables bimaps to support an "inverse view", which is another bimap
containing the same entries as this bimap but with reversed keys and
values.

bimap implementation in modern C++ without Boost

I think it's just a lot of tedious work.

For fun, I set out to implement a starting point here.

Live On Coliru

Notes:

  • The 'main' backing is a std::list<tuple<K,V>>. This makes sure that the iterator/reference invalidation semantics are as close to std::map as possible
  • The main backing doubles as the "left" view (which is only available read-only for external use)
  • The "right" view is very similar but uses a std::reference_wrapper<value_type const> so we only index the first container, really

I've only implemented the simple queries (lower_bound, upper_bound, equal_range and find). Of course iterators and ranged-for are there.

You'll have some bits left to do (erase, emplace, range-insertion, initializer_list construction; stateful allocator/comparator support is sketchy (no constructors take the relevant arguments) but scoped allocators have been taken into account.)

Without further ado:

#include <algorithm>
#include <tuple>
#include <list>
#include <functional> // for std::reference_wrapper
#include <cassert>

namespace bimap {
namespace detail {
template <typename Cmp, typename V, size_t I> struct CompareByElement : private Cmp {
bool operator()(V const& a, V const& b) const {
using std::get;
return Cmp::operator()(get<I>(a), get<I>(b));
}
bool operator()(V const& a, V const& b) {
using std::get;
return Cmp::operator()(get<I>(a), get<I>(b));
}
};

namespace tags {
struct left_view;
struct right_view;
}

template <typename ViewTag, typename Left, typename Right, typename LeftCmp, typename RightCmp, typename RawAlloc>
struct view_traits;

template <typename Left, typename Right, typename LeftCmp, typename RightCmp, typename RawAlloc>
struct view_traits<tags::left_view, Left, Right, LeftCmp, RightCmp, RawAlloc> {
using value_type = std::tuple<Left, Right>;
using allocator_type = typename RawAlloc::template rebind<value_type>::other;
using base_type = std::list<value_type, allocator_type>;
using comparator = CompareByElement<LeftCmp, value_type, 0>;
};

template <typename Left, typename Right, typename LeftCmp, typename RightCmp, typename RawAlloc>
struct view_traits<tags::right_view, Left, Right, LeftCmp, RightCmp, RawAlloc> {
using value_type = std::tuple<Left, Right>;
using allocator_type = typename RawAlloc::template rebind<std::reference_wrapper<value_type const>>::other;
using base_type = std::list<std::reference_wrapper<value_type const>, allocator_type>;
using comparator = CompareByElement<RightCmp, value_type, 1>;
};

template <typename Left, typename Right, typename LeftCmp, typename RightCmp, typename RawAlloc>
struct bimap_traits {
using left_traits = view_traits<tags::left_view, Left, Right, LeftCmp, RightCmp, RawAlloc>;
using right_traits = view_traits<tags::right_view, Left, Right, LeftCmp, RightCmp, RawAlloc>;
};

template <typename Traits> struct map_adaptor :
private Traits::base_type,
private Traits::comparator // empty base class optimization
{
using value_type = typename Traits::value_type;
using allocator_type = typename Traits::allocator_type;
using base_type = typename Traits::base_type;
using comparator = typename Traits::comparator;

using iterator = typename base_type::iterator;
using const_iterator = typename base_type::const_iterator;

using base_type::cbegin;
using base_type::cend;
using base_type::begin;
using base_type::end;
using base_type::insert;
using base_type::size;

auto lower_bound(value_type const& v) { return std::lower_bound(base_type::begin(), base_type::end(), v, get_comp()); }
auto upper_bound(value_type const& v) { return std::upper_bound(base_type::begin(), base_type::end(), v, get_comp()); }
auto equal_range(value_type const& v) { return std::equal_range(base_type::begin(), base_type::end(), v, get_comp()); }
auto lower_bound(value_type const& v) const { return std::lower_bound(base_type::begin(), base_type::end(), v, get_comp()); }
auto upper_bound(value_type const& v) const { return std::upper_bound(base_type::begin(), base_type::end(), v, get_comp()); }
auto equal_range(value_type const& v) const { return std::equal_range(base_type::begin(), base_type::end(), v, get_comp()); }

auto find(value_type const& v) {
auto er = equal_range(v);
return er.first == er.second? end() : er.first;
}

auto find(value_type const& v) const {
auto er = equal_range(v);
return er.first == er.second? end() : er.first;
}

base_type& base() { return *static_cast<base_type*>(this); }
base_type const & base() const { return *static_cast<base_type const*>(this); }
private:
comparator& get_comp() { return *this; }
comparator const& get_comp() const { return *this; }
};
}

template <typename Left, typename Right,
typename LeftCmp = std::less<Left>, typename RightCmp = std::less<Right>,
typename RawAlloc = std::allocator<void>,
typename Traits = detail::bimap_traits<Left, Right, LeftCmp, RightCmp, RawAlloc>
>
class bimap : private detail::map_adaptor<typename Traits::left_traits> {
public:
using left_type = typename detail::map_adaptor<typename Traits::left_traits>;
using right_type = typename detail::map_adaptor<typename Traits::right_traits>;

using value_type = typename Traits::left_traits::value_type;
using allocator_type = typename Traits::left_traits::allocator_type;
using base_type = left_type;

using const_iterator = typename base_type::const_iterator;
using iterator = const_iterator;

using base_type::cbegin;
using base_type::cend;
auto begin() const { return cbegin(); }
auto end() const { return cend(); }
using base_type::size;

left_type const& left() const { return *this; }
right_type const& right() const { return inverse_index; }

std::pair<const_iterator, bool> insert(value_type const& v) {
auto lr = left().find(v);
auto rr = right().find(v);

bool hasl = lr!=left().end(),
hasr = rr!=right().end();

if (!hasl && !hasr) {
auto lins = mutable_left().insert(left().lower_bound(v), v);
auto rins = mutable_right().insert(right().lower_bound(*lins), *lins);
(void) rins;

return { lins, true };
} else {
return { end(), false };
}
}

private:
detail::map_adaptor<typename Traits::right_traits> inverse_index;
left_type& mutable_left() { return *this; }
right_type& mutable_right() { return inverse_index; }
};
}

#include <iostream>

#define CHECK(cond) do {\
if (cond) { } else { std::cout << "FAILED: " #cond "\n"; } } while(false)

int main() {
using Map = bimap::bimap<int, std::string>;
Map bm;
CHECK(bm.insert(std::make_tuple(1,"three")).second);

// now left 1 and right "three" are taken:
CHECK(!bm.insert(std::make_tuple(1,"two")).second);
CHECK(!bm.insert(std::make_tuple(2,"three")).second);

// unrelated keys insert fine
CHECK(bm.insert(std::make_tuple(2,"aaaa")).second);

// thing contains 2 elements now:
CHECK(bm.size() == 2);

using std::get;

for (Map::value_type const& p : bm) std::cout << get<0>(p) << ", " << get<1>(p) << "; "; std::cout << "\n";
for (Map::value_type const& p : bm.left()) std::cout << get<0>(p) << ", " << get<1>(p) << "; "; std::cout << "\n";

// right view map orders by right index
for (Map::value_type const& p : bm.right()) std::cout << get<0>(p) << ", " << get<1>(p) << "; "; std::cout << "\n";

// you can do find, lower_bound, equal_range etc. on both sides
}

Prints:

1, three; 2, aaaa; 
1, three; 2, aaaa;
2, aaaa; 1, three;

Is it more efficient to have a map of maps, or a very large map?

Doing the arithmetic, if you have n elements lookup time would be O(log(n)). If you split the map into m1 maps containing maps of size m2, then you need O(log(m1)) to lookup in the first map, then O(log(m2)) to do a lookup on the second map. But since

log(m1) + log(m2) = log(m1*m2) = log(n)

You aren't buying much either way and you should probably code it whatever way best reveals your intentions.

As far as the actual performance, you should profile it and see if it makes a difference. It's possible that the two maps will be faster since each map will have a simpler comparator functions (they will each do one string comparison while the comparator for the monolithic map will have to do two for some pairs objects). Also, if in your application you are likely to be not finding a match on the first string for a lot of your lookups then you will doing just first lookup which might save some time.

On the other hand, if the sizes of the secondary maps are not uniform, then you could be effectively be doing two lookups. As a worst-case consider that half the keys have identical first strings, and the other keys all have district strings.



Related Topics



Leave a reply



Submit