Best implementation of dual-side dict in Python
You could use the bidict package, which provides a bidirectional map. The syntax looks as follows (taken from the documentation):
>>> from bidict import bidict
>>> element_by_symbol = bidict(H='hydrogen')
>>> element_by_symbol
bidict({'H': 'hydrogen'})
>>> element_by_symbol['H']
'hydrogen'
>>> element_by_symbol.inv
bidict({'hydrogen': 'H'})
>>> element_by_symbol.inv['hydrogen']
'H'
>>> element_by_symbol.inv.inv is element_by_symbol
True
Or you can implement it yourself, for example using one of the solutions provided here.
Is there a more efficient implementation for a bidirectional map?
There is a certain problem with double-storing your data in all simple implementations of a bimap. If you can break it down to a bimap of pointers from outside, then you can readily ignore this and simply keep both maps of the form std::map<A*,B*>, like Arkaitz Jimenez already suggested (though, contrary to his answer, you have to take care of the storage from outside to avoid an A->A* lookup). But if you have the pointers anyway, why not simply store a std::pair<A,B> at the point where you would otherwise store A and B separately?
It would be nice to have std::map<A,B*> instead of std::map<A*,B*>, as this would allow, for example, looking up the element associated with a string by a newly created string with the same content, instead of by the pointer to the original string that created the pair. But it is customary to store a full copy of the key with every entry and only rely on the hash to find the right bucket. This way the returned item will be the correct one even in the case of a hash collision...
If you want it quick and dirty, though, there is this hackish solution:
Create two maps, std::map<size_t, A> mapA and std::map<size_t, B> mapB. Upon insertion, hash both elements that are to be inserted to get the keys to the respective maps.
void insert(const A &a, const B &b) {
    // std::hash is a function object type, so it has to be instantiated before the call
    size_t hashA = std::hash<A>{}(a);
    size_t hashB = std::hash<B>{}(b);
    mapA.insert({hashB, a}); // a is found via the hash of b
    mapB.insert({hashA, b}); // b is found via the hash of a
}
Lookup is implemented analogously.
Using a multimap instead of a map and verifying every element you get with a lookup in the respective other map (get a candidate b from mapB, hash b, and check in mapA whether it matches the wanted key; otherwise iterate to the next candidate b), this is a valid implementation, but still hackish in my opinion...
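To make the lookup and the verification step concrete, here is a rough Python transliteration of the hackish scheme. It is only a sketch: the class name HashedBimap and its method names are made up for illustration, the built-in hash() stands in for std::hash, and a dict of lists stands in for the multimap. Python's dict already hashes its keys internally, so this is purely a demonstration of the idea, not something you would use in practice.

```python
from collections import defaultdict


class HashedBimap:
    """Illustrative only: stores each element under the hash of its partner."""

    def __init__(self):
        # Plays the role of mapA: hash(b) -> list of candidate a values.
        self._a_by_hash_of_b = defaultdict(list)
        # Plays the role of mapB: hash(a) -> list of candidate b values.
        self._b_by_hash_of_a = defaultdict(list)

    def insert(self, a, b):
        self._a_by_hash_of_b[hash(b)].append(a)
        self._b_by_hash_of_a[hash(a)].append(b)

    def lookup_b(self, a):
        # Every b stored under hash(a) is only a candidate, because another
        # key may share the same hash value.
        for b in self._b_by_hash_of_a.get(hash(a), ()):
            # Verify the candidate through the other map: the bucket for
            # hash(b) must contain an element equal to the wanted key.
            if any(x == a for x in self._a_by_hash_of_b.get(hash(b), ())):
                return b
        raise KeyError(a)

    def lookup_a(self, b):
        # Mirror image of lookup_b.
        for a in self._a_by_hash_of_b.get(hash(b), ()):
            if any(x == b for x in self._b_by_hash_of_a.get(hash(a), ())):
                return a
        raise KeyError(b)
```

Note that the check cannot tell two pairs apart if both their a values and their b values collide at the same time, which is one more reason to call this hackish.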
You can get a much nicer solution by using the copies of the elements that are used to compare the entries (see above) as the only storage. It is a bit harder to get your head around, though. To elaborate:
A nicer solution:
Create two sets of pairs, std::set<pair<A, B*>> and std::set<pair<B, A*>>, and overload operator< and operator== to take only the first element of the pairs into account (or provide a corresponding comparison class). It is necessary to create sets of pairs instead of maps (which internally look similar) because we need a guarantee that A and B will be at constant positions in memory. Upon insertion of a pair<A, B> we split it into two elements that fit into the above sets.
std::set<pair<B, A*>> mapA;
std::set<pair<A, B*>> mapB;
void insert(const A &a, const B &b) {
    // create the first pair {b, nullptr}; its A pointer is filled in below
    auto aitr = mapA.insert({b, nullptr}).first;
    // get a pointer to our stored copy of b
    B *bp = &(aitr->first);
    // insert the second pair {a, pointer_to_b}
    auto bitr = mapB.insert({a, bp}).first;
    // get a pointer to our stored copy of a
    A *ap = &(bitr->first);
    // update the pointer in mapA to point to a (requires a mutable .second)
    aitr->second = ap;
}
Lookup can now be done with a simple std::set lookup and a pointer dereference.
This nicer solution is similar to the solution that Boost uses, even though they use some anonymized pointers as second elements of the pairs and thus have to use reinterpret_casts.
Note that the .second part of the pairs needs to be mutable (so I'm not sure std::pair can be used), or you have to add another layer of abstraction (std::set<pair<B, A**>> mapA) even for this simple insertion. In both solutions you need temporary elements to return non-const references to elements.
Two way/reverse map
You can create your own dictionary type by subclassing dict and adding the logic that you want. Here's a basic example:
class TwoWayDict(dict):
    def __setitem__(self, key, value):
        # Remove any previous connections with these values
        if key in self:
            del self[key]
        if value in self:
            del self[value]
        dict.__setitem__(self, key, value)
        dict.__setitem__(self, value, key)

    def __delitem__(self, key):
        dict.__delitem__(self, self[key])
        dict.__delitem__(self, key)

    def __len__(self):
        """Returns the number of connections"""
        return dict.__len__(self) // 2
And it works like so:
>>> d = TwoWayDict()
>>> d['foo'] = 'bar'
>>> d['foo']
'bar'
>>> d['bar']
'foo'
>>> len(d)
1
>>> del d['foo']
>>> d['bar']
Traceback (most recent call last):
  File "<stdin>", line 7, in <module>
KeyError: 'bar'
I'm sure I didn't cover all the cases, but that should get you started.
Two-way hash table
One name for this is a BiMap (as in bidirectional). The obvious limitation is that keys must be distinct (as in a normal dictionary/map), but so must the values.
For Java, there's a StackOverflow question on it, but the general recommendation is the Guava BiMap.
For C++, Boost has a Bimap.
Internally, it's the "inefficient" implementation you mention where it keeps two hashtables. Here's the thing: it is efficient, and using twice as much memory for a secondary lookup structure is expected, and rarely a big deal.
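As a rough illustration of the two-table approach, here is a minimal Python sketch. The class BiMap and its methods put, get and get_key are hypothetical names chosen for this example; this is not the actual code or API of Guava or Boost, just the general idea under the assumption that a value bound to another key should be rejected.

```python
class BiMap:
    """Hypothetical two-dict bidirectional map, for illustration only."""

    def __init__(self):
        self._forward = {}   # key -> value
        self._backward = {}  # value -> key

    def put(self, key, value):
        # Reject a value that is already bound to a different key,
        # so that values stay as distinct as keys.
        if value in self._backward and self._backward[value] != key:
            raise ValueError(f"value {value!r} is already mapped")
        # Drop the stale reverse entry when the key is being rebound.
        if key in self._forward:
            del self._backward[self._forward[key]]
        self._forward[key] = value
        self._backward[value] = key

    def get(self, key):
        return self._forward[key]

    def get_key(self, value):
        return self._backward[value]

    def __len__(self):
        return len(self._forward)


elements = BiMap()
elements.put("H", "hydrogen")
elements.put("He", "helium")
symbol = elements.get_key("helium")  # reverse lookup, also a single dict access
```

Both directions cost one hash lookup, and the extra memory is the second dict of references, which matches the point above that the overhead is expected and rarely a big deal.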
Python bidirectional mapping
If you want to use two dicts, you can try this to create the inverted dict:
b = {v: k for k, v in a.items()}
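A small example of what the inverted dict looks like, using made-up sample data. It also shows a caveat worth knowing: if two keys share a value, the comprehension silently keeps only the last key it sees.

```python
a = {"H": "hydrogen", "He": "helium"}

# Swap keys and values to get the reverse mapping.
b = {v: k for k, v in a.items()}

# With duplicate values, later entries overwrite earlier ones.
dup = {"x": 1, "y": 1}
inv_dup = {v: k for k, v in dup.items()}

print(b)
print(inv_dup)
```

Keep in mind that b is a snapshot: later changes to a are not reflected in it unless you rebuild it.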
build a dictionary for find key by value
You could try bidict:
>>> from bidict import bidict
>>> husbands2wives = bidict({'john': 'jackie'})
>>> husbands2wives['john'] # the forward mapping is just like with dict
'jackie'
>>> husbands2wives.inv['jackie'] # use .inv for the inverse mapping
'john'
Is there a better way to store a twoway dictionary than storing its inverse separate?
What I've done in the past is create a reversedict function, which takes a dict and returns the opposite mapping: either values to keys if I knew it was one-to-one (throwing an exception on seeing the same value twice), or values to lists of keys if it wasn't. That way, instead of having to construct two dicts at the same time each time I wanted the inverse lookup, I could create my dicts as normal and just call the generic reversedict function at the end.
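A function along those lines might look like the sketch below. The one_to_one parameter and the choice of ValueError are assumptions made for this example, not the original code.

```python
from collections import defaultdict


def reversedict(d, one_to_one=True):
    """Return the inverse of d.

    one_to_one=True:  values -> keys, raising ValueError on a repeated value.
    one_to_one=False: values -> lists of keys.
    """
    if one_to_one:
        result = {}
        for key, value in d.items():
            if value in result:
                raise ValueError(f"value {value!r} appears more than once")
            result[value] = key
        return result

    grouped = defaultdict(list)
    for key, value in d.items():
        grouped[value].append(key)
    return dict(grouped)


by_id = reversedict({"alice": 1, "bob": 2})
by_team = reversedict({"alice": "red", "bob": "red", "carol": "blue"},
                      one_to_one=False)
```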
However, it seems that the bidict solution that Jon mentioned in the comments is probably the better one. (My reversedict function seems to be his bidict's ~ operator.)