What Would a "Frozen Dict" Be

What would a frozen dict be?

Python doesn't have a builtin frozendict type. It turns out this wouldn't be useful too often (though it would still probably be useful more often than frozenset is).

The most common reason to want such a type is when memoizing function calls for functions with unknown arguments. The most common solution to store a hashable equivalent of a dict (where the values are hashable) is something like tuple(sorted(kwargs.items())).

This depends on the sorting not being a bit insane. Python cannot positively promise sorting will result in something reasonable here. (But it can't promise much else, so don't sweat it too much.)


You could easily enough make some sort of wrapper that works much like a dict. It might look something like

import collections

class FrozenDict(collections.Mapping):
"""Don't forget the docstrings!!"""

def __init__(self, *args, **kwargs):
self._d = dict(*args, **kwargs)
self._hash = None

def __iter__(self):
return iter(self._d)

def __len__(self):
return len(self._d)

def __getitem__(self, key):
return self._d[key]

def __hash__(self):
# It would have been simpler and maybe more obvious to
# use hash(tuple(sorted(self._d.iteritems()))) from this discussion
# so far, but this solution is O(n). I don't know what kind of
# n we are going to run into, but sometimes it's hard to resist the
# urge to optimize when it will gain improved algorithmic performance.
if self._hash is None:
hash_ = 0
for pair in self.items():
hash_ ^= hash(pair)
self._hash = hash_
return self._hash

It should work great:

>>> x = FrozenDict(a=1, b=2)
>>> y = FrozenDict(a=1, b=2)
>>> x is y
False
>>> x == y
True
>>> x == {'a': 1, 'b': 2}
True
>>> d = {x: 'foo'}
>>> d[y]
'foo'

A workaround for Python's missing frozen-dict type?

EDIT:

Starting from Python 3.6, dictionaries preserve the insertion order. Therefore, the workaround would vary depending on the used Python version.

For Python < 3.6 (Dictionaries do not preserve insertion order) - use frozenset, so that two sets are equal even if order is different:

>>> a = {'key1' : 'val1', 'key2' : 'val2'}
>>> b = frozenset(a.items())
>>> frozenset_restored_to_dict = dict(b)
>>> frozenset_restored_to_dict
{'key2': 'val2', 'key1': 'val1'}

Otherwise (Dictionaries preserve insertion order), use tuple. This way, the dictionary can be restored while preserving the order of items, yet, tuples with same items ordered differently will not be equal. A workaround for it would be passing the tuples to a frozenset constructor each time before a comparison is made.

>>> a = {'key1' : 'val1', 'key2' : 'val2'}
>>> b = tuple(a.items())
>>> tuple_restored_to_dict = dict(b)
>>> tuple_restored_to_dict
{'key1': 'val1', 'key2': 'val2'}

As can be seen in the code, b is a tuple, or a frozenset. Either are immutable and hashable, and can be totally restored to be a regular dictionary like a.

Is there a dictionary-like object that is immutable?

I mentioned it in the comments, that I'm not sure why this is needed.

But one could simply override __setitem__ of a dictionary class. Alltho this might (most likely) cause problems down the line. A minimal example of this would be:

class autodict(dict):
def __init__(self, *args, **kwargs):
super(autodict, self).__init__(*args, **kwargs)

def __getitem__(self, key):
val = dict.__getitem__(self, key)
return val

def __setitem__(self, key, val):
pass

x = autodict({'a' : 1, 'b' : 2})
x['c'] = 3
print(x)

Which will produce {'a': 1, 'b': 2} and thus ignoring the x['c'] = 3 set.


Some benefits

The speed difference is some where between 40-1000 times faster using dictionary inheritance compared to named tuples. (See below for crude speed tests)

The in operator works on dictionaries, not so well on named tuples when used like this:

'a' in nt == False
'a' in x == True

You can use key access dictionary style instead of (for lack of a better term) JavaScript style

x['a'] == nt.a

Although that's a matter of taste.

You also don't have to be picky about keys, since dictionaries support essentially any key identifier:

x[1] = 'a number'
nt = foo({1 : 'a number'})

Named tuples will result in Type names and field names must be valid identifiers: '1'


Optimizations (timing the thing)

Now, this is a crude example, and it would vary a lot depending on the system, the place of the moon in the sky etc.. But as a crude example:

import time
from collections import namedtuple

class autodict(dict):
def __init__(self, *args, **kwargs):
super(autodict, self).__init__(*args, **kwargs)
#self.update(*args, **kwargs)

def __getitem__(self, key):
val = dict.__getitem__(self, key)
return val

def __setitem__(self, key, val):
pass

def __type__(self, *args, **kwargs):
return dict

def foo(bar):
MyNamedTuple = namedtuple("MyNamedTuple", [k for k in bar.keys()])
d = {k: v for k, v in bar.items()}
return MyNamedTuple(**d)

start = time.time()
for i in range(1000000):
nt = foo({'x'+str(i) : i})
end = time.time()
print('Named tuples:', end - start,'seconds.')

start = time.time()
for i in range(1000000):
x = autodict({'x'+str(i) : i})
end = time.time()
print('Autodict:', end - start,'seconds.')

Results in:

Named tuples: 59.21987843513489 seconds.
Autodict: 1.4844810962677002 seconds.

The dictionary setup is in my book, insanely quicker. Although that most likely has to do with multiple for loops in the named tuple setup, and that can probably be easily remedied some how. But for basic understanding this is a big difference. The example obviously doesn't test larger one-time-creations or access times. Just, "what if you use these options to create data-sets over a period of time, how much time would you loose" :)

Bonus: What if you have a large base dictionary, and want to freeze it?

base_dict = {'x'+str(i) : i for i in range(1000000)}

start = time.time()
nt = foo(base_dict)
end = time.time()
print('Named tuples:', end - start,'seconds.')

start = time.time()
x = autodict(base_dict)
end = time.time()
print('Autodict:', end - start,'seconds.')

Well, the difference was bigger than I expected.. x1038.5 times faster.

(I was using the CPU for other stuff, but I think this is fair game)

Named tuples: 154.0662612915039 seconds.
Autodict: 0.1483476161956787 seconds.

Immutable/Frozen Dictionary subclass with fixed key, value types

This seem to work just fine for me (tested with Python 3.6 and 3.8):

from collections import UserDict
from typing import Mapping

class WorkflowParams(UserDict):
def __init__(self, __dict: Mapping[str, str]) -> None:
super().__init__()
for key, value in __dict.items():
super().__setitem__(key, value)

def __setitem__(self, key: str, item: str) -> None:
raise AttributeError("WorkflowParams is immutable.")

def __delitem__(self, key: str) -> None:
raise AttributeError("WorkflowParams is immutable.")

workflow_parameters = WorkflowParams(
{
"s3-bucket": "my-big-bucket",
"input-val": "1",
}
)

print(workflow_parameters)
# output: {'s3-bucket': 'my-big-bucket', 'input-val': '1'}

workflow_parameters['test'] = 'dummy'
# expected exception: AttributeError: WorkflowParams is immutable.

Is it safe to use frozen set as Dict key?

are there cases where two sets of same elements happen to add two entries in Dict?

No. frozenset hashing algorithm doesn't depend on the order of the elements, only on elements themselves. Two FS'es with the same elements are equal and have equal hashes, thus satisfying both criteria for "dict identity", in other words, they are the same dict key:

>>> a = frozenset([1,1,1,1,2,3])
>>> b = frozenset([3,3,3,3,2,1])
>>> {a:1, b:2}
{frozenset([1, 2, 3]): 2}

Python: Freeze dict keys after creation

Maybe something like this:

class FreezableDict (dict):
__frozen = False

def freeze (self):
self.__frozen = True

def __setitem__ (self, key, value):
if self.__frozen and key not in self:
raise ValueError('Dictionary is frozen')
super().__setitem__(key, value)
>>> x = FreezableDict({'foo': 'bar', 'baz': 'bla'})
>>> x
{'baz': 'bla', 'foo': 'bar'}
>>> x['asdf'] = 'fdsa'
>>> x
{'asdf': 'fdsa', 'baz': 'bla', 'foo': 'bar'}
>>> x.freeze()
>>> x['hello'] = 'world'
Traceback (most recent call last):
File "<pyshell#20>", line 1, in <module>
x['hello'] = 'world'
File "<pyshell#13>", line 8, in __setitem__
raise ValueError('Dictionary is frozen')
ValueError: Dictionary is frozen

Note that you might want to overwrite other methods too, including __delitem__, update, setdefault, pop, and popitem, as they can all modify the dictionary.


If you are interested in locking the dictionary completely, you could use types.MappingProxyType which provides a read-only view onto your dictionary. Once you have created your normal dictionary, you can then just create a mapping proxy of it which simply does not have any of the assignment/update functionality. You can also then get rid of any reference to the original dictionary (the mapping will keep one), to prevent it from being used to update it any further:

>>> x = {'foo': 'bar'}
>>> y = types.MappingProxyType(x)
>>> y
mappingproxy({'foo': 'bar'})
>>> x['baz'] = 'bla'
>>> y
mappingproxy({'baz': 'bla', 'foo': 'bar'})
>>> y['hello'] = 'world'
Traceback (most recent call last):
File "<pyshell#55>", line 1, in <module>
y['hello'] = 'world'
TypeError: 'mappingproxy' object does not support item assignment
>>> del x
>>> y
mappingproxy({'baz': 'bla', 'foo': 'bar'})

Or just in a single line, without ever having a reference to the original dictionary:

>>> x = types.MappingProxyType({'foo': 'bar', 'baz': 'bla'})
>>> x
mappingproxy({'baz': 'bla', 'foo': 'bar'})
>>> x['hello'] = 'world'
Traceback (most recent call last):
File "<pyshell#60>", line 1, in <module>
x['hello'] = 'world'
TypeError: 'mappingproxy' object does not support item assignment

What would a frozen dict be?

Python doesn't have a builtin frozendict type. It turns out this wouldn't be useful too often (though it would still probably be useful more often than frozenset is).

The most common reason to want such a type is when memoizing function calls for functions with unknown arguments. The most common solution to store a hashable equivalent of a dict (where the values are hashable) is something like tuple(sorted(kwargs.items())).

This depends on the sorting not being a bit insane. Python cannot positively promise sorting will result in something reasonable here. (But it can't promise much else, so don't sweat it too much.)


You could easily enough make some sort of wrapper that works much like a dict. It might look something like

import collections

class FrozenDict(collections.Mapping):
"""Don't forget the docstrings!!"""

def __init__(self, *args, **kwargs):
self._d = dict(*args, **kwargs)
self._hash = None

def __iter__(self):
return iter(self._d)

def __len__(self):
return len(self._d)

def __getitem__(self, key):
return self._d[key]

def __hash__(self):
# It would have been simpler and maybe more obvious to
# use hash(tuple(sorted(self._d.iteritems()))) from this discussion
# so far, but this solution is O(n). I don't know what kind of
# n we are going to run into, but sometimes it's hard to resist the
# urge to optimize when it will gain improved algorithmic performance.
if self._hash is None:
hash_ = 0
for pair in self.items():
hash_ ^= hash(pair)
self._hash = hash_
return self._hash

It should work great:

>>> x = FrozenDict(a=1, b=2)
>>> y = FrozenDict(a=1, b=2)
>>> x is y
False
>>> x == y
True
>>> x == {'a': 1, 'b': 2}
True
>>> d = {x: 'foo'}
>>> d[y]
'foo'

How can I convert a dictionary that has frozensets as keys and integers as values into a list of dictionaries?

You can't have lists as dictionary keys as they are a mutable type and only immutable types can be used as dict keys.

If you'd like to know more, find out: why dict keys must be immutable.

Convert a frozenset to a dictionary in python

You can use dict comprehension with enumerate:

f_set = [frozenset({8, 14, 15, 18}), frozenset({1, 2, 3, 7, 8}), frozenset({0, 4, 5})]

dict1 = {x: i for i, s in enumerate(f_set) for x in s}
print(dict1)
# {8: 1, 18: 0, 14: 0, 15: 0, 1: 1, 2: 1, 3: 1, 7: 1, 0: 2, 4: 2, 5: 2}

Note that, if the sets are not mutually disjoint, some keys will be discarded, since a dict cannot have duplicate keys.



Related Topics



Leave a reply



Submit