defaultdict of defaultdict?
Yes, like this:
defaultdict(lambda: defaultdict(int))
The argument of a defaultdict (in this case lambda: defaultdict(int)) will be called when you try to access a key that doesn't exist. Its return value will be set as the new value of this key, which means in our case the value of d[Key_doesnt_exist] will be defaultdict(int).
If you try to access a key from this last defaultdict, i.e. d[Key_doesnt_exist][Key_doesnt_exist], it will return 0, which is the return value of the argument of the last defaultdict, i.e. int().
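As a small illustration of the two lookups described above (the keys 'fruit' and 'apple' are made up for this sketch):

```python
from collections import defaultdict

d = defaultdict(lambda: defaultdict(int))

# First-level miss: the lambda runs and its result is stored under 'fruit'.
inner = d['fruit']
print(type(inner).__name__)   # defaultdict

# Second-level miss: int() runs and 0 is stored under 'apple'.
print(d['fruit']['apple'])    # 0

# Both keys now exist, so later lookups do not call any factory.
print('fruit' in d, 'apple' in d['fruit'])  # True True
```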
Nested defaultdict of defaultdict
For an arbitrary number of levels:
from collections import defaultdict
import json

def rec_dd():
    return defaultdict(rec_dd)
>>> x = rec_dd()
>>> x['a']['b']['c']['d']
defaultdict(<function rec_dd at 0x7f0dcef81500>, {})
>>> print(json.dumps(x))
{"a": {"b": {"c": {"d": {}}}}}
Of course you could also do this with a lambda, but I find lambdas to be less readable. In any case it would look like this:
rec_dd = lambda: defaultdict(rec_dd)
defaultdict of defaultdict of int
Try this:
from collections import defaultdict
m = defaultdict(lambda: defaultdict(int))
m['a']['b'] += 1
Note if you want more depth, you can still use a recursive approach:
def ddict(some_type, depth=0):
    if depth == 0:
        return defaultdict(some_type)
    else:
        return defaultdict(lambda: ddict(some_type, depth-1))
m = ddict(int, depth=2)
m['a']['b']['c'] += 1
How does collections.defaultdict work?
Usually, a Python dictionary throws a KeyError if you try to get an item with a key that is not currently in the dictionary. The defaultdict, in contrast, will simply create any items that you try to access (provided, of course, they do not exist yet). To create such a "default" item, it calls the function object that you pass to the constructor (more precisely, it's an arbitrary "callable" object, which includes function and type objects). For the first example, default items are created using int(), which will return the integer object 0. For the second example, default items are created using list(), which returns a new empty list object.
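The examples referred to above are not reproduced here, so the following is only a sketch of the same two cases, with made-up input data:

```python
from collections import defaultdict

counts = defaultdict(int)    # missing keys start at int() == 0
for ch in "banana":
    counts[ch] += 1

groups = defaultdict(list)   # missing keys start at list() == []
for word in ["apple", "avocado", "beet"]:
    groups[word[0]].append(word)

print(dict(counts))  # {'b': 1, 'a': 3, 'n': 2}
print(dict(groups))  # {'a': ['apple', 'avocado'], 'b': ['beet']}

# Only d[key] triggers the factory; .get() and `in` do not create items.
print(counts.get("z"), "z" in counts)  # None False
```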
How to default a nested defaultdict() to a list of specified length?
You can specify that you want a default value of a defaultdict that has a default value of [0, 0, 0]:
from collections import defaultdict
dic01 = defaultdict(lambda: defaultdict(lambda: [0, 0, 0]))
dic01['year2018']['jul_to_sep_temperature'][0] = 25
print(dic01)
prints
defaultdict(<function <lambda> at 0x7f4dc28ac598>, {'year2018': defaultdict(<function <lambda>.<locals>.<lambda> at 0x7f4dc28ac510>, {'jul_to_sep_temperature': [25, 0, 0]})})
You can treat the result as a regular nested dictionary.
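If the list length should be a parameter, one possible sketch is a small factory function. The name nested_with_lists and its fill argument are hypothetical, not part of the original answer:

```python
from collections import defaultdict

def nested_with_lists(length, fill=0):
    """Hypothetical helper: two-level defaultdict whose leaves are fresh lists."""
    # [fill] * length is evaluated on every miss, so leaves are never shared.
    return defaultdict(lambda: defaultdict(lambda: [fill] * length))

dic = nested_with_lists(3)
dic['year2018']['jul_to_sep_temperature'][0] = 25
print(dic['year2018']['jul_to_sep_temperature'])  # [25, 0, 0]
print(dic['year2018']['oct_to_dec_temperature'])  # [0, 0, 0]
```

Because the list is built inside the lambda, changing one entry does not affect the defaults handed out for other keys.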
Nested `defaultdict of defaultdict of defaultdict` each with a backreference
Solution 1
Giving the same {BACKREF: node} to defaultdict:
from collections import defaultdict
BACKREF, END = 'BACKREF', '$'
words = ['hi', 'hello', 'hiya', 'hey']
tree = lambda: defaultdict(tree, {BACKREF: node})
node = None
root = tree()
for word in words:
    node = root
    for ch in word:
        node = node[ch]
    node[END] = None
The root node has a backref of None, which can be deleted if bothersome.
Solution 2
The above works fine if that code is the only code that creates tree nodes (seems likely to me, judging by the times I've built such trees myself). Otherwise you'd need to ensure that node points to the correct parent node. If that's an issue, here's an alternative with a dict (not defaultdict) subclass that implements __missing__ to automatically create children with backrefs when needed:
BACKREF, END = 'BACKREF', '$'
words = ['hi', 'hello', 'hiya', 'hey']
class Tree(dict):
    def __missing__(self, key):
        child = self[key] = Tree({BACKREF: self})
        return child

root = Tree()
for word in words:
    node = root
    for ch in word:
        node = node[ch]
    node[END] = None
This also doesn't give the root a backref, and because it is a dict, its string representations are far less cluttered than a defaultdict's and thus far more readable:
>>> import pprint
>>> pprint.pp(root)
{'h': {'BACKREF': <Recursion on Tree with id=2494556270320>,
       'i': {'BACKREF': <Recursion on Tree with id=2494556270400>,
             '$': None,
             'y': {'BACKREF': <Recursion on Tree with id=2494556270480>,
                   'a': {'BACKREF': <Recursion on Tree with id=2494556340608>,
                         '$': None}}},
       'e': {'BACKREF': <Recursion on Tree with id=2494556270400>,
             'l': {'BACKREF': <Recursion on Tree with id=2494556340288>,
                   'l': {'BACKREF': <Recursion on Tree with id=2494556340368>,
                         'o': {'BACKREF': <Recursion on Tree with id=2494556340448>,
                               '$': None}}},
             'y': {'BACKREF': <Recursion on Tree with id=2494556340288>,
                   '$': None}}}}
The defaultdict result for comparison:
>>> pprint.pp(root)
defaultdict(<function <lambda> at 0x000001A13760BE50>,
            {'BACKREF': None,
             'h': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                              {'BACKREF': <Recursion on defaultdict with id=1791930855152>,
                               'i': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                {'BACKREF': <Recursion on defaultdict with id=1791930855312>,
                                                 '$': None,
                                                 'y': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                  {'BACKREF': <Recursion on defaultdict with id=1791930912832>,
                                                                   'a': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                                    {'BACKREF': <Recursion on defaultdict with id=1791930913232>,
                                                                                     '$': None})})}),
                               'e': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                {'BACKREF': <Recursion on defaultdict with id=1791930855312>,
                                                 'l': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                  {'BACKREF': <Recursion on defaultdict with id=1791930912912>,
                                                                   'l': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                                    {'BACKREF': <Recursion on defaultdict with id=1791930912992>,
                                                                                     'o': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                                                      {'BACKREF': <Recursion on defaultdict with id=1791930913072>,
                                                                                                       '$': None})})}),
                                                 'y': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                  {'BACKREF': <Recursion on defaultdict with id=1791930912912>,
                                                                   '$': None})})})})
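To show what the backrefs can be used for, here is a sketch that walks from a node back up to the root and rebuilds the prefix leading to it. It reuses the Tree class from Solution 2; the helper path_to is hypothetical and not part of the original answer:

```python
BACKREF, END = 'BACKREF', '$'

class Tree(dict):
    def __missing__(self, key):
        child = self[key] = Tree({BACKREF: self})
        return child

def path_to(node):
    """Hypothetical helper: rebuild the prefix that leads to `node`."""
    chars = []
    while BACKREF in node:           # the root has no backref
        parent = node[BACKREF]
        # Find which key of the parent points at this node.
        chars.append(next(k for k, v in parent.items() if v is node))
        node = parent
    return ''.join(reversed(chars))

root = Tree()
for word in ['hi', 'hello', 'hiya', 'hey']:
    node = root
    for ch in word:
        node = node[ch]
    node[END] = None

print(path_to(root['h']['e']['l']))  # hel
```

The `in` test does not call __missing__, so checking for a backref never creates new children.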
Python Defaultdict with defined dictionary as default value shares the same dictionary between keys
To fix this, you simply need to move your counter_format dictionary construction into the lambda, so that a new counter_format dictionary is created each time you try to access a missing value in your innermost defaultdict:
from collections import defaultdict
userDict = defaultdict(
    lambda: defaultdict(lambda: {"correct": 0, "incorrect": 0})
)
userDict['a'][1]['correct'] += 1
userDict['b'][1]['correct'] += 4
userDict['b'][2]['correct'] += 1
print(userDict)
defaultdict(<function <lambda> at 0x7f0f6cb38550>,
            {'a': defaultdict(<function <lambda>.<locals>.<lambda> at 0x7f0f679424c0>,
                              {1: {'correct': 1, 'incorrect': 0}}),
             'b': defaultdict(<function <lambda>.<locals>.<lambda> at 0x7f0f4caaa430>,
                              {1: {'correct': 4, 'incorrect': 0},
                               2: {'correct': 1, 'incorrect': 0}})})
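The question's original code is not shown here, so the following is only a sketch of the presumed problem next to the fix. It assumes counter_format was built once outside the lambda:

```python
from collections import defaultdict

counter_format = {"correct": 0, "incorrect": 0}

# Shared: every missing key gets the very same dict object.
shared = defaultdict(lambda: counter_format)
shared['a']['correct'] += 1
print(shared['b'])                 # {'correct': 1, 'incorrect': 0}
print(shared['a'] is shared['b'])  # True

# Fresh: the dict literal is evaluated anew on each call.
fresh = defaultdict(lambda: {"correct": 0, "incorrect": 0})
fresh['a']['correct'] += 1
print(fresh['b'])                  # {'correct': 0, 'incorrect': 0}
```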
defaultdict with default value 1?
Short answer (as per Montaro's answer below)
defaultdict(lambda:1)
Long answer on how defaultdicts work
ht = {}
ht = defaultdict(lambda:0, ht)
defaultdicts are different from dicts in that when you try to access a regular dict with a key that does not exist, it raises a KeyError.
A defaultdict, however, doesn't raise an error: it creates the key for you instead. With which value? With the return value of the callable you passed as an argument. In this case, every new key will be created with the value 0 (which is the return value of the simple lambda function lambda:0), which also happens to be what int() returns, so in this case there would be no difference in changing the default function to int.
Breaking down this line in more detail: ht = defaultdict(lambda:0, ht)
The first argument is a function, which is a callable object. This is the function that will be called to create a new value for a nonexistent key. The second argument, ht, is optional and refers to the base dictionary that the new defaultdict will be built on. Therefore, if ht had some keys and values, the defaultdict would also have these keys with the corresponding values. If you tried to access these keys, you would get the old values.
However, if you did not pass the base dictionary, a brand new, empty defaultdict would be created, and thus all new keys accessed would get the default value returned from the callable.
(In this case, as ht is initially an empty dict, there would be no difference at all in doing ht = defaultdict(lambda:0), ht = defaultdict(int) or ht = defaultdict(lambda:0, ht): they would all build equivalent defaultdicts.)
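A short sketch of the base-dictionary behaviour described above, using a non-empty ht whose contents are made up for the example:

```python
from collections import defaultdict

ht = {'seen': 5}
ht = defaultdict(lambda: 0, ht)

print(ht['seen'])     # 5, the existing key keeps its old value
print(ht['missing'])  # 0, created by calling the lambda
print(dict(ht))       # {'seen': 5, 'missing': 0}

# The short answer from above: a default of 1 instead of 0.
print(defaultdict(lambda: 1)['anything'])  # 1
```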
How to convert defaultdict of defaultdicts [of defaultdicts] to dict of dicts [of dicts]?
You can recurse over the tree, replacing each defaultdict instance with a dict produced by a dict comprehension:
def default_to_regular(d):
    if isinstance(d, defaultdict):
        d = {k: default_to_regular(v) for k, v in d.items()}
    return d
Demo:
>>> from collections import defaultdict
>>> factory = lambda: defaultdict(factory)
>>> defdict = factory()
>>> defdict['one']['two']['three']['four'] = 5
>>> defdict
defaultdict(<function <lambda> at 0x103098ed8>, {'one': defaultdict(<function <lambda> at 0x103098ed8>, {'two': defaultdict(<function <lambda> at 0x103098ed8>, {'three': defaultdict(<function <lambda> at 0x103098ed8>, {'four': 5})})})})
>>> default_to_regular(defdict)
{'one': {'two': {'three': {'four': 5}}}}
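One way to check that the conversion did what was intended is to confirm that the result is a plain dict and that a missing key raises KeyError again. A minimal sketch, with the key 'typo' chosen only for illustration:

```python
from collections import defaultdict

def default_to_regular(d):
    if isinstance(d, defaultdict):
        d = {k: default_to_regular(v) for k, v in d.items()}
    return d

factory = lambda: defaultdict(factory)
defdict = factory()
defdict['one']['two'] = 5

plain = default_to_regular(defdict)
print(type(plain) is dict)         # True
print(type(plain['one']) is dict)  # True

try:
    plain['typo']                  # no longer silently created
except KeyError as exc:
    print('KeyError:', exc)        # KeyError: 'typo'
```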
Meaning of defaultdict(lambda: defaultdict(dict))
Let's resolve it from the inside out. Firstly, dict is the dictionary type. Like other types, calling it creates an instance (also known as an object) of that type. Secondly, a defaultdict is a type that takes a callable parameter: something that, when called, produces an item to put in the dictionary. This happens when an entry is accessed that was not present, instead of producing a KeyError like an ordinary dict. Thirdly, lambda is a way to create unnamed functions based on a single expression, so these two are similar (the second holds a function that knows its own name, the first doesn't):
y = lambda: defaultdict(dict)
def y():
    return defaultdict(dict)
And finally the whole thing is wrapped in another defaultdict. So the result is that x is a defaultdict that produces defaultdicts that produce dict instances. At the third level there aren't defaults anymore.
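A quick sketch of that three-level behaviour, assuming x is defined as in the title (the keys are arbitrary):

```python
from collections import defaultdict

x = defaultdict(lambda: defaultdict(dict))

x['a']['b']['c'] = 1      # levels 1 and 2 are created on demand
print(x['a']['b'])        # {'c': 1}
print(type(x['a']['b']))  # <class 'dict'>

try:
    x['a']['b']['zzz']    # level 3 is a plain dict
except KeyError:
    print('no default at the third level')
```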