Defaultdict of Defaultdict

defaultdict of defaultdict?

Yes like this:

from collections import defaultdict
d = defaultdict(lambda: defaultdict(int))

The argument of a defaultdict (in this case lambda: defaultdict(int)) is called when you try to access a key that doesn't exist. Its return value is stored as the value for that key, which means that in our case d[Key_doesnt_exist] will be a defaultdict(int).

If you then access a missing key in this inner defaultdict, i.e. d[Key_doesnt_exist][Key_doesnt_exist], it returns 0, which is the return value of the inner defaultdict's argument, i.e. int().
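A minimal sketch of the two lookups described above. The variable names and the keys are illustrative only:

```python
from collections import defaultdict

d = defaultdict(lambda: defaultdict(int))

# Outer factory runs: a new defaultdict(int) is stored under "missing".
inner = d["missing"]
# Inner factory runs: int() returns 0, which is stored under "also".
value = d["missing"]["also"]

print(type(inner).__name__)  # defaultdict
print(value)                 # 0
print("missing" in d)        # True: the lookup itself created the key
```

Note that merely reading a missing key inserts it, so the dictionary grows as you probe it.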

Nested defaultdict of defaultdict

For an arbitrary number of levels:

import json
from collections import defaultdict

def rec_dd():
    return defaultdict(rec_dd)

>>> x = rec_dd()
>>> x['a']['b']['c']['d']
defaultdict(<function rec_dd at 0x7f0dcef81500>, {})
>>> print(json.dumps(x))
{"a": {"b": {"c": {"d": {}}}}}

Of course you could also do this with a lambda, but I find lambdas to be less readable. In any case it would look like this:

rec_dd = lambda: defaultdict(rec_dd)

defaultdict of defaultdict of int

Try this:

from collections import defaultdict

m = defaultdict(lambda: defaultdict(int))
m['a']['b'] += 1

Note that if you want more depth, you can still use a recursive approach:

def ddict(some_type, depth=0):
    if depth == 0:
        return defaultdict(some_type)
    else:
        return defaultdict(lambda: ddict(some_type, depth-1))

m = ddict(int, depth=2)
m['a']['b']['c'] += 1
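To make the meaning of depth concrete, here is a small sketch that reuses the ddict function above with two other settings. The variable names flat and deep are illustrative:

```python
from collections import defaultdict

def ddict(some_type, depth=0):
    if depth == 0:
        return defaultdict(some_type)
    return defaultdict(lambda: ddict(some_type, depth - 1))

flat = ddict(int)            # depth=0 behaves like defaultdict(int)
flat["a"] += 1

deep = ddict(list, depth=1)  # depth=1 gives two key levels, lists at the bottom
deep["a"]["b"].append(1)

print(dict(flat))            # {'a': 1}
print(deep["a"]["b"])        # [1]
```

In other words, depth counts the extra levels of nesting, so depth=n supports n + 1 keys before reaching a value of some_type.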

How does collections.defaultdict work?

Usually, a Python dictionary raises a KeyError if you try to get an item with a key that is not currently in the dictionary. A defaultdict, in contrast, simply creates any item that you try to access (provided, of course, that it does not exist yet). To create such a "default" item, it calls the function object that you pass to the constructor (more precisely, an arbitrary "callable" object, which includes function and type objects). With defaultdict(int), default items are created using int(), which returns the integer object 0. With defaultdict(list), default items are created using list(), which returns a new empty list object.
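The contrast can be sketched as follows. The names plain, counts and groups are made up for illustration:

```python
from collections import defaultdict

plain = {}
try:
    plain["x"]
    raised = False
except KeyError:
    raised = True            # a regular dict raises KeyError for a missing key

counts = defaultdict(int)    # default items come from int(), i.e. 0
groups = defaultdict(list)   # default items come from list(), i.e. a new []

counts["x"] += 1
groups["x"].append("first")

print(raised)                # True
print(dict(counts))          # {'x': 1}
print(dict(groups))          # {'x': ['first']}

# Only item access (d[key]) triggers the factory; .get() does not create keys.
print(counts.get("y"), "y" in counts)  # None False
```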

How to default a nested defaultdict() to a list of specified length?

You can make the outer defaultdict's default value another defaultdict whose own default value is [0, 0, 0]:

from collections import defaultdict

dic01 = defaultdict(lambda: defaultdict(lambda: [0, 0, 0]))

dic01['year2018']['jul_to_sep_temperature'][0] = 25

print(dic01)

prints

defaultdict(<function <lambda> at 0x7f4dc28ac598>, {'year2018': defaultdict(<function <lambda>.<locals>.<lambda> at 0x7f4dc28ac510>, {'jul_to_sep_temperature': [25, 0, 0]})})

You can treat the result as a regular nested dictionary.
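Because the inner lambda evaluates the list literal each time it is called, every missing key gets its own list rather than a shared one. A short sketch, where the second key name is hypothetical:

```python
from collections import defaultdict

dic01 = defaultdict(lambda: defaultdict(lambda: [0, 0, 0]))

dic01["year2018"]["jul_to_sep_temperature"][0] = 25
# Hypothetical second key: it receives a fresh [0, 0, 0] of its own.
untouched = dic01["year2018"]["oct_to_dec_temperature"]

print(dic01["year2018"]["jul_to_sep_temperature"])  # [25, 0, 0]
print(untouched)                                    # [0, 0, 0]
```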

Nested `defaultdict of defaultdict of defaultdict` each with a backreference

Solution 1

Giving the same {BACKREF: node} to defaultdict:

from collections import defaultdict

BACKREF, END = 'BACKREF', '$'
words = ['hi', 'hello', 'hiya', 'hey']

tree = lambda: defaultdict(tree, {BACKREF: node})
node = None
root = tree()
for word in words:
  node = root
  for ch in word:
    node = node[ch]
  node[END] = None

The root node gets a backref of None, which you can delete if it bothers you.

Solution 2

The above works fine if that code is the only code that creates tree nodes (seems likely to me, judging by the times I've built such trees myself). Otherwise you'd need to ensure that node points to the correct parent node. If that's an issue, here's an alternative with a dict (not defaultdict) subclass that implements __missing__ to automatically create children with backrefs when needed:

BACKREF, END = 'BACKREF', '$'
words = ['hi', 'hello', 'hiya', 'hey']

class Tree(dict):
    def __missing__(self, key):
        child = self[key] = Tree({BACKREF: self})
        return child

root = Tree()
for word in words:
  node = root
  for ch in word:
    node = node[ch]
  node[END] = None

This version also doesn't give the root a backref, and because it is a dict, its string representations are far less cluttered than a defaultdict's and thus far more readable:

>>> import pprint
>>> pprint.pp(root)
{'h': {'BACKREF': <Recursion on Tree with id=2494556270320>,
       'i': {'BACKREF': <Recursion on Tree with id=2494556270400>,
             '$': None,
             'y': {'BACKREF': <Recursion on Tree with id=2494556270480>,
                   'a': {'BACKREF': <Recursion on Tree with id=2494556340608>,
                         '$': None}}},
       'e': {'BACKREF': <Recursion on Tree with id=2494556270400>,
             'l': {'BACKREF': <Recursion on Tree with id=2494556340288>,
                   'l': {'BACKREF': <Recursion on Tree with id=2494556340368>,
                         'o': {'BACKREF': <Recursion on Tree with id=2494556340448>,
                               '$': None}}},
             'y': {'BACKREF': <Recursion on Tree with id=2494556340288>,
                   '$': None}}}}

The defaultdict result for comparison:

>>> pprint.pp(root)
defaultdict(<function <lambda> at 0x000001A13760BE50>,
            {'BACKREF': None,
             'h': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                              {'BACKREF': <Recursion on defaultdict with id=1791930855152>,
                               'i': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                {'BACKREF': <Recursion on defaultdict with id=1791930855312>,
                                                 '$': None,
                                                 'y': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                  {'BACKREF': <Recursion on defaultdict with id=1791930912832>,
                                                                   'a': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                                    {'BACKREF': <Recursion on defaultdict with id=1791930913232>,
                                                                                     '$': None})})}),
                               'e': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                {'BACKREF': <Recursion on defaultdict with id=1791930855312>,
                                                 'l': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                  {'BACKREF': <Recursion on defaultdict with id=1791930912912>,
                                                                   'l': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                                    {'BACKREF': <Recursion on defaultdict with id=1791930912992>,
                                                                                     'o': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                                                      {'BACKREF': <Recursion on defaultdict with id=1791930913072>,
                                                                                                       '$': None})})}),
                                                 'y': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                  {'BACKREF': <Recursion on defaultdict with id=1791930912912>,
                                                                   '$': None})})})})
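As an illustration of what the backrefs make possible, the sketch below uses the Tree class from Solution 2 and walks from a node back up to the root to recover the word that leads to it. The helper path_to_root is hypothetical and not part of the original answer:

```python
BACKREF, END = 'BACKREF', '$'

class Tree(dict):
    def __missing__(self, key):
        child = self[key] = Tree({BACKREF: self})
        return child

def path_to_root(node):
    """Hypothetical helper: rebuild the characters leading from the root to node."""
    chars = []
    while BACKREF in node:   # the root has no BACKREF entry, so the loop stops there
        parent = node[BACKREF]
        # Find the key under which the parent stores this node.
        chars.append(next(k for k, v in parent.items() if v is node))
        node = parent
    return ''.join(reversed(chars))

root = Tree()
node = root
for ch in 'hiya':
    node = node[ch]
node[END] = None

print(path_to_root(node))  # hiya
```

The membership test BACKREF in node does not trigger __missing__, so checking for the root creates no new children.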

Python Defaultdict with defined dictionary as default value shares the same dictionary between keys

To fix this, you simply need to move your counter_format dictionary construction into the lambda, so that a new counter_format dictionary is created each time you try to access a missing value in your innermost defaultdict:

from collections import defaultdict

userDict = defaultdict(lambda: 
    defaultdict(lambda: {"correct": 0, "incorrect": 0})
)

userDict['a'][1]['correct']+=1
userDict['b'][1]['correct']+=4
userDict['b'][2]['correct']+=1

print(userDict)
defaultdict(<function <lambda> at 0x7f0f6cb38550>,
            {'a': defaultdict(<function <lambda>.<locals>.<lambda> at 0x7f0f679424c0>,
                              {1: {'correct': 1, 'incorrect': 0}}),
             'b': defaultdict(<function <lambda>.<locals>.<lambda> at 0x7f0f4caaa430>,
                              {1: {'correct': 4, 'incorrect': 0},
                               2: {'correct': 1, 'incorrect': 0}})})
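For comparison, here is a sketch of the sharing problem next to the fix. The question's original code is not shown here, so the shared version is an assumption about its shape, reduced to one level of nesting:

```python
from collections import defaultdict

counter_format = {"correct": 0, "incorrect": 0}

# Assumed shape of the problem: the factory returns the one existing dict object.
shared = defaultdict(lambda: counter_format)
shared["a"]["correct"] += 1
print(shared["b"]["correct"])      # 1: "b" sees the change made through "a"
print(shared["a"] is shared["b"])  # True

# The fix: build a new dict inside the lambda on every call.
fresh = defaultdict(lambda: {"correct": 0, "incorrect": 0})
fresh["a"]["correct"] += 1
print(fresh["b"]["correct"])       # 0
```

If the template has to stay in a variable, returning a copy from the factory (for example lambda: dict(counter_format)) should have the same effect for a flat dictionary like this one.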

defaultdict with default value 1?

Short answer (as per Montaro's answer below)

defaultdict(lambda:1)

Long answer on how defaultdicts work

from collections import defaultdict

ht = {}
ht = defaultdict(lambda:0, ht)

defaultdicts differ from dict in that when you try to access a regular dict with a key that does not exist, it raises a KeyError.
defaultdict, however, doesn't raise an error: it creates the key for you instead. With which value? With the return value of the callable you passed as an argument. In this case, every new key will be created with value 0 (which is the return value of the simple lambda function lambda:0), which also happens to be what int() returns, so in this case changing the default function to int would make no difference.

Breaking down this line in more detail: ht = defaultdict(lambda:0, ht)

The first argument is a function, which is a callable object. This is the function that will be called to create a new value for a nonexistent key. The second argument, ht, is optional and refers to the base dictionary that the new defaultdict will be built on. Therefore, if ht had some keys and values, the defaultdict would also have these keys with the corresponding values. If you tried to access these keys, you would get the old values.
However, if you did not pass the base dictionary, a brand new defaultdict would be created, and thus, all new keys accessed would get the default value returned from the callable.

(In this case, as ht is initially an empty dict, there would be no difference at all between ht = defaultdict(lambda:0), ht = defaultdict(int) and ht = defaultdict(lambda:0, ht): they would all build the same defaultdict.)
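A small sketch of the second argument in action. The pre-existing key "seen" is made up for illustration:

```python
from collections import defaultdict

ht = {"seen": 3}                 # hypothetical pre-existing data
ht = defaultdict(lambda: 0, ht)

print(ht["seen"])     # 3: existing keys keep their old values
print(ht["unseen"])   # 0: missing keys get the callable's return value
print(dict(ht))       # {'seen': 3, 'unseen': 0}

# With no base data, the empty variants compare equal to each other and to {}.
print(defaultdict(lambda: 0) == defaultdict(int) == {})  # True
```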

How to convert defaultdict of defaultdicts [of defaultdicts] to dict of dicts [of dicts]?

You can recurse over the tree, replacing each defaultdict instance with a dict produced by a dict comprehension:

def default_to_regular(d):
    if isinstance(d, defaultdict):
        d = {k: default_to_regular(v) for k, v in d.items()}
    return d

Demo:

>>> from collections import defaultdict
>>> factory = lambda: defaultdict(factory)
>>> defdict = factory()
>>> defdict['one']['two']['three']['four'] = 5
>>> defdict
defaultdict(<function <lambda> at 0x103098ed8>, {'one': defaultdict(<function <lambda> at 0x103098ed8>, {'two': defaultdict(<function <lambda> at 0x103098ed8>, {'three': defaultdict(<function <lambda> at 0x103098ed8>, {'four': 5})})})})
>>> default_to_regular(defdict)
{'one': {'two': {'three': {'four': 5}}}}

Meaning of defaultdict(lambda: defaultdict(dict))

Let's resolve it from the inside out. Firstly, dict is the dictionary type. Like other types, calling it creates an instance (also known as an object) of that type. Secondly, defaultdict is a type that takes a callable parameter: something that, when called, produces an item to put in the dictionary. This happens when an entry that is not present is accessed, instead of raising a KeyError as an ordinary dict would. Thirdly, lambda is a way to create unnamed functions based on a single expression, so these two are similar (the second holds a function that knows its own name, the first doesn't):

y = lambda: defaultdict(dict)

def y():
    return defaultdict(dict)

And finally the whole thing is wrapped in another defaultdict. So the result is that x is a defaultdict that produces defaultdicts that produce dict instances. At the third level there aren't defaults anymore.
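The three levels can be checked with a short sketch. The keys used here are illustrative:

```python
from collections import defaultdict

x = defaultdict(lambda: defaultdict(dict))

x["a"]["b"]["c"] = 1               # levels one and two are created on demand
print(type(x["a"]).__name__)       # defaultdict
print(type(x["a"]["b"]).__name__)  # dict

try:
    x["a"]["b"]["missing"]         # the third level is a plain dict: no default
except KeyError as exc:
    print("KeyError:", exc)        # KeyError: 'missing'
```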