How Does Collections.Defaultdict Work

How does collections.defaultdict work?

Usually, a Python dictionary raises a KeyError if you try to get an item with a key that is not currently in the dictionary. A defaultdict, in contrast, will simply create any item that you try to access (provided, of course, that it does not exist yet). To create such a "default" item, it calls the function object that you pass to the constructor (more precisely, an arbitrary "callable" object, which includes function and type objects). In the first example the factory is int, so default items are created using int(), which returns the integer object 0. In the second example the factory is list, so default items are created using list(), which returns a new empty list object.
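The question's own examples are not reproduced here, so the following is only a minimal sketch of the two factories mentioned above. The variable names and sample data are made up for illustration.

```python
from collections import defaultdict

# int() returns 0, so a missing key starts at 0: convenient for counting.
counts = defaultdict(int)
for letter in "mississippi":
    counts[letter] += 1

# list() returns a new empty list, so a missing key starts as []:
# convenient for grouping.
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)

print(dict(counts))
print(dict(groups))
```

Neither loop checks whether the key exists first; the first access to each new key creates it.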

Understanding the use of defaultdict in Python

From the documentation of defaultdict:

If default_factory is not None, it is called without arguments to provide a default value for the given key, this value is inserted in the dictionary for the key, and returned.

Since "Joel" doesn't exist as a key yet, the dd_dict["Joel"] part creates an empty dictionary as the value for the key "Joel". The following part, ["City"] = "Seattle", is just like adding a normal key-value pair to a dictionary, in this case to the dd_dict["Joel"] dictionary.
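A short sketch of that statement, assuming dd_dict was created with dict as its factory (the original question's setup is not shown here):

```python
from collections import defaultdict

# Assumption: dd_dict uses dict as its default factory.
dd_dict = defaultdict(dict)

# Step 1: dd_dict["Joel"] is missing, so dict() is called and {} is stored.
# Step 2: ["City"] = "Seattle" is an ordinary assignment on that inner dict.
dd_dict["Joel"]["City"] = "Seattle"

print(dd_dict["Joel"])
```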

What is the difference between dict and collections.defaultdict?

The difference is that a defaultdict will "default" a value if that key has not been set yet. If you didn't use a defaultdict you'd have to check to see if that key exists, and if it doesn't, set it to what you want.

The lambda is defining a factory for the default value. That function gets called whenever it needs a default value. You could hypothetically have a more complicated default function.
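To make the comparison concrete, here is a small sketch that counts words both ways. The word list and variable names are illustrative only, and the lambda stands in for whatever factory the original question used.

```python
from collections import defaultdict

words = ["a", "b", "a"]

# Plain dict: check for the key and initialise it yourself.
plain = {}
for w in words:
    if w not in plain:
        plain[w] = 0
    plain[w] += 1

# defaultdict: the lambda is the factory, called with no arguments
# whenever a missing key is looked up with [].
auto = defaultdict(lambda: 0)
for w in words:
    auto[w] += 1

print(plain, dict(auto))
```

Any zero-argument callable works as the factory, so the lambda could just as well build a more elaborate default object.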

Help on class defaultdict in module collections:

class defaultdict(__builtin__.dict)
 |  defaultdict(default_factory) --> dict with default factory
 |  
 |  The default factory is called without arguments to produce
 |  a new value when a key is not present, in __getitem__ only.
 |  A defaultdict compares equal to a dict with the same items.
 |

(from help(type(collections.defaultdict())))

{}.setdefault is similar in nature, but takes a value instead of a factory function. It sets the value for a key only if that key doesn't already exist, and you have to call it explicitly at each access, which makes it a bit different.
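A rough side-by-side of the two approaches, using made-up keys and values:

```python
from collections import defaultdict

# dict.setdefault: the default value ([] here) is passed at every call site.
plain = {}
plain.setdefault("fruits", []).append("apple")
plain.setdefault("fruits", []).append("pear")

# defaultdict: the factory is given once, at construction time.
auto = defaultdict(list)
auto["fruits"].append("apple")
auto["fruits"].append("pear")

print(plain, dict(auto))
```

Note that with setdefault the default argument is evaluated on every call, even when the key already exists, whereas the defaultdict factory runs only for missing keys.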

How does collections.defaultdict.get work in the max statement's key parameter? (Python)

max accepts a keyword argument -- a "key" function. e.g.:

max(iterable, key=some_function)

(Which, I'm guessing, is how you're using it, instead of max(iterable, function).)

The "key" function will be called for every element in the iterable and the result of the "key" function is used to compare elements.

So, in your case, the element for which d.get returns the maximal value will be returned.

d is your defaultdict. d.get(key) returns the value associated with that key -- and the things which are getting passed to it are keys that are in d. So you're picking out the key which has the maximal value.
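As an illustration, assuming d maps words to counts (the original data is not shown, so the sample below is invented):

```python
from collections import defaultdict

# Hypothetical data: count how often each word appears.
d = defaultdict(int)
for word in ["spam", "eggs", "spam", "ham", "spam", "eggs"]:
    d[word] += 1

# Iterating over d yields its keys; d.get(key) supplies the value that
# max() compares, so the key with the largest value is returned.
most_common = max(d, key=d.get)

print(most_common)
```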

`dict.pop` ignores the default value set by `collections.defaultdict(default_factory)`

The documentation you linked states that:

It overrides one method [__missing__()] and adds one writable instance variable [default_factory]. The remaining functionality is the same as for the dict class and is not documented here.

That is further specified under the __missing__() method itself, which is called by __getitem__() on a dict:

Note that __missing__() is not called for any operations besides __getitem__(). This means that get() will, like normal dictionaries, return None as a default rather than using default_factory.

So not only will pop() behave the same as on a normal dict; get() will too. The only way to get the default value is to use [key] directly on your dict. And if we think about it, that is definitely the most relevant call on a dict.

In summary, defaultdict will make d['nonexistent-key'] return your default value; anything else should behave the same as on a normal dict.
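The difference between get(), pop(), and [] can be checked with a few lines. The key names and fallback value here are arbitrary.

```python
from collections import defaultdict

d = defaultdict(int)

# get() does not go through __missing__: it returns None and stores nothing.
got = d.get("missing")
stored_after_get = "missing" in d

# pop() with an explicit fallback returns that fallback, not the factory value.
popped = d.pop("missing", "fallback")

# pop() without a fallback raises KeyError, as on a plain dict.
try:
    d.pop("missing")
    pop_raised = False
except KeyError:
    pop_raised = True

# Only [] triggers the factory and inserts the key.
indexed = d["missing"]
stored_after_index = "missing" in d

print(got, popped, pop_raised, indexed)
```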

defaultdict of defaultdict?

Yes, like this:

defaultdict(lambda: defaultdict(int))

The argument of a defaultdict (in this case lambda: defaultdict(int)) will be called when you try to access a key that doesn't exist. Its return value will be set as the new value of this key, which means in our case the value of d[Key_doesnt_exist] will be defaultdict(int).

If you try to access a key from this last defaultdict i.e. d[Key_doesnt_exist][Key_doesnt_exist] it will return 0, which is the return value of the argument of the last defaultdict i.e. int().
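A brief sketch of the nested form in use; the keys are placeholders.

```python
from collections import defaultdict

d = defaultdict(lambda: defaultdict(int))

# The outer lookup creates an inner defaultdict(int);
# the inner lookup creates 0, which is then incremented.
d["a"]["x"] += 1

# Reading a missing key at both levels yields 0 and creates both entries.
value = d["new"]["also_new"]

print(d["a"]["x"], value)
```

Because reads also insert keys, d now contains "new" even though nothing was explicitly assigned to it.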

python collections.defaultdict() compile error

You overwrote the built-in name list, which is the name of a type, with your list = ['aema', 'airplane', 'amend'] above. Rename your list to e.g. keys or keylist and all will be fine.

So replace

list = ['aema', 'airplane', 'amend']

with

keys = ['aema', 'airplane', 'amend']

and

for x in list:

with

for x in keys:
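The original script is not shown, so the sketch below assumes the failing call was defaultdict(list) and that the loop grouped the words by first letter. The function names are invented, and the shadowing is kept inside a function so the built-in list stays intact elsewhere.

```python
from collections import defaultdict

def shadowed():
    # Rebinding the name list hides the built-in type within this scope.
    list = ['aema', 'airplane', 'amend']
    try:
        # list is now a list instance, which is not callable,
        # so it is rejected as a default factory.
        defaultdict(list)
    except TypeError as exc:
        return exc
    return None

def renamed():
    keys = ['aema', 'airplane', 'amend']
    d = defaultdict(list)  # list is the built-in type again
    for x in keys:
        d[x[0]].append(x)  # assumed grouping, for illustration only
    return d

print(repr(shadowed()))
print(dict(renamed()))
```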

How does object work in the collections.defaultdict example %(object)s?

The string formatting line told Python to plug the value of d['object'] into the string. The way a defaultdict works is that if you refer to a key that is not there, it will create an entry with that key and the default value from the factory you gave it. So in this case, when the format string referred to d['object'], the defaultdict created an entry with a key of 'object' and a value of '<missing>' and duly plugged the value into the string.

I would guess that the output of the contents of d you showed was produced before the format string reference created the 'object': '<missing>' entry.
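This can be reproduced with a small sketch. It uses a lambda as the factory for brevity, which may differ from the factory in the example being discussed, and the snapshot variable names are made up.

```python
from collections import defaultdict

# Assumption: a factory that always returns the string '<missing>'.
d = defaultdict(lambda: "<missing>")
d.update(name="John", action="ran")

before = dict(d)  # snapshot taken before formatting

# %-formatting with a mapping looks up d['object'],
# which triggers the factory and inserts the key.
message = "%(name)s %(action)s to %(object)s" % d

after = dict(d)  # snapshot taken after formatting

print(message)
print(before)
print(after)
```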

For bonus points, so-called "Old Style" string formatting operations with % are documented at https://docs.python.org/2/library/stdtypes.html#string-formatting. Different string formatting facilities were introduced in Python 3.
