Why dict.get(key) instead of dict[key]?
It allows you to provide a default value if the key is missing:
dictionary.get("bogus", default_value)
returns default_value (whatever you choose it to be), whereas
dictionary["bogus"]
would raise a KeyError.
If omitted, default_value is None, such that
dictionary.get("bogus") # <-- No default specified -- defaults to None
returns None, just like
dictionary.get("bogus", None)
would.
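A minimal sketch of the three cases described above. The dictionary `prices` and its contents are made up for illustration:

```python
# Hypothetical data, for illustration only.
prices = {"apple": 3}

# .get() returns the stored value when the key exists ...
present = prices.get("apple", 0)

# ... the supplied default when the key is missing ...
missing_with_default = prices.get("bogus", 0)

# ... and None when no default is given.
missing_without_default = prices.get("bogus")

# Subscripting a missing key raises KeyError instead.
try:
    prices["bogus"]
    raised = False
except KeyError:
    raised = True

print(present, missing_with_default, missing_without_default, raised)
```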
Why is key in dict() faster than dict.get(key) in Python3?
Using the dis.dis function from the linked question:
>>> import dis
>>> dis.dis(compile('d.get(key)', '', 'eval'))
  1           0 LOAD_NAME                0 (d)
              2 LOAD_METHOD              1 (get)
              4 LOAD_NAME                2 (key)
              6 CALL_METHOD              1
              8 RETURN_VALUE
>>> dis.dis(compile('key in d', '', 'eval'))
  1           0 LOAD_NAME                0 (key)
              2 LOAD_NAME                1 (d)
              4 COMPARE_OP               6 (in)
              6 RETURN_VALUE
We can clearly see that d.get(key) has to run one more step: the LOAD_METHOD step. Additionally, d.get must deal with more information. It has to:
- check for the presence of the key
- if it was found, return the value
- otherwise, return the specified default value (or None if no default was specified).
Also, from looking at the C code for in and the C code for .get, we can see that they are very similar.
int
PyDict_Contains(PyObject *op, PyObject *key)
{
    Py_hash_t hash;
    Py_ssize_t ix;
    PyDictObject *mp = (PyDictObject *)op;
    PyObject *value;

    if (!PyUnicode_CheckExact(key) ||
        (hash = ((PyASCIIObject *) key)->hash) == -1) {
        hash = PyObject_Hash(key);
        if (hash == -1)
            return -1;
    }
    ix = (mp->ma_keys->dk_lookup)(mp, key, hash, &value);
    if (ix == DKIX_ERROR)
        return -1;
    return (ix != DKIX_EMPTY && value != NULL);
}

static PyObject *
dict_get_impl(PyDictObject *self, PyObject *key, PyObject *default_value)
{
    PyObject *val = NULL;
    Py_hash_t hash;
    Py_ssize_t ix;

    if (!PyUnicode_CheckExact(key) ||
        (hash = ((PyASCIIObject *) key)->hash) == -1) {
        hash = PyObject_Hash(key);
        if (hash == -1)
            return NULL;
    }
    ix = (self->ma_keys->dk_lookup) (self, key, hash, &val);
    if (ix == DKIX_ERROR)
        return NULL;
    if (ix == DKIX_EMPTY || val == NULL) {
        val = default_value;
    }
    Py_INCREF(val);
    return val;
}
In fact, they are almost the same, but .get has more overhead and must return a value.
In the two functions shown above, both key in d and d.get reuse the cached hash of an exact str key and only call PyObject_Hash otherwise, so the difference does not seem to come from hashing. It comes from the call machinery: CALL_METHOD and LOAD_METHOD have much higher overhead than COMPARE_OP, which performs one of the built-in boolean operations. Note that COMPARE_OP will simply jump to the C implementation of the containment test.
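To check the claim on your own interpreter, a rough timing sketch with timeit could look like this. The dictionary, the key and the repetition count are arbitrary choices for illustration, and the absolute numbers will vary by machine and Python version:

```python
import timeit

# Hypothetical test data: a small dict and a key that is present.
d = {i: str(i) for i in range(1000)}
key = 500
namespace = {"d": d, "key": key}

# Both expressions find the key, but .get() also has to look up the
# method, call it, and hand back the value.
in_time = timeit.timeit("key in d", globals=namespace, number=200_000)
get_time = timeit.timeit("d.get(key)", globals=namespace, number=200_000)

print(f"key in d:   {in_time:.4f}s")
print(f"d.get(key): {get_time:.4f}s")
```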
Why did dict.get(key) work but not dict[key]?
The problem is mutability:
one_groups = dict.fromkeys(range(5), [])
This passes the same list as the value for all keys, so if you change one value, you change them all.
It's basically the same as saying:
tmp = []
one_groups = dict.fromkeys(range(5), tmp)
del tmp
If you want a new list for each key, you need to create it in a loop, either an explicit for loop or a dict comprehension:
one_groups = {key: [] for key in range(5)}
This evaluates [] (which is equivalent to list()) once for every key, so each value is a different list.
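The difference is easy to observe directly. A short sketch, using three keys instead of five and the made-up names `shared` and `separate`:

```python
# dict.fromkeys reuses the one list object for every key.
shared = dict.fromkeys(range(3), [])
shared[0].append("x")

# A dict comprehension evaluates [] once per key.
separate = {key: [] for key in range(3)}
separate[0].append("x")

print(shared)    # every value shows the appended item
print(separate)  # only the value for key 0 does
print(shared[0] is shared[1], separate[0] is separate[1])
```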
Why does get work? Because you explicitly take the current list, and + then builds a new result list. It doesn't matter whether you write
one_groups[x.count('1')] = one_groups.get(x.count('1')) + [x]
or
one_groups[x.count('1')] = one_groups[x.count('1')] + [x]
What matters is that there's a +.
I know everybody says a += b is just a = a + b, but the implementation may differ for optimisation. In the case of lists, += behaves like .extend: we know we want the result in the current variable, so creating a new list would be a waste of memory.
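A small sketch of that distinction, with identity checks to show which operation creates a new list. The variable names are only for illustration:

```python
a = [1, 2]
alias = a            # second name for the same list

a = a + [3]          # builds a new list and rebinds a
rebinding_kept_identity = a is alias     # alias still holds [1, 2]

b = [1, 2]
alias_b = b
b += [3]             # extends the existing list in place
inplace_kept_identity = b is alias_b     # alias_b sees the change

print(alias, rebinding_kept_identity)
print(alias_b, inplace_kept_identity)
```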
Python dictionaries - difference between dict.get(key) and dict.get(key, {})
The second parameter to dict.get is optional: it's what's returned if the key isn't found. If you don't supply it, it will return None.
So:
>>> d = {'a': 1, 'b': 2}
>>> print(d.get('c'))
None
>>> d.get('c', {})
{}
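One common reason to pass {} as the default, sketched here as an assumption about typical usage rather than something the question states, is chaining lookups into nested dictionaries. The `config` data is hypothetical:

```python
# Hypothetical nested configuration, for illustration only.
config = {"db": {"host": "localhost"}}

# With {} as the default, the second .get() is always called on a dict.
port = config.get("cache", {}).get("port")

# Without it, the first .get() returns None and the chain fails.
try:
    config.get("cache").get("port")
    failed = False
except AttributeError:
    failed = True

print(port, failed)
```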
Why do `key in dict` and `key in dict.keys()` have the same output?
To understand why key in dct returns the same result as key in dct.keys(), one needs to look into the past. Historically, in Python 2, one would test the existence of a key in dictionary dct with dct.has_key(key). This was changed in Python 2.2, when the preferred way became key in dct, which basically did the same thing:
In a minor related change, the in operator now works on dictionaries, so key in dict is now equivalent to dict.has_key(key)
The behaviour of in is implemented internally in terms of the __contains__ dunder method. Its behaviour is documented in the Python language reference - 3 Data Model:
object.__contains__(self, item)
Called to implement membership test operators. Should return true if item is in self, false otherwise. For mapping objects, this should consider the keys of the mapping rather than the values or the key-item pairs.
For objects that don’t define __contains__(), the membership test first tries iteration via __iter__(), then the old sequence iteration protocol via __getitem__(), see this section in the language reference.
(emphasis mine; dictionaries in Python are mapping objects)
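The lookup order quoted above can be observed with two toy classes. This is only a sketch; `WithContains` and `IterOnly` are hypothetical names, and the `iterated` flag exists purely to record whether `__iter__` was called:

```python
class WithContains:
    """Hypothetical class that answers membership tests directly."""

    def __init__(self, items):
        self._items = set(items)
        self.iterated = False

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        self.iterated = True
        return iter(self._items)


class IterOnly:
    """Hypothetical class without __contains__; `in` falls back to __iter__."""

    def __init__(self, items):
        self._items = list(items)
        self.iterated = False

    def __iter__(self):
        self.iterated = True
        return iter(self._items)


direct = WithContains("abc")
fallback = IterOnly("abc")

found_direct = "b" in direct        # uses __contains__, never iterates
found_fallback = "b" in fallback    # iterates until a match is found

print(found_direct, direct.iterated)
print(found_fallback, fallback.iterated)
```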
In Python 3, the has_key method was removed altogether, and the correct way to test for the existence of a key is now solely key in dict, as documented.
In contrast with the two above, key in dct.keys() has never been the correct way of testing whether a key exists in a dictionary.
The result of both your examples is indeed the same; however, key in dct.keys() is slightly slower on Python 3 and abysmally slow on Python 2.
key in dct returns true if key is found as a key in dct, in an almost constant-time operation: it does not matter whether there are two or a million keys, its time complexity is constant in the average case (O(1)).
dct.keys() creates a list of all keys in Python 2 and a view of the keys in Python 3; both of these objects understand key in x. With the Python 2 list it works as for any iterable: the values are iterated over and True is returned as soon as one value is equal to the given value (here key).
In practice, in Python 2 you'd find key in dct.keys() much slower than key in dct: key in dct.keys() scales linearly with the number of keys, so its time complexity is O(n), because both dct.keys(), which builds a list of all keys, and key in key_list are O(n).
In Python 3, key in dct.keys() won't be much slower than key in dct, as the view does not make a list of the keys and the lookup is still O(1). In practice it is slower by at least a constant amount, and it is 7 more characters to type, so there is usually no practical reason to use it, even on Python 3.
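On Python 3 the constant overhead can be measured with a rough timeit sketch like the one below. The dictionary size, the probe key and the repetition count are arbitrary assumptions, and the timings depend on the machine:

```python
import timeit

# Hypothetical test data.
dct = {i: None for i in range(100_000)}
probe = 99_999
namespace = {"dct": dct, "probe": probe}

# Both spellings give the same answer ...
same_result = (probe in dct) == (probe in dct.keys())

# ... but the second one also has to create the keys view on every test.
direct = timeit.timeit("probe in dct", globals=namespace, number=100_000)
via_keys = timeit.timeit("probe in dct.keys()", globals=namespace, number=100_000)

print(f"probe in dct:        {direct:.4f}s")
print(f"probe in dct.keys(): {via_keys:.4f}s")
```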
dict.get(key, default) vs dict.get(key) or default
There is a huge difference if your value is false-y:
>>> d = {'foo': 0}
>>> d.get('foo', 'bar')
0
>>> d.get('foo') or 'bar'
'bar'
You should not use or default if your values can be false-y.
On top of that, using or adds additional bytecode; a test and jump has to be performed. Just use dict.get(); there is no advantage to using or default here.
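The same effect applies to every false-y value, not just 0. A small sketch with made-up data, comparing both spellings across several stored values:

```python
# Hypothetical dictionary whose stored values are all false-y.
falsy_values = {"zero": 0, "empty_str": "", "empty_list": [], "false": False}

# The default is only used for missing keys, so stored values survive.
with_default = {k: falsy_values.get(k, "bar") for k in falsy_values}

# `or` replaces every false-y stored value with the fallback.
with_or = {k: falsy_values.get(k) or "bar" for k in falsy_values}

print(with_default)
print(with_or)
```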
Why does dict.get(key) run slower than dict[key]
Python has to do more work for dict.get():
- get is an attribute, so Python has to look this up, and then bind the descriptor found to the dictionary instance.
- () is a call, so the current frame has to be pushed on the stack, a call has to be made, then the frame has to be popped again from the stack to continue.
The [...] notation, used with a dict, doesn't require a separate attribute step or frame push and pop.
You can see the difference when you use the Python bytecode disassembler dis:
>>> import dis
>>> dis.dis(compile('d[key]', '', 'eval'))
  1           0 LOAD_NAME                0 (d)
              3 LOAD_NAME                1 (key)
              6 BINARY_SUBSCR
              7 RETURN_VALUE
>>> dis.dis(compile('d.get(key)', '', 'eval'))
  1           0 LOAD_NAME                0 (d)
              3 LOAD_ATTR                1 (get)
              6 LOAD_NAME                2 (key)
              9 CALL_FUNCTION            1
             12 RETURN_VALUE
So the d[key] expression only has to execute a BINARY_SUBSCR opcode, while d.get(key) adds a LOAD_ATTR opcode. CALL_FUNCTION is a lot more expensive than BINARY_SUBSCR on a built-in type (custom types with __getitem__ methods still end up doing a function call).
If the majority of your keys exist in the dictionary, you could use try...except KeyError to handle missing keys:
try:
    return mydict['name']
except KeyError:
    return None
Exception handling is cheap if there are no exceptions.
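A rough way to compare the two approaches when the key is present. The helper names `lookup_try` and `lookup_get`, the data and the repetition count are assumptions made for this sketch, and the timings will differ between machines and Python versions:

```python
import timeit

# Hypothetical data, for illustration only.
mydict = {"name": "value"}


def lookup_try(d, key):
    """Subscript first and fall back to None on KeyError."""
    try:
        return d[key]
    except KeyError:
        return None


def lookup_get(d, key):
    """Let dict.get supply the None default."""
    return d.get(key)


namespace = {"mydict": mydict, "lookup_try": lookup_try, "lookup_get": lookup_get}
try_time = timeit.timeit("lookup_try(mydict, 'name')", globals=namespace, number=200_000)
get_time = timeit.timeit("lookup_get(mydict, 'name')", globals=namespace, number=200_000)

print(f"try/except: {try_time:.4f}s")
print(f"dict.get:   {get_time:.4f}s")
```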
Why use dict.keys?
On Python 3, use dct.keys() to get a dictionary view object, which lets you do set operations on just the keys:
>>> for sharedkey in dct1.keys() & dct2.keys():  # intersection of two dictionaries
...     print(dct1[sharedkey], dct2[sharedkey])
In Python 2.7, you'd use dct.viewkeys() for that.
In Python 2, dct.keys() returns a list, a copy of the keys in the dictionary. This can be passed around as a separate object that can be manipulated in its own right, including removing elements without affecting the dictionary itself; however, you can create the same list with list(dct), which works in both Python 2 and 3.
You indeed don't want any of these for iteration or membership testing; always use for key in dct and key in dct for those, respectively.
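A short Python 3 sketch of the points above: set operations on key views, the view staying in sync with the dictionary, and list(dct) as an independent copy. The dictionaries are made up for illustration:

```python
# Hypothetical data, for illustration only.
dct1 = {"a": 1, "b": 2, "c": 3}
dct2 = {"b": 20, "c": 30, "d": 40}

# Key views support set operations and return plain sets.
shared = dct1.keys() & dct2.keys()
only_first = dct1.keys() - dct2.keys()
either = dct1.keys() | dct2.keys()

# A view is live: it reflects later changes to the dictionary.
keys_view = dct1.keys()
snapshot = list(dct1)          # independent copy of the keys
dct1["z"] = 26
view_sees_new_key = "z" in keys_view
snapshot_sees_new_key = "z" in snapshot

print(shared, only_first, either)
print(view_sees_new_key, snapshot_sees_new_key)
```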
Python dict.get() or None scenario
You don't need or None at all. dict.get returns None by default when it can't find the provided key in the dictionary.
It's a good idea to consult the documentation in these cases:
get(key[, default])
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.