Subclassing Python dictionary to override __setitem__
I'm answering my own question, since I eventually decided that I really do want to subclass dict rather than create a new mapping class, and UserDict still defers to the underlying dict object in some cases rather than using the provided __setitem__.
After reading and re-reading the Python 2.6.4 source (mostly Objects/dictobject.c, but I grepped everywhere else to see where the various methods are used), my understanding is that the following code is sufficient to have my __setitem__ called every time the object is changed, and to otherwise behave exactly as a Python dict:
Peter Hansen's suggestion got me to look more carefully at dictobject.c, and I realised that the update method in my original answer could be simplified a bit, since the built-in dictionary constructor simply calls the built-in update method anyway. So the second update in my answer has been added to the code below (by some helpful person ;-).
class MyUpdateDict(dict):
    def __init__(self, *args, **kwargs):
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        # optional processing here
        super(MyUpdateDict, self).__setitem__(key, value)

    def update(self, *args, **kwargs):
        if args:
            if len(args) > 1:
                raise TypeError("update expected at most 1 arguments, "
                                "got %d" % len(args))
            other = dict(args[0])
            for key in other:
                self[key] = other[key]
        for key in kwargs:
            self[key] = kwargs[key]

    def setdefault(self, key, value=None):
        if key not in self:
            self[key] = value
        return self[key]
I've tested it with this code:
def test_updates(dictish):
    dictish['abc'] = 123
    dictish.update({'def': 234})
    dictish.update(red=1, blue=2)
    dictish.update([('orange', 3), ('green', 4)])
    dictish.update({'hello': 'kitty'}, black='white')
    dictish.update({'yellow': 5}, yellow=6)
    dictish.setdefault('brown', 7)
    dictish.setdefault('pink')
    try:
        # two positional arguments must be rejected, as with a plain dict
        dictish.update({'gold': 8}, [('purple', 9)], silver=10)
    except TypeError:
        pass
    else:
        raise RuntimeError("Error did not occur as planned")

python_dict = dict([('b', 2), ('c', 3)], a=1)
test_updates(python_dict)
my_dict = MyUpdateDict([('b', 2), ('c', 3)], a=1)
test_updates(my_dict)
and it passes. All other implementations I've tried have failed at some point. I'll still accept any answers that show me that I've missed something, but otherwise, I'm ticking the checkmark beside this one in a couple of days, and calling it the right answer :)
How to properly subclass dict and override __getitem__ & __setitem__
What you're doing should absolutely work. I tested out your class, and aside from a missing opening parenthesis in your log statements, it works just fine. There are only two things I can think of. First, is the output of your log statement set correctly? You might need to put a logging.basicConfig(level=logging.DEBUG) at the top of your script.
Second, __getitem__ and __setitem__ are only called during [] accesses. So make sure you only access DictWatch via d[key], rather than through methods such as d.get() or d.update(), which do not go through the overridden special methods.
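To see the second point in action, here is a small sketch that records which overrides actually run on a dict subclass in CPython. The class name LoggingDict and its calls list are made up for illustration; this is not the DictWatch class from the question.

```python
class LoggingDict(dict):
    """Hypothetical stand-in for DictWatch: records which overrides run."""

    def __init__(self, *args, **kwargs):
        self.calls = []
        super().__init__(*args, **kwargs)

    def __getitem__(self, key):
        self.calls.append(("get", key))
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.calls.append(("set", key))
        super().__setitem__(key, value)


d = LoggingDict(a=1)      # the built-in constructor does not use __setitem__
d["b"] = 2                # recorded
d["b"]                    # recorded
d.get("a")                # not recorded: dict.get bypasses __getitem__
d.update(c=3)             # not recorded: dict.update bypasses __setitem__
d.setdefault("e", 5)      # not recorded either

print(d.calls)            # [('set', 'b'), ('get', 'b')]
```

Only the two [] accesses show up, even though the dictionary ends up holding four keys.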
Overriding __setitem__ in Python dict subclass doesn't work for setting values of items inside
No, there is not; you'll have to use custom list objects for those and propagate the 'changed' flag up to the parent dictionary or use a central object to track changes.
Once you retrieve an object from a dictionary, the dictionary itself has nothing to do with it anymore; you could store that object in a separate reference and pass that around without the original dictionary being involved, for example.
The other option is to always require an explicit set on the object again:
lstobj = yourdict['A']
lstobj[1][1] = 'Sue'
# explicit set to notify `yourdict`
yourdict['A'] = lstobj
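The first option, a custom list that reports changes to its owner, could be sketched roughly as follows. NotifyingList, TrackingDict, on_change and changed are all invented names, only item assignment and append are covered, and only one level of nesting is handled, so treat it as an outline rather than a complete solution.

```python
class NotifyingList(list):
    """Hypothetical list that calls `on_change` after selected mutations."""

    def __init__(self, iterable=(), on_change=None):
        super().__init__(iterable)
        self._on_change = on_change

    def _changed(self):
        if self._on_change is not None:
            self._on_change()

    def __setitem__(self, index, value):
        super().__setitem__(index, value)
        self._changed()

    def append(self, value):
        super().append(value)
        self._changed()


class TrackingDict(dict):
    """Hypothetical dict that sets `changed` when it or a stored list changes."""

    def __init__(self):
        super().__init__()
        self.changed = False

    def _mark_changed(self):
        self.changed = True

    def __setitem__(self, key, value):
        if isinstance(value, list):
            # Wrap plain lists so that later in-place edits are reported.
            value = NotifyingList(value, on_change=self._mark_changed)
        super().__setitem__(key, value)
        self.changed = True


d = TrackingDict()
d["A"] = ["Bob", "Joe"]
d.changed = False          # reset after the initial assignment

d["A"][1] = "Sue"          # in-place edit of the stored list
print(d.changed)           # True
```

Other mutating methods (extend, insert, pop, sort, slice deletion and so on) would each need the same treatment, which is why the explicit re-assignment shown above is often the simpler choice.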
How to perfectly override a dict?
You can write an object that behaves like a dict quite easily with ABCs (Abstract Base Classes) from the collections.abc module. It even tells you if you missed a method, so below is the minimal version that shuts the ABC up.
from collections.abc import MutableMapping


class TransformedDict(MutableMapping):
    """A dictionary that applies an arbitrary key-altering
       function before accessing the keys"""

    def __init__(self, *args, **kwargs):
        self.store = dict()
        self.update(dict(*args, **kwargs))  # use the free update to set keys

    def __getitem__(self, key):
        return self.store[self._keytransform(key)]

    def __setitem__(self, key, value):
        self.store[self._keytransform(key)] = value

    def __delitem__(self, key):
        del self.store[self._keytransform(key)]

    def __iter__(self):
        return iter(self.store)

    def __len__(self):
        return len(self.store)

    def _keytransform(self, key):
        return key
You get a few free methods from the ABC:
class MyTransformedDict(TransformedDict):

    def _keytransform(self, key):
        return key.lower()


s = MyTransformedDict([('Test', 'test')])

assert s.get('TEST') is s['test']   # free get
assert 'TeSt' in s                  # free __contains__
                                    # free setdefault, __eq__, and so on

import pickle
# works too since we just use a normal dict
assert pickle.loads(pickle.dumps(s)) == s
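One consequence worth spelling out: the mixin methods that MutableMapping supplies (update, setdefault, and so on) are written in terms of the abstract methods, so they go through your own __setitem__. A small self-contained sketch, with RecordingMapping and its writes list as invented names for illustration:

```python
from collections.abc import MutableMapping


class RecordingMapping(MutableMapping):
    """Hypothetical mapping that records every key written via __setitem__."""

    def __init__(self, *args, **kwargs):
        self._store = {}
        self.writes = []
        self.update(*args, **kwargs)   # inherited mixin method

    def __getitem__(self, key):
        return self._store[key]

    def __setitem__(self, key, value):
        self.writes.append(key)
        self._store[key] = value

    def __delitem__(self, key):
        del self._store[key]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


m = RecordingMapping(a=1)   # constructor -> update -> __setitem__
m["b"] = 2
m.update({"c": 3})          # mixin update -> __setitem__
m.setdefault("d", 4)        # mixin setdefault -> __setitem__

print(m.writes)             # ['a', 'b', 'c', 'd']
```

Every write path is recorded, which is the behaviour a direct dict subclass does not give you without overriding each method by hand.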
I wouldn't subclass dict (or other builtins) directly. It often makes no sense, because what you actually want to do is implement the interface of a dict. And that is exactly what ABCs are for.
Subclass dict __getitem__ while maintaining original class type
How about making your MyDict just a proxy for the dict:
class MyDict(object):
    def __init__(self, data=None):
        # avoid a shared mutable default; keep a reference to the caller's object
        self.data = {} if data is None else data

    def __getitem__(self, key):
        if key == 'b':
            print('Found "b"')

        # wrap the result so that nested lookups are intercepted too
        return MyDict(self.data.__getitem__(key))

    def __setitem__(self, key, value):
        return self.data.__setitem__(key, value)

    def __repr__(self):
        return self.data.__repr__()

    # add more __magic__ methods as you wish

shallow_dict = MyDict({
    'b': 'value'
})

x = shallow_dict['b']

deep_dict = MyDict({
    'a': {
        'b': 'value'
    }
})

x = deep_dict['a']['b']

# assignment
deep_dict['a']['a'] = {'b': 'here'}
deep_dict['a']['a']['b']
print(deep_dict)
OUTPUT:
Found "b"
Found "b"
Found "b"
{'a': {'b': 'value', 'a': {'b': 'here'}}}
As you can see, when you get from self.data inside __getitem__, it just passes the result of self.data.__getitem__ by reference to a new MyDict object.
Overriding dict.update() method in subclass to prevent overwriting dict keys
Note that, per the documentation:
dict.update takes a single other parameter, "either another dictionary object or an iterable of key/value pairs" (I've used collections.abc.Mapping to test for this), and "If keyword arguments are specified, the dictionary is then updated with those key/value pairs"; and
dict() takes a single Mapping or Iterable along with optional **kwargs (the same as update accepts...).
from collections.abc import Mapping  # collections.Mapping was removed in Python 3.10


class DuplicateKeyError(KeyError):
    pass


class UniqueKeyDict(dict):

    def __init__(self, other=None, **kwargs):
        super().__init__()
        self.update(other, **kwargs)

    def __setitem__(self, key, value):
        if key in self:
            msg = 'key {!r} already exists with value {!r}'
            raise DuplicateKeyError(msg.format(key, self[key]))
        super().__setitem__(key, value)

    def update(self, other=None, **kwargs):
        if other is not None:
            for k, v in other.items() if isinstance(other, Mapping) else other:
                self[k] = v
        for k, v in kwargs.items():
            self[k] = v
In use:
>>> UniqueKeyDict((k, v) for k, v in ('a1', 'b2', 'c3', 'd4'))
{'c': '3', 'd': '4', 'a': '1', 'b': '2'}
>>> UniqueKeyDict((k, v) for k, v in ('a1', 'b2', 'c3', 'a4'))
Traceback (most recent call last):
  File "<pyshell#8>", line 1, in <module>
    UniqueKeyDict((k, v) for k, v in ('a1', 'b2', 'c3', 'a4'))
  File "<pyshell#7>", line 5, in __init__
    self.update(other, **kwargs)
  File "<pyshell#7>", line 15, in update
    self[k] = v
  File "<pyshell#7>", line 10, in __setitem__
    raise DuplicateKeyError(msg.format(key, self[key]))
DuplicateKeyError: "key 'a' already exists with value '1'"
and:
>>> ukd = UniqueKeyDict((k, v) for k, v in ('a1', 'b2', 'c3', 'd4'))
>>> ukd.update((k, v) for k, v in ('e5', 'f6'))  # single Iterable
>>> ukd.update({'h': 8}, g='7')  # single Mapping plus keyword args
>>> ukd
{'e': '5', 'f': '6', 'a': '1', 'd': '4', 'c': '3', 'h': 8, 'b': '2', 'g': '7'}
If you ever end up using this, I'd be inclined to give it a different __repr__ to avoid confusion!
How to dynamically override __setitem__? (no subclass)
For old-style classes, special methods were looked up on the instance each time they were needed. New-style classes only look up special methods on the type of the instance, not in the instance's dictionary itself -- that's why you see the behaviour you see.
(In case you don't know -- a new-style class is a class directly or indirectly derived from object.)
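A short sketch of that lookup rule in Python 3, where every class is new-style. Plain, upper_setitem and PlainUpper are hypothetical names used only for this illustration. The second attempt is one possible workaround, and it does technically create a subclass, just at run time and for a single instance.

```python
class Plain:
    """Minimal container with a class-level __setitem__."""

    def __init__(self):
        self.data = {}

    def __setitem__(self, key, value):
        self.data[key] = value


def upper_setitem(self, key, value):
    # Replacement behaviour: store keys upper-cased.
    self.data[key.upper()] = value


obj = Plain()

# Attempt 1: attach the function to the instance. The obj[...] = ... syntax
# looks up __setitem__ on type(obj), so this instance attribute is ignored.
obj.__setitem__ = upper_setitem.__get__(obj)
obj["a"] = 1
print(obj.data)   # {'a': 1}

# Attempt 2: give this one instance its own dynamically created class.
obj.__class__ = type("PlainUpper", (Plain,), {"__setitem__": upper_setitem})
obj["b"] = 2
print(obj.data)   # {'a': 1, 'B': 2}
```

Assigning to Plain.__setitem__ directly would also take effect, but it changes the behaviour of every instance of the class rather than just one.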