Subclassing Python Dictionary to Override __setitem__

Subclassing Python dictionary to override __setitem__

I'm answering my own question, since I eventually decided that I really do want to subclass dict rather than create a new mapping class, and UserDict still defers to the underlying dict object in some cases instead of using the provided __setitem__.

After reading and re-reading the Python 2.6.4 source (mostly Objects/dictobject.c, but I grepped everywhere else to see where the various methods are used), my understanding is that the following code is sufficient to have my __setitem__ called every time the object is changed, and to otherwise behave exactly like a Python dict:

Peter Hansen's suggestion got me to look more carefully at dictobject.c, and I realised that the update method in my original answer could be simplified a bit, since the built-in dictionary constructor simply calls the built-in update method anyway. So the second update in my answer has been added to the code below (by some helpful person ;-).

class MyUpdateDict(dict):
    def __init__(self, *args, **kwargs):
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        # optional processing here
        super(MyUpdateDict, self).__setitem__(key, value)

    def update(self, *args, **kwargs):
        if args:
            if len(args) > 1:
                raise TypeError("update expected at most 1 arguments, "
                                "got %d" % len(args))
            other = dict(args[0])
            for key in other:
                self[key] = other[key]
        for key in kwargs:
            self[key] = kwargs[key]

    def setdefault(self, key, value=None):
        if key not in self:
            self[key] = value
        return self[key]

I've tested it with this code:

def test_updates(dictish):
    dictish['abc'] = 123
    dictish.update({'def': 234})
    dictish.update(red=1, blue=2)
    dictish.update([('orange', 3), ('green', 4)])
    dictish.update({'hello': 'kitty'}, black='white')
    dictish.update({'yellow': 5}, yellow=6)
    dictish.setdefault('brown', 7)
    dictish.setdefault('pink')
    try:
        # two positional arguments must be rejected, as with a plain dict
        dictish.update({'gold': 8}, [('purple', 9)], silver=10)
    except TypeError:
        pass
    else:
        raise RuntimeError("Error did not occur as planned")

python_dict = dict([('b', 2), ('c', 3)], a=1)
test_updates(python_dict)

my_dict = MyUpdateDict([('b', 2), ('c', 3)], a=1)
test_updates(my_dict)

and it passes. All other implementations I've tried have failed at some point. I'll still accept any answers that show me that I've missed something, but otherwise, I'm ticking the checkmark beside this one in a couple of days, and calling it the right answer :)
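To see why the update and setdefault overrides are needed at all, here is a minimal sketch, assuming Python 3 on CPython. NaiveDict and its seen attribute are hypothetical names used only for illustration; the class overrides nothing but __setitem__ and records which keys actually pass through it.

```python
class NaiveDict(dict):
    """Hypothetical subclass that overrides only __setitem__."""

    def __init__(self, *args, **kwargs):
        self.seen = []  # keys that went through our __setitem__
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        self.seen.append(key)
        super().__setitem__(key, value)


d = NaiveDict(a=1)     # the built-in constructor fills the dict directly
d.update(b=2)          # the built-in update does not call our __setitem__
d.setdefault('c', 3)   # neither does the built-in setdefault
d['e'] = 5             # only plain item assignment reaches the override

print(sorted(d))  # ['a', 'b', 'c', 'e']
print(d.seen)     # ['e']
```

All four keys end up in the dictionary, but only the last one was seen by the override, which is the gap MyUpdateDict closes.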

How to properly subclass dict and override __getitem__ & __setitem__

What you're doing should absolutely work. I tested out your class, and aside from a missing opening parenthesis in your log statements, it works just fine. There are only two things I can think of. First, is the output of your log statement set correctly? You might need to put a logging.basicConfig(level=logging.DEBUG) at the top of your script.

Second, __getitem__ and __setitem__ are only called for [] access. So make sure you only access DictWatch via d[key], rather than through methods such as d.get() or d.setdefault(), which bypass them.
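A small sketch of that second point, assuming Python 3 on CPython. WatchedDict is a hypothetical stand-in for the question's DictWatch, and it records accesses in a list instead of logging so the effect is easy to inspect.

```python
class WatchedDict(dict):
    """Hypothetical stand-in for DictWatch that records [] reads."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.accessed = []  # keys read through __getitem__

    def __getitem__(self, key):
        self.accessed.append(key)
        return super().__getitem__(key)


d = WatchedDict(a=1)
first = d['a']        # goes through the overridden __getitem__
second = d.get('a')   # the built-in get() bypasses it

print(first, second)  # 1 1
print(d.accessed)     # ['a']
```

Both reads return the same value, but only the [] access is recorded.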

Overriding __setitem__ in Python dict subclass doesn't work for setting values of items inside

No, there is not; you'll have to use custom list objects for those and propagate the 'changed' flag up to the parent dictionary or use a central object to track changes.

Once you retrieve an object from a dictionary, the dictionary itself has nothing to do with it anymore; you could store that object in a separate reference and pass that around without the original dictionary being involved, for example.
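A rough sketch of the first approach, assuming Python 3. TrackedList, TrackedDict, on_change and changed are all hypothetical names. It only intercepts item assignment on lists stored directly in the dictionary; other mutators (append, sort, and so on) and deeper nesting would need the same treatment.

```python
class TrackedList(list):
    """Hypothetical list that notifies an owner when an item is assigned."""

    def __init__(self, iterable=(), on_change=None):
        super().__init__(iterable)
        self._on_change = on_change

    def __setitem__(self, index, value):
        super().__setitem__(index, value)
        if self._on_change is not None:
            self._on_change()


class TrackedDict(dict):
    """Hypothetical dict that wraps list values and keeps a 'changed' flag."""

    def __init__(self):
        super().__init__()
        self.changed = False

    def _mark_changed(self):
        self.changed = True

    def __setitem__(self, key, value):
        if isinstance(value, list):
            # wrap plain lists so their item assignments report back to us
            value = TrackedList(value, on_change=self._mark_changed)
        super().__setitem__(key, value)
        self.changed = True


d = TrackedDict()
d['A'] = ['Bob', 'Ann']
d.changed = False   # reset the flag after the initial store
d['A'][1] = 'Sue'   # mutates the stored list, not the dict itself
print(d.changed)    # True
```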

The other option is to always require an explicit set on the object again:

lstobj = yourdict['A']
lstobj[1][1] = 'Sue'
# explicit set to notify `yourdict`
yourdict['A'] = lstobj

How to perfectly override a dict?

You can write an object that behaves like a dict quite easily with ABCs (Abstract Base Classes) from the collections.abc module. It even tells you if you missed a method, so below is the minimal version that shuts the ABC up.

from collections.abc import MutableMapping

class TransformedDict(MutableMapping):
    """A dictionary that applies an arbitrary key-altering
       function before accessing the keys"""

    def __init__(self, *args, **kwargs):
        self.store = dict()
        self.update(dict(*args, **kwargs))  # use the free update to set keys

    def __getitem__(self, key):
        return self.store[self._keytransform(key)]

    def __setitem__(self, key, value):
        self.store[self._keytransform(key)] = value

    def __delitem__(self, key):
        del self.store[self._keytransform(key)]

    def __iter__(self):
        return iter(self.store)

    def __len__(self):
        return len(self.store)

    def _keytransform(self, key):
        return key

You get a few free methods from the ABC:

class MyTransformedDict(TransformedDict):

    def _keytransform(self, key):
        return key.lower()

s = MyTransformedDict([('Test', 'test')])

assert s.get('TEST') is s['test']   # free get
assert 'TeSt' in s                  # free __contains__
                                    # free setdefault, __eq__, and so on

import pickle
# works too since we just use a normal dict
assert pickle.loads(pickle.dumps(s)) == s

I wouldn't subclass dict (or other builtins) directly. It often makes no sense, because what you actually want to do is implement the interface of a dict. And that is exactly what ABCs are for.

Subclass dict __getitem__ while maintaining original class type

How about making your MyDict just a proxy for the dict:

class MyDict(object):
    def __init__(self, data=None):
        # avoid a shared mutable default argument
        self.data = {} if data is None else data

    def __getitem__(self, key):
        if key == 'b':
            print('Found "b"')
        # wrap the result so nested lookups go through MyDict as well
        return MyDict(self.data.__getitem__(key))

    def __setitem__(self, key, value):
        return self.data.__setitem__(key, value)

    def __repr__(self):
        return self.data.__repr__()

    # add more __magic__ methods as you wish

shallow_dict = MyDict({
    'b': 'value'
})

x = shallow_dict['b']

deep_dict = MyDict({
    'a': {
        'b': 'value'
    }
})

x = deep_dict['a']['b']

# assignment
deep_dict['a']['a'] = {'b': 'here'}
deep_dict['a']['a']['b']
print(deep_dict)

OUTPUT:

Found "b"
Found "b"
Found "b"
{'a': {'b': 'value', 'a': {'b': 'here'}}}

As you can see, __getitem__ passes the result of self.data.__getitem__ by reference to a new MyDict object, so assignments made through the wrapper still reach the original nested dict.

Overriding dict.update() method in subclass to prevent overwriting dict keys

Note that, per the documentation:

  • dict.update takes a single other parameter, "either another dictionary object or an iterable of key/value pairs" (I've used collections.abc.Mapping to test for this) and "If keyword arguments are specified, the dictionary is then updated with those key/value pairs"; and
  • dict() takes a single Mapping or Iterable along with optional **kwargs (the same as update accepts...).

This is not quite the interface you have implemented, which is leading to some issues. I would have implemented this as follows:

from collections.abc import Mapping  # plain `collections.Mapping` was removed in Python 3.10

class DuplicateKeyError(KeyError):
    pass

class UniqueKeyDict(dict):

    def __init__(self, other=None, **kwargs):
        super().__init__()
        self.update(other, **kwargs)

    def __setitem__(self, key, value):
        if key in self:
            msg = 'key {!r} already exists with value {!r}'
            raise DuplicateKeyError(msg.format(key, self[key]))
        super().__setitem__(key, value)

    def update(self, other=None, **kwargs):
        if other is not None:
            for k, v in other.items() if isinstance(other, Mapping) else other:
                self[k] = v
        for k, v in kwargs.items():
            self[k] = v

In use:

>>> UniqueKeyDict((k, v) for k, v in ('a1', 'b2', 'c3', 'd4'))
{'c': '3', 'd': '4', 'a': '1', 'b': '2'}
>>> UniqueKeyDict((k, v) for k, v in ('a1', 'b2', 'c3', 'a4'))
Traceback (most recent call last):
  File "<pyshell#8>", line 1, in <module>
    UniqueKeyDict((k, v) for k, v in ('a1', 'b2', 'c3', 'a4'))
  File "<pyshell#7>", line 5, in __init__
    self.update(other, **kwargs)
  File "<pyshell#7>", line 15, in update
    self[k] = v
  File "<pyshell#7>", line 10, in __setitem__
    raise DuplicateKeyError(msg.format(key, self[key]))
DuplicateKeyError: "key 'a' already exists with value '1'"

and:

>>> ukd = UniqueKeyDict((k, v) for k, v in ('a1', 'b2', 'c3', 'd4'))
>>> ukd.update((k, v) for k, v in ('e5', 'f6')) # single Iterable
>>> ukd.update({'h': 8}, g='7') # single Mapping plus keyword args
>>> ukd
{'e': '5', 'f': '6', 'a': '1', 'd': '4', 'c': '3', 'h': 8, 'b': '2', 'g': '7'}

If you ever end up using this, I'd be inclined to give it a different __repr__ to avoid confusion!
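One possible shape for such a __repr__, as a sketch only and assuming Python 3. To keep the snippet short and self-contained, the class here is a cut-down stand-in that defines nothing but __repr__; in practice the method would be added to the full UniqueKeyDict above.

```python
class UniqueKeyDict(dict):
    """Cut-down stand-in: only the suggested __repr__ is shown here."""

    def __repr__(self):
        # prefix the usual dict repr with the class name
        return '{}({})'.format(type(self).__name__, super().__repr__())


print(repr(UniqueKeyDict(a='1')))  # UniqueKeyDict({'a': '1'})
```

With the class name in the output, nobody reading a log or a REPL session will mistake the object for a plain dict that silently accepts duplicate keys.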

How to dynamically override __setitem__? (no subclass)

For old-style classes, special methods were looked up on the instance each time they were needed. New-style classes only look up special methods on the type of the instance, not in the instance's dictionary itself -- that's why you see the behaviour you see.

(In case you don't know -- a new-style class is a class directly or indirectly derived from object.)
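A short sketch of that lookup rule, assuming Python 3, where every class is new-style. Plain, log and patched are hypothetical names chosen for the illustration.

```python
class Plain:
    """Hypothetical new-style class with an ordinary __setitem__."""

    def __init__(self):
        self.log = []

    def __setitem__(self, key, value):
        self.log.append(('class', key, value))


obj = Plain()

# Storing a function on the instance is allowed, but obj[key] = value
# never consults the instance dictionary, so it is ignored.
obj.__setitem__ = lambda key, value: obj.log.append(('instance', key, value))
obj['x'] = 1

# Replacing the method on the type does take effect.
def patched(self, key, value):
    self.log.append(('patched', key, value))

Plain.__setitem__ = patched
obj['y'] = 2

print(obj.log)  # [('class', 'x', 1), ('patched', 'y', 2)]
```

Note that patching the type changes the behaviour of every instance of that class, not just the one you had in mind.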
