How to properly subclass dict and override __getitem__ & __setitem__

What you're doing should absolutely work. I tested out your class, and aside from a missing opening parenthesis in your log statements, it works just fine. There are only two things I can think of. First, is logging configured so that debug output is actually shown? You might need to put a logging.basicConfig(level=logging.DEBUG) at the top of your script.

Second, __getitem__ and __setitem__ are only called during [] accesses. So make sure you only access DictWatch via d[key], rather than through methods such as d.get(), d.setdefault() or d.update() (note that dict has no set() method).
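Both points can be checked with a small sketch. The DictWatch below is a hypothetical stand-in for the class in the question (its body and the calls list are assumptions added for illustration), and the bypass shown for get() is CPython behaviour:

```python
import logging

logging.basicConfig(level=logging.DEBUG)  # without this, debug output is hidden
log = logging.getLogger(__name__)

class DictWatch(dict):
    """Hypothetical stand-in for the class in the question."""

    def __init__(self, *args, **kwargs):
        self.calls = []  # record of overridden accesses, for illustration only
        super().__init__(*args, **kwargs)

    def __getitem__(self, key):
        value = super().__getitem__(key)
        log.debug("GET %r -> %r", key, value)
        self.calls.append(("get", key))
        return value

    def __setitem__(self, key, value):
        log.debug("SET %r = %r", key, value)
        self.calls.append(("set", key))
        super().__setitem__(key, value)

d = DictWatch()
d["a"] = 1      # goes through __setitem__
d["a"]          # goes through __getitem__
d.get("a")      # does not go through __getitem__ in CPython
print(d.calls)
```

Only the two [] accesses should show up in d.calls and in the log.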

Subclass dict __getitem__ while maintaining original class type

How about your MyDict just be a proxy for the dict:

class MyDict(object):
    def __init__(self, data=None):
        # avoid sharing one mutable default dict between instances
        self.data = {} if data is None else data

    def __getitem__(self, key):
        if key == 'b':
            print('Found "b"')
        value = self.data.__getitem__(key)
        # wrap nested dicts only, so plain values come back unchanged
        return MyDict(value) if isinstance(value, dict) else value

    def __setitem__(self, key, value):
        return self.data.__setitem__(key, value)

    def __repr__(self):
        return self.data.__repr__()

    # add more __magic__ methods as you wish

shallow_dict = MyDict({
    'b': 'value'
})

x = shallow_dict['b']

deep_dict = MyDict({
    'a': {
        'b': 'value'
    }
})

x = deep_dict['a']['b']

# assignment
deep_dict['a']['a'] = {'b': 'here'}
deep_dict['a']['a']['b']
print(deep_dict)

OUTPUT:

Found "b"
Found "b"
Found "b"
{'a': {'b': 'value', 'a': {'b': 'here'}}}

As you can see, when __getitem__ fetches a nested dict from self.data, it passes the result of self.data.__getitem__ by reference to a new MyDict object, so writes made through the wrapper reach the original nested dict.

How to perfectly override a dict?

You can write an object that behaves like a dict quite easily with ABCs (Abstract Base Classes) from the collections.abc module. It even tells you if you missed a method, so below is the minimal version that shuts the ABC up.

from collections.abc import MutableMapping

class TransformedDict(MutableMapping):
    """A dictionary that applies an arbitrary key-altering
    function before accessing the keys"""

    def __init__(self, *args, **kwargs):
        self.store = dict()
        self.update(dict(*args, **kwargs))  # use the free update to set keys

    def __getitem__(self, key):
        return self.store[self._keytransform(key)]

    def __setitem__(self, key, value):
        self.store[self._keytransform(key)] = value

    def __delitem__(self, key):
        del self.store[self._keytransform(key)]

    def __iter__(self):
        return iter(self.store)

    def __len__(self):
        return len(self.store)

    def _keytransform(self, key):
        return key

You get a few free methods from the ABC:

class MyTransformedDict(TransformedDict):

    def _keytransform(self, key):
        return key.lower()

s = MyTransformedDict([('Test', 'test')])

assert s.get('TEST') is s['test']   # free get
assert 'TeSt' in s                  # free __contains__
# free setdefault, __eq__, and so on

import pickle
# works too since we just use a normal dict
assert pickle.loads(pickle.dumps(s)) == s

I wouldn't subclass dict (or other builtins) directly. It often makes no sense, because what you actually want to do is implement the interface of a dict. And that is exactly what ABCs are for.
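The remark that the ABC "tells you if you missed a method" can be seen with a deliberately incomplete class. IncompleteDict is a hypothetical name used only for this sketch; the exact wording of the error differs between Python versions, so it is printed rather than quoted:

```python
from collections.abc import MutableMapping

class IncompleteDict(MutableMapping):
    """Hypothetical class that forgets __delitem__, __iter__ and __len__."""

    def __init__(self):
        self.store = {}

    def __getitem__(self, key):
        return self.store[key]

    def __setitem__(self, key, value):
        self.store[key] = value

try:
    IncompleteDict()          # instantiation fails, not class definition
except TypeError as exc:
    error_message = str(exc)  # names the abstract methods still missing
    print(error_message)
```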

Subclassing Python dictionary to override __setitem__

I'm answering my own question, since I eventually decided that I really do want to subclass dict, rather than creating a new mapping class, and UserDict still defers to the underlying dict object in some cases, rather than using the provided __setitem__.

After reading and re-reading the Python 2.6.4 source (mostly Objects/dictobject.c, but I grepped everywhere else to see where the various methods are used), my understanding is that the following code is sufficient to have my __setitem__ called every time that the object is changed, and to otherwise behave exactly as a Python dict:

Peter Hansen's suggestion got me to look more carefully at dictobject.c, and I realised that the update method in my original answer could be simplified a bit, since the built-in dictionary constructor simply calls the built-in update method anyway. So the second update in my answer has been added to the code below (by some helpful person ;-).

class MyUpdateDict(dict):
    def __init__(self, *args, **kwargs):
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        # optional processing here
        super(MyUpdateDict, self).__setitem__(key, value)

    def update(self, *args, **kwargs):
        if args:
            if len(args) > 1:
                raise TypeError("update expected at most 1 arguments, "
                                "got %d" % len(args))
            other = dict(args[0])
            for key in other:
                self[key] = other[key]
        for key in kwargs:
            self[key] = kwargs[key]

    def setdefault(self, key, value=None):
        if key not in self:
            self[key] = value
        return self[key]

I've tested it with this code:

def test_updates(dictish):
    dictish['abc'] = 123
    dictish.update({'def': 234})
    dictish.update(red=1, blue=2)
    dictish.update([('orange', 3), ('green', 4)])
    dictish.update({'hello': 'kitty'}, black='white')
    dictish.update({'yellow': 5}, yellow=6)
    dictish.setdefault('brown', 7)
    dictish.setdefault('pink')
    try:
        # two positional arguments must be rejected, as with a plain dict
        dictish.update({'gold': 8}, [('purple', 9)], silver=10)
    except TypeError:
        pass
    else:
        raise RuntimeError("Error did not occur as planned")

python_dict = dict([('b', 2), ('c', 3)], a=1)
test_updates(python_dict)

my_dict = MyUpdateDict([('b', 2), ('c', 3)], a=1)
test_updates(my_dict)

and it passes. All other implementations I've tried have failed at some point. I'll still accept any answers that show me that I've missed something, but otherwise, I'm ticking the checkmark beside this one in a couple of days, and calling it the right answer :)
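For contrast, here is a minimal sketch of what happens when only __setitem__ is overridden. NaiveDict and its seen list are hypothetical names for illustration, and the bypass described is how CPython's built-in dict behaves:

```python
class NaiveDict(dict):
    """Hypothetical subclass that overrides only __setitem__."""

    def __init__(self, *args, **kwargs):
        self.seen = []  # keys that actually passed through __setitem__
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        self.seen.append(key)
        super().__setitem__(key, value)

naive = NaiveDict(a=1)      # the built-in constructor skips the override
naive.update(b=2)           # so does the built-in update
naive.setdefault('c', 3)    # and the built-in setdefault
naive['d'] = 4              # only this goes through __setitem__
print(naive.seen)
```

All four keys end up stored, but only 'd' is recorded, which is why the answer above reimplements __init__, update and setdefault in terms of self[key] = value.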

Overriding __setitem__ in Python dict subclass doesn't work for setting values of items inside

No, there is not; you'll have to use custom list objects for those and propagate the 'changed' flag up to the parent dictionary or use a central object to track changes.

Once you retrieve an object from a dictionary, the dictionary itself has nothing to do with it anymore; you could store that object in a separate reference and pass that around without the original dictionary being involved, for example.

The other option is to always require an explicit set on the object again:

lstobj = yourdict['A']
lstobj[1][1] = 'Sue'
# explicit set to notify `yourdict`
yourdict['A'] = lstobj
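The first option (custom list objects that propagate a 'changed' flag to the parent) might look roughly like the sketch below. NotifyingList, TrackingDict and the on_change callback are hypothetical names, and only item assignment on one level of nesting is covered; append, extend, deeper nesting and so on would need the same treatment:

```python
class NotifyingList(list):
    """Hypothetical list that reports item assignment to a callback."""

    def __init__(self, iterable=(), on_change=None):
        super().__init__(iterable)
        self._on_change = on_change

    def __setitem__(self, index, value):
        super().__setitem__(index, value)
        if self._on_change is not None:
            self._on_change()

class TrackingDict(dict):
    """Hypothetical dict that records whether it or a child list changed."""

    def __init__(self):
        super().__init__()
        self.changed = False

    def _mark_changed(self):
        self.changed = True

    def __setitem__(self, key, value):
        if isinstance(value, list):
            # wrap plain lists so later mutations are reported back
            value = NotifyingList(value, on_change=self._mark_changed)
        super().__setitem__(key, value)
        self.changed = True

yourdict = TrackingDict()
yourdict['A'] = ['Bob', 'Ann']
yourdict.changed = False    # reset after the initial store
yourdict['A'][1] = 'Sue'    # mutation happens on the child list
print(yourdict.changed)
```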

Subclass dict: UserDict, dict or ABC?

If you want a custom collection that actually holds the data, subclass dict. This is especially useful if you want to extend the interface (e.g., add methods).

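None of dict's built-in methods (get, update, setdefault, the constructor, and so on) are guaranteed to call your custom __getitem__ / __setitem__, though. If you need total control over these, create a custom class that implements the collections.abc.MutableMapping abstract base class instead (in Python 2, it is collections.MutableMapping).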

The ABC does not provide a means to store the actual data, only an interface with default implementations for some methods. These default implementations will, however, call your custom __getitem__ and __setitem__. You will have to use an internal dict to hold the data, and implement all abstract methods: __len__, __iter__, __getitem__, __setitem__, and __delitem__.

The class UserDict from the collections module (in Python 2, the module is called UserDict as well) is a wrapper around an internal dict, implementing the MutableMapping ABC. If you want to customize the behavior of a dict, this implementation could be a starting point.

In summary:

  • MutableMapping defines the interface. Subclass this to create something that acts like a dict. It's totally up to you if and how you store the data.
  • UserDict is an implementation of MutableMapping using an internal "real" dict as storage. If you want a dict-like storage collection but override some methods exposed by dict, this might be a good starting point for you. But make sure to read the code to know how the basic methods are implemented, so that you are consistent when overriding a method.
  • dict is "the real thing". Subclass this if you want to extend the interface. Overriding methods to do custom things might be dangerous, as there are usually multiple ways of accessing the data, and you could end up with an inconsistent API.
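As a rough illustration of the UserDict option, the sketch below overrides only __setitem__ and relies on the inherited methods routing through it. LowerKeyDict is a hypothetical name, and the behaviour shown assumes the Python 3 collections.UserDict:

```python
from collections import UserDict

class LowerKeyDict(UserDict):
    """Hypothetical UserDict subclass that lower-cases string keys on write."""

    def __setitem__(self, key, value):
        if isinstance(key, str):
            key = key.lower()
        super().__setitem__(key, value)

d = LowerKeyDict({'Alpha': 1})   # the constructor calls update, which uses __setitem__
d.update(BETA=2)                 # inherited update also uses __setitem__
d.setdefault('Gamma', 3)         # as does inherited setdefault
print(sorted(d.data))            # d.data is the internal "real" dict
```

Doing the same with a direct dict subclass would leave the keys from the constructor, update and setdefault untouched.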

kwargs overriding dict subclass

Found the answer here, and there should be a way to make this easier to find:
Does argument unpacking use iteration or item-getting?

Summary:
Inheriting from builtins (dict, list, ...) is problematic, since Python may ignore your Python-level overrides entirely (as it did for me) and use the underlying C implementation. This happens in argument unpacking.

Solution: use the available abstractions. In my example add:

from collections import UserDict  # Python 2: from UserDict import UserDict

and replace all "dict" occurrences in the code with UserDict. The same approach should work for lists via UserList; the standard library has no equivalent wrapper for tuples.
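A small sketch of the difference, assuming CPython: show, UpperDict and UpperUserDict are hypothetical names made up for this illustration.

```python
from collections import UserDict

def show(**kwargs):
    """Return the keyword arguments exactly as the callee receives them."""
    return kwargs

class UpperDict(dict):
    """Hypothetical dict subclass that upper-cases values on read."""
    def __getitem__(self, key):
        return str(super().__getitem__(key)).upper()

class UpperUserDict(UserDict):
    """The same override, built on UserDict instead."""
    def __getitem__(self, key):
        return str(super().__getitem__(key)).upper()

# ** unpacking copies a dict subclass at the C level, skipping __getitem__
from_dict = show(**UpperDict(name='ada'))

# UserDict is not a dict, so unpacking goes through keys() and __getitem__
from_userdict = show(**UpperUserDict(name='ada'))

print(from_dict, from_userdict)
```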


