Making Object JSON Serializable with Regular Encoder

As I said in a comment to your question, after looking at the json module's source code, it does not appear to lend itself to doing what you want. However, the goal could be achieved by what is known as monkey-patching
(see the question What is a monkey patch?).
This could be done in your package's __init__.py initialization script, and it would affect all subsequent json module serialization, since modules are generally only loaded once and the result is cached in sys.modules.
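For example, assuming a hypothetical package named mypackage that contains the patch module shown below, the package initializer only needs to import it for its side effect:

# mypackage/__init__.py  (hypothetical package layout)
# Importing the patch module once applies it for the whole process,
# because Python caches imported modules in sys.modules.
from . import make_json_serializable  # imported only for its side effect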

The patch changes the default json encoder's default method—the default default().

Here's an example implemented as a standalone module for simplicity's sake:

Module: make_json_serializable.py

""" Module that monkey-patches json module when it's imported so
JSONEncoder.default() automatically checks for a special "to_json()"
method and uses it to encode the object if found.
"""
from json import JSONEncoder

def _default(self, obj):
    return getattr(obj.__class__, "to_json", _default.default)(obj)

_default.default = JSONEncoder.default  # Save unmodified default.
JSONEncoder.default = _default  # Replace it.

Using it is trivial since the patch is applied by simply importing the module.

Sample client script:

import json
import make_json_serializable # apply monkey-patch

class Foo(object):
    def __init__(self, name):
        self.name = name

    def to_json(self):  # New special method.
        """ Convert to JSON format string representation. """
        return '{"name": "%s"}' % self.name

foo = Foo('sazpaz')
print(json.dumps(foo))  # -> "{\"name\": \"sazpaz\"}"

To retain the object type information, the special method can also include it in the string returned:

        return ('{"type": "%s", "name": "%s"}' %
(self.__class__.__name__, self.name))

Which produces the following JSON that now includes the class name:

"{\"type\": \"Foo\", \"name\": \"sazpaz\"}"

Magick Lies Here

Even better than having the replacement default() look for a specially named method would be for it to serialize most Python objects automatically, including instances of user-defined classes, without needing to add a special method at all. After researching a number of alternatives, the following approach (based on an answer by @Raymond Hettinger to another question), which uses the pickle module, seemed closest to that ideal to me:

Module: make_json_serializable2.py

""" Module that imports the json module and monkey-patches it so
JSONEncoder.default() automatically pickles any Python objects
encountered that aren't standard JSON data types.
"""
from json import JSONEncoder
import pickle

def _default(self, obj):
    return {'_python_object': pickle.dumps(obj)}

JSONEncoder.default = _default # Replace with the above.

Of course, not everything can be pickled (extension types, for example). However there are ways to handle them via the pickle protocol by writing special methods, similar to what you suggested and I described earlier, but doing so would likely be necessary in far fewer cases.
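As a rough illustration of those pickle-protocol hooks (using a hypothetical Counter class, not an actual extension type), an object holding something unpicklable can define __getstate__() and __setstate__():

import pickle
import threading

class Counter:  # Hypothetical example class.
    """Holds a threading.Lock, which pickle cannot serialize directly."""
    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()  # Unpicklable attribute.

    def __getstate__(self):
        # Drop the lock from the pickled state...
        state = self.__dict__.copy()
        del state['_lock']
        return state

    def __setstate__(self, state):
        # ...and recreate it when unpickling.
        self.__dict__.update(state)
        self._lock = threading.Lock()

assert pickle.loads(pickle.dumps(Counter(3))).value == 3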

Deserializing

Regardless, using the pickle protocol also means it would be fairly easy to reconstruct the original Python object by supplying a custom object_hook function argument on any json.loads() calls, one that checks whether the dictionary passed in has a '_python_object' key. Something like:

def as_python_object(dct):
    try:
        return pickle.loads(str(dct['_python_object']))
    except KeyError:
        return dct

pyobj = json.loads(json_str, object_hook=as_python_object)

If this has to be done in many places, it might be worthwhile to define a wrapper function that automatically supplies the extra keyword argument:

import functools

json_pkloads = functools.partial(json.loads, object_hook=as_python_object)

pyobj = json_pkloads(json_str)

Naturally, this could be monkey-patched into the json module as well, making the function the default object_hook (instead of None).
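A minimal sketch of what that patch might look like, assuming the as_python_object() function defined above is importable:

import functools
import json

# Make as_python_object() the default object_hook for json.loads().
# A caller can still pass an explicit object_hook, because keyword
# arguments supplied at call time override those bound by partial().
json.loads = functools.partial(json.loads, object_hook=as_python_object)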

I got the idea of using pickle from an answer by Raymond Hettinger to another JSON serialization question; I consider him exceptionally credible as well as an official source (he is a Python core developer).

Portability to Python 3

The code above does not work as shown in Python 3 because pickle.dumps() returns a bytes object, which the JSONEncoder can't handle. However the approach is still valid. A simple way to work around the issue is to latin1 "decode" the value returned from pickle.dumps() and then latin1 "encode" it again before passing it on to pickle.loads() in the as_python_object() function. This works because arbitrary binary strings are valid latin1, so they can always be decoded to Unicode and then encoded back to the original byte string again (as pointed out in this answer by Sven Marnach).

(Although the following also works in Python 2, the latin1 decoding and encoding it does is superfluous there.)

import json
import pickle
from decimal import Decimal

class PythonObjectEncoder(json.JSONEncoder):
    def default(self, obj):
        return {'_python_object': pickle.dumps(obj).decode('latin1')}

def as_python_object(dct):
    try:
        return pickle.loads(dct['_python_object'].encode('latin1'))
    except KeyError:
        return dct

class Foo(object):  # Some user-defined class.
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if type(other) is type(self):  # Instances of same class?
            return self.name == other.name
        return NotImplemented

    __hash__ = None

data = [1, 2, 3, set(['knights', 'who', 'say', 'ni']), {'key': 'value'},
        Foo('Bar'), Decimal('3.141592653589793238462643383279502884197169')]
j = json.dumps(data, cls=PythonObjectEncoder, indent=4)
data2 = json.loads(j, object_hook=as_python_object)
assert data == data2  # Both should be the same.

How to make a class JSON serializable

Do you have an idea about the expected output? For example, will this do?

>>> f  = FileItem("/foo/bar")
>>> magic(f)
'{"fname": "/foo/bar"}'

In that case you can merely call json.dumps(f.__dict__).
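For instance, assuming FileItem simply stores the file name it is given (the class itself comes from the question):

import json

class FileItem:
    def __init__(self, fname):
        self.fname = fname

f = FileItem("/foo/bar")
print(json.dumps(f.__dict__))  # -> {"fname": "/foo/bar"}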

If you want more customized output then you will have to subclass JSONEncoder and implement your own custom serialization.

For a trivial example, see below.

>>> from json import JSONEncoder
>>> class MyEncoder(JSONEncoder):
...     def default(self, o):
...         return o.__dict__
...
>>> MyEncoder().encode(f)
'{"fname": "/foo/bar"}'

Then you pass this class to the json.dumps() method as the cls kwarg:

json.dumps(f, cls=MyEncoder)

If you also want to decode then you'll have to supply a custom object_hook to the JSONDecoder class. For example:

>>> from json import JSONDecoder
>>> def from_json(json_object):
...     if 'fname' in json_object:
...         return FileItem(json_object['fname'])
...
>>> f = JSONDecoder(object_hook=from_json).decode('{"fname": "/foo/bar"}')
>>> f
<__main__.FileItem object at 0x9337fac>
>>>

How to JSON serialize sets?

JSON notation has only a handful of native datatypes (objects, arrays, strings, numbers, booleans, and null), so anything serialized in JSON needs to be expressed as one of these types.

As shown in the json module docs, this conversion can be done automatically by a JSONEncoder and JSONDecoder, but then you would be giving up some other structure you might need (if you convert sets to lists, then you lose the ability to tell them apart from regular lists; if you convert sets to dictionaries using dict.fromkeys(s), then you lose the ability to tell them apart from regular dictionaries).
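For example, a quick illustration of that information loss: a set encoded as a JSON array comes back as a plain list.

import json

s = {'knights', 'who', 'say', 'ni'}
j = json.dumps(list(s))        # encode the set as a JSON array
print(type(json.loads(j)))     # -> <class 'list'>  (the "set-ness" is gone)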

A more sophisticated solution is to build out a custom type that can coexist with the other native JSON types. This lets you store nested structures that include lists, sets, dicts, decimals, datetime objects, and so on:

from json import dumps, loads, JSONEncoder, JSONDecoder
import pickle

class PythonObjectEncoder(JSONEncoder):
    def default(self, obj):
        try:
            return {'_python_object': pickle.dumps(obj).decode('latin-1')}
        except pickle.PickleError:
            return super().default(obj)

def as_python_object(dct):
    if '_python_object' in dct:
        return pickle.loads(dct['_python_object'].encode('latin-1'))
    return dct

Here is a sample session showing that it can handle lists, dicts, and sets:

>>> from decimal import Decimal
>>> data = [1, 2, 3, set(['knights', 'who', 'say', 'ni']), {'key': 'value'}, Decimal('3.14')]
>>> j = dumps(data, cls=PythonObjectEncoder)
>>> loads(j, object_hook=as_python_object)
[1, 2, 3, {'knights', 'say', 'who', 'ni'}, {'key': 'value'}, Decimal('3.14')]

Alternatively, it may be useful to use a more general purpose serialization technique such as YAML, Twisted Jelly, or Python's pickle module. These each support a much greater range of datatypes.

Can not encode a dict to JSON within a custom encoder

You shouldn't define a method named __dict__: it's a special read-only built-in class attribute, not a method, and you don't need to overload it in order to do what you want.

Here's a modified version of your code that shows how to do things:

import json

class Klass:
    def __init__(self, number):
        self.number = number

    # Don't do this.
    # def __dict__(self):
    #     return {"number": self.number}

class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Klass):
            # obj = obj.__dict__()
            return {"number": obj.number}
        return json.JSONEncoder.default(self, obj)

json.dumps({"number": 10})  # works
json.dumps({"number": 10}, cls=json.JSONEncoder)  # works
json.dumps({"number": 10}, cls=CustomEncoder)  # works
json.dumps(Klass(10).__dict__, cls=CustomEncoder)  # works
json.dumps({"Test": Klass(10).__dict__}, cls=CustomEncoder)  # works

try:
    json.dumps(Klass(10), cls=CustomEncoder)
except TypeError as exc:
    print(exc)

# This is my end goal: to encode a dict of objects.
try:
    json.dumps({"Test": Klass(10)}, cls=CustomEncoder)
except TypeError as exc:
    print(exc)

# This works, but clearly it shows the custom encoder is not doing what I think it does.
encode_hack = {k: v.__dict__ for k, v in {"Test": Klass(10)}.items()}
json.dumps(encode_hack)

Update

An even better way, in my opinion, would be to rename your __dict__() method to something not reserved and have your custom encoder call it. A major advantage is that it's now more object-oriented and generic, in the sense that any class with a method of that name can also be encoded (and you don't have to hardcode a class name like Klass in your encoder).

Here's what I mean:

import json

class Klass:
    def __init__(self, number):
        self.number = number

    def _to_json(self):
        return {"number": self.number}

class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        try:
            return obj._to_json()
        except AttributeError:
            return super().default(obj)

# A single object.
try:
    print(json.dumps(Klass(10), cls=CustomEncoder))  # -> {"number": 10}
except TypeError as exc:
    print(exc)

# A dict of them.
try:
    print(json.dumps({"Test": Klass(42)}, cls=CustomEncoder))  # -> {"Test": {"number": 42}}
except TypeError as exc:
    print(exc)

How to JSON serialize a python built-in class (eg. int)?

There is no way to serialize a Python type (such as int) as a native JSON value. As described in RFC 7159, Section 3, the only available values are:

false / null / true / object / array / number / string

However, you could serialize a Python type as a JSON string. For instance, the Python type int would become the JSON string value "int".

Since the Python built-in int is an object of type type, you can use its __name__ attribute to get its string name, for instance: print(int.__name__).

To encode it automatically, check this answer, which uses a custom JSONEncoder.
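A minimal sketch of such an encoder (my own illustration with a hypothetical TypeNameEncoder class, not the linked answer verbatim):

import json

class TypeNameEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, type):  # e.g. int, str, a user-defined class...
            return obj.__name__    # encode the type as its name string
        return super().default(obj)

print(json.dumps({'kind': int}, cls=TypeNameEncoder))  # -> {"kind": "int"}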

Serializing class instance to JSON

The basic problem is that the JSON encoder json.dumps() only knows how to serialize a limited set of object types by default, all of them built-in; the list is here: https://docs.python.org/3.3/library/json.html#encoders-and-decoders

One good solution is to subclass JSONEncoder and implement its default() method so that it emits the correct JSON for your class.

A simple solution is to call json.dumps() on the .__dict__ member of the instance. That is a standard Python dict, and if your class is simple it will be JSON serializable.

import json

class Foo(object):
    def __init__(self):
        self.x = 1
        self.y = 2

foo = Foo()
s = json.dumps(foo)  # raises TypeError with "is not JSON serializable"

s = json.dumps(foo.__dict__)  # s set to: '{"x": 1, "y": 2}'

The above approach is discussed in this blog posting:

    Serializing arbitrary Python objects to JSON using __dict__

And, of course, Python offers a built-in function that accesses .__dict__ for you, called vars().

So the above example can also be done as:

s = json.dumps(vars(foo)) # s set to: {"x":1, "y":2}

JSON serialization of dictionary with complex objects

Maybe this can be a starting spot for you. The serializer grabs the __dict__ attribute from the object and makes a new dict-of-dicts, then writes it to JSON. The deserializer creates a dummy object, then updates the __dict__ on the way in.

import json

class PlayerElo:
    """
    A class to represent a player in the Elo Rating System
    """
    def __init__(self, name: str, id: str, rating):
        self.id = id
        self.name = name
        self.eloratings = {0: 1500}
        self.elomatches = {0: 0}
        self.initialrating = rating

playersElo = {}  # dictionary of {<int> : <PlayerElo>}
playersElo[1] = PlayerElo('Joe', '123', 999)
playersElo[2] = PlayerElo('Bill', '456', 1999)

def serialize(ratings):
    newdict = {i: j.__dict__ for i, j in ratings.items()}
    json.dump(newdict, open('x.json', 'w'))

def deserialize():
    o = json.load(open('x.json'))
    pe = {}
    for k, v in o.items():
        obj = PlayerElo('0', '0', 0)
        obj.__dict__.update(v)
        pe[int(k)] = obj
    return pe

print(playersElo)
serialize(playersElo)
pe = deserialize()
print(pe)

how to json serialize objects in python

By default, the json module in Python can only handle certain objects, such as dictionaries, lists, and basic types like ints and strings. For more complex types you need to define your own serialization scheme.

Running help(json) shows the module's own example of extending JSONEncoder:

>>> import json
>>> class ComplexEncoder(json.JSONEncoder):
...     def default(self, obj):
...         if isinstance(obj, complex):
...             return [obj.real, obj.imag]
...         return json.JSONEncoder.default(self, obj)
...
>>> json.dumps(2 + 1j, cls=ComplexEncoder)
'[2.0, 1.0]'
>>> ComplexEncoder().encode(2 + 1j)
'[2.0, 1.0]'
>>> list(ComplexEncoder().iterencode(2 + 1j))
['[', '2.0', ', ', '1.0', ']']

