Numpy Array Is Not JSON Serializable

NumPy array is not JSON serializable

I regularly "jsonify" np.arrays. Try using the ".tolist()" method on the arrays first, like this:

import json

import numpy as np

a = np.arange(10).reshape(2, 5)  # a 2 by 5 array
b = a.tolist()  # nested lists with the same data and indices
file_path = "/path.json"  # your path variable
with open(file_path, 'w', encoding='utf-8') as f:  # closing the file ensures the data is flushed
    json.dump(b, f, separators=(',', ':'), sort_keys=True, indent=4)  # saves the array in .json format

To "unjsonify" the array, use:

with open(file_path, 'r', encoding='utf-8') as f:
    b_new = json.loads(f.read())
a_new = np.array(b_new)
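As a quick sanity check (reusing the names from the snippets above), the round trip reproduces the original array exactly, since the data are plain integers:

print(np.array_equal(a, a_new))  # True -- the decoded array matches the original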

Object of type 'ndarray' is not JSON serializable

Try converting your ndarray with the tolist() method:

prediction = model.predict(np.array(X).tolist()).tolist()
return jsonify({'prediction': prediction})

Example with the json package:

import json

import numpy as np

a = np.array([1, 2, 3, 4, 5]).tolist()  # plain Python list of ints
json.dumps({"prediction": a})

That should output:

'{"prediction": [1, 2, 3, 4, 5]}'

Why does converting a Numpy array to JSON fail while converting it to a string succeeds?

I'll try. To answer your question, you have to look at how the standard JSON encoder is implemented; it's here: https://github.com/python/cpython/blob/3.10/Lib/json/encoder.py
It could certainly be replaced or extended to encode numpy.array objects, but the standard class's docstring says this:

class JSONEncoder(object):

"""Extensible JSON <http://json.org> encoder for Python data structures.
Supports the following objects and types by default:
+-------------------+---------------+
| Python | JSON |
+===================+===============+
| dict | object |
+-------------------+---------------+
| list, tuple | array |
+-------------------+---------------+
| str | string |
+-------------------+---------------+
| int, float | number |
+-------------------+---------------+
| True | true |
+-------------------+---------------+
| False | false |
+-------------------+---------------+
| None | null |
+-------------------+---------------+
To extend this to recognize other objects, subclass and implement a
``.default()`` method with another method that returns a serializable
object for ``o`` if possible, otherwise it should call the superclass
implementation (to raise ``TypeError``).
"""

Why are only the classes in the left column supported?
I can make a few guesses (in no particular order):

  1. Performance. Underneath (see the function _make_iterencode) there is an iterative if isinstance(...), elif isinstance(...), elif isinstance(...), ... chain. A NumPy array is not a special object that deserves priority over others: numpy.matrix, torch.Tensor, or pandas.DataFrame could also need encoding, as could various classes from collections such as defaultdict, Counter, and namedtuple. Should all of them be added here?
  2. Reversibility. JSON encode/decode operations should ideally be bijective (reversible). How, then, would you mark that a value is a NumPy array? Use a str instead of a list? Add a level of nesting with a JSON object that has a 'type' field saying "numpy.array"? That makes the output ugly and hard to reason about (a sketch of this tagged approach follows this list).
  3. Complexity. NumPy arrays are for math operations, and there is no guarantee about what they may contain. For example, NumPy arrays may hold complex numbers or quaternions, which are not serializable by the Python json module. Even if you made them serializable somehow, there is no guarantee that other complicated types would not appear in NumPy arrays in the future.
  4. The JSON standard. JSON is a standard serialization format used by many programming languages and very common on the web (see https://www.json.org/json-en.html). It has a fixed set of types. Why extend it here, in Python alone? For serializing other objects there are other formats such as pickle, dill, and parquet; using whatever is best suited to a particular problem, instead of messing with a very common format, is probably the better solution.
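For illustration, here is a minimal sketch of the tagged-object approach from point 2 (the 'type' and 'data' field names are my own invention, not any standard):

import json

import numpy as np

def encode_tagged(obj):
    # Wrap ndarrays in a JSON object that records what they were
    if isinstance(obj, np.ndarray):
        return {"type": "numpy.array", "data": obj.tolist()}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

def decode_tagged(d):
    # Undo the wrapping for any object carrying the marker
    if d.get("type") == "numpy.array":
        return np.array(d["data"])
    return d

s = json.dumps({"weights": np.arange(6).reshape(2, 3)}, default=encode_tagged)
restored = json.loads(s, object_hook=decode_tagged)
print(restored["weights"])  # back to a 2x3 ndarray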

Hope this answers your question. Thank you!

Making numpy arrays JSON serializable

That 3 is a NumPy integer that displays like a regular Python int but isn't one. Use tolist() to get a list of ordinary Python ints:

json.dumps(my_array.tolist())

This will also convert multidimensional arrays to nested lists, so you don't have to deal with a 3-layer list comprehension for a 3-dimensional array.
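To see the difference (a sketch; on most 64-bit platforms the default integer scalar type is np.int64):

import json

import numpy as np

my_array = np.array([[1, 2, 3]])
print(type(my_array[0, 0]))           # <class 'numpy.int64'> -- not a Python int
# json.dumps(my_array[0, 0])          # would raise: Object of type int64 is not JSON serializable
print(json.dumps(my_array.tolist()))  # [[1, 2, 3]] -- nested lists of plain ints work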

How to convert dictionary with numpy array to json and back?

NumPy arrays cannot be converted to JSON directly; convert them to lists instead.

import json

import numpy as np

# Test data
d = {
    'chicken': np.random.randn(5),
    'banana': np.random.randn(5),
    'carrots': np.random.randn(5)
}

# To json
j = json.dumps({k: v.tolist() for k, v in d.items()})

# Back to dict
a = {k: np.array(v) for k, v in json.loads(j).items()}

print(a)
print(d)

Output:

{'banana': array([-0.9936452 ,  0.21594978, -0.24991611,  0.99210387, -0.22347124]),
 'carrots': array([-0.7981783 , -1.47840335, -0.00831611,  0.58928124, -0.33779016]),
 'chicken': array([-0.03591249, -0.75118824,  0.58297762,  0.5260574 ,  0.6391851 ])}

{'banana': array([-0.9936452 ,  0.21594978, -0.24991611,  0.99210387, -0.22347124]),
 'carrots': array([-0.7981783 , -1.47840335, -0.00831611,  0.58928124, -0.33779016]),
 'chicken': array([-0.03591249, -0.75118824,  0.58297762,  0.5260574 ,  0.6391851 ])}

ndarray is not JSON serializable TypeError on working script (Tableau Prep - TabPy)

result = pd.DataFrame(prep_input(input).groupby('Store').apply(model1)).reset_index().rename(columns=cnames)
result['Coef'] = result['Coef'].astype('double')  # cast from object dtype so TabPy can serialize it

return result

The Coef column had dtype object; TabPy requires double.
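For context, a minimal sketch of the dtype issue (the values are made up; a groupby().apply() that returns NumPy scalars is a common way to end up with an object column):

import numpy as np
import pandas as pd

# An object-dtype column holding NumPy scalars, as groupby().apply() can produce
result = pd.DataFrame({'Coef': pd.Series([np.float64(0.5), np.float64(1.25)], dtype=object)})
print(result['Coef'].dtype)  # object -- not usable as a numeric column

result['Coef'] = result['Coef'].astype('double')
print(result['Coef'].dtype)  # float64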


