Python's JSON Module Converts Int Dictionary Keys to Strings

This is one of those subtle differences among various mapping collections that can bite you. JSON treats keys as strings; Python supports distinct keys differing only in type.
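A minimal sketch of how this can bite, using only the standard json module (the dictionary contents are made up for illustration): two keys that are distinct in Python end up under the same JSON name, and one value is lost on the way back.

```python
import json

# Two distinct Python keys: the int 1 and the string "1".
d = {1: "int key", "1": "str key"}
assert len(d) == 2

# json.dumps() writes both entries under the same JSON name "1".
encoded = json.dumps(d)
print(encoded)  # {"1": "int key", "1": "str key"}

# json.loads() keeps only the last value for a repeated name.
decoded = json.loads(encoded)
print(decoded)  # {'1': 'str key'}
```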

In Python (and apparently in Lua) the keys to a mapping (dictionary or table, respectively) are object references. In Python they must be hashable: in practice, immutable built-in types or objects that implement a __hash__ method. (The Lua docs suggest that Lua automatically uses the object's ID as a hash/key even for mutable objects, and relies on string interning to ensure that equivalent strings map to the same object.)
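The Python half of that can be checked with a short sketch (the sample keys are arbitrary): hashable objects are accepted as keys, an unhashable list is rejected, and keys that compare equal and hash equal, such as 1 and 1.0, are treated as one key.

```python
# Hashable objects work as keys; a tuple, a str and an int coexist.
d = {(1, 2): "tuple", "1": "str", 1: "int"}

# A list is unhashable, so it is rejected as a key.
try:
    d[[1, 2]] = "list"
    rejected = False
except TypeError as exc:
    rejected = True
    print("rejected:", exc)

# 1 == 1.0 and hash(1) == hash(1.0), so this overwrites the entry for 1.
d[1.0] = "float"
print(d[1])    # float
print(len(d))  # 3
```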

In Perl, JavaScript, awk and many other languages the keys for hashes, associative arrays or whatever they're called in the given language are strings (or "scalars" in Perl). In Perl $foo{1}, $foo{1.0}, and $foo{"1"} all refer to the same entry in %foo: the key is evaluated as a scalar!

JSON started as a JavaScript serialization technology. (JSON stands for JavaScript Object Notation.) Naturally, its mapping notation implements semantics that are consistent with JavaScript's own mapping semantics.

If both ends of your serialization are going to be Python then you'd be better off using pickles. If you really need to convert these back from JSON into native Python objects, you have a couple of choices. First, you could wrap the dictionary look-up in try: ... except: ... and convert the key to a number when the look-up fails. Alternatively, if you can add code to the other end (the serializer or generator of this JSON data), you could have it perform a JSON serialization on each of the key values and provide those as a list of keys. Your Python code would then first iterate over the list of keys, deserializing them into native Python objects, and then use those to access the values in the mapping.
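A related variation on the first idea, sketched under assumptions: instead of retrying at look-up time, convert the keys once while decoding. The object_hook parameter of json.loads() is real; restore_int_keys is a hypothetical helper name, and the sample data is made up.

```python
import json

def restore_int_keys(obj):
    """Hypothetical object_hook: turn keys that parse as integers back into ints."""
    restored = {}
    for key, value in obj.items():
        try:
            restored[int(key)] = value
        except ValueError:
            # Not an integer literal; keep the original string key.
            restored[key] = value
    return restored

original = {1: "one", 2: {3: "three"}, "name": "x"}

# object_hook is called for every decoded JSON object, nested ones included.
round_tripped = json.loads(json.dumps(original), object_hook=restore_int_keys)
print(round_tripped == original)  # True
```

The catch is that this cannot tell a key that was originally the string "1" from one that was the int 1; both come back as the int.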

Why do int keys of a python dict turn into strings when using json.dumps?

The simple reason is that JSON does not allow integer keys.

object
    {}
    { members }
members
    pair
    pair , members
pair
    string : value # Keys *must* be strings.

As to how to get around this limitation - you will first need to ensure that the receiving implementation can handle the technically-invalid JSON. Then you can either replace all of the quote marks or use a custom serializer.

python convert all keys to strings

You'll need to recursively convert all keys. Generate a new dictionary with a dict comprehension; that's much easier than altering the keys in place. You can't add string keys and delete the non-string keys in a dictionary while you are iterating over it: doing so mutates the hash table, which can easily alter the order in which the dictionary keys are listed, so it is not permitted.

You should not forget to handle lists; they too can contain further dictionaries.

Whenever I need to transform a nested structure like this, I use the @functools.singledispatch decorator to split the handling of the different container types into separate functions:

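from functools import singledispatch

@singledispatch
def keys_to_strings(ob):
    # Fallback for anything that is not a dict or a list: return it unchanged.
    return ob

@keys_to_strings.register
def _handle_dict(ob: dict):
    # Build a new dict with str() keys, recursing into the values.
    return {str(k): keys_to_strings(v) for k, v in ob.items()}

@keys_to_strings.register
def _handle_list(ob: list):
    # Lists can contain further dictionaries, so recurse into each element.
    return [keys_to_strings(v) for v in ob]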

Then JSON encode the result of keys_to_strings():

json.dumps(keys_to_strings(a))

Not that any of this is needed. json.dumps() accepts integer keys natively, turning them into strings. Your input example works without transforming:

json.dumps(a)

From the json.dumps() documentation:

Note: Keys in key/value pairs of JSON are always of the type str. When a dictionary is converted into JSON, all the keys of the dictionary are coerced to strings. As a result of this, if a dictionary is converted into JSON and then back into a dictionary, the dictionary may not equal the original one. That is, loads(dumps(x)) != x if x has non-string keys.
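The coercion described in that note can be seen in a small sketch (the dictionary is invented for illustration; 2 is used rather than 1 because True == 1 and the two would collapse into a single Python key):

```python
import json

# int, float, bool and None keys are accepted and coerced to strings.
d = {2: "int", 2.5: "float", True: "bool", None: "none"}
encoded = json.dumps(d)
print(encoded)
# {"2": "int", "2.5": "float", "true": "bool", "null": "none"}

# Decoding yields string keys only, so the round trip is lossy.
decoded = json.loads(encoded)
print(decoded == d)  # False
```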

This only applies to types that JSON could otherwise already handle, so None, booleans, float and int objects. For anything else, you'd still get your exception. You probably have an object whose representation is 0, but it is not a Python int 0:

>>> json.dumps({0: 'works'})
'{"0": "works"}'
>>> import numpy
>>> numpy.int32()
0
>>> json.dumps({numpy.int32(): 'fails'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
TypeError: keys must be a string

I picked a numpy integer type because that's a commonly confused integer value that is not a Python int.

A custom encoder, as you added to your post, won't be used for keys; that only applies to values in dictionaries, so if you have non-standard objects for keys, then you indeed still need to use the above recursive solution.
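To illustrate that point without depending on numpy, here is a sketch that uses tuple keys as a stand-in for "non-standard" key objects. It shows that the default= hook of json.dumps() is not consulted for keys, and then applies a recursive conversion written with plain isinstance checks rather than singledispatch. stringify_keys is a hypothetical name, and the data is made up.

```python
import json

data = {(1, 2): "tuple key", "nested": [{(3, 4): "inner"}]}

# default= is only consulted for unsupported *values*, so this still fails.
try:
    json.dumps(data, default=str)
    failed = False
except TypeError as exc:
    failed = True
    print("failed:", exc)

def stringify_keys(ob):
    """Recursively rebuild dicts and lists, passing every dict key through str()."""
    if isinstance(ob, dict):
        return {str(k): stringify_keys(v) for k, v in ob.items()}
    if isinstance(ob, list):
        return [stringify_keys(v) for v in ob]
    return ob

encoded = json.dumps(stringify_keys(data))
print(encoded)
```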

Convert a python dict to a string and back

The json module is a good solution here. It has the advantage over pickle that it produces only plain-text output, and it is cross-platform and cross-version.

import json
d = {'a': 1}
s = json.dumps(d)   # dict to JSON string
d2 = json.loads(s)  # JSON string back to dict
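For comparison, a small sketch of how the two standard-library modules treat the same dictionary with int keys (the sample data is invented): JSON returns string keys, while pickle returns the original key types, at the cost of a binary, Python-only format that should not be loaded from untrusted sources.

```python
import json
import pickle

d = {1: "one", 2: "two"}

via_json = json.loads(json.dumps(d))
via_pickle = pickle.loads(pickle.dumps(d))

print(via_json)    # keys are now the strings "1" and "2"
print(via_pickle)  # keys are still the ints 1 and 2
```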

python 3 dictionary key to a string and value to another string

Use dict.items():

You can use dict.items() (dict.iteritems() in Python 2). It returns pairs of keys and values, and you can simply pick the first pair.

>>> d = { 'a': 'b' }
>>> key, value = list(d.items())[0]
>>> key
'a'
>>> value
'b'

I converted d.items() to a list and picked index 0. You can also convert it into an iterator and pick its first item using next:

>>> key, value = next(iter(d.items()))
>>> key
'a'
>>> value
'b'

Use dict.keys() and dict.values():

You can also use dict.keys() to retrieve all of the dictionary keys and pick the first one, and dict.values() to do the same with the dictionary values:

>>> key = list(d.keys())[0]
>>> key
'a'
>>> value = list(d.values())[0]
>>> value
'b'

Here, you can use next(iter(...)) too:

>>> key = next(iter(d.keys()))
>>> key
'a'
>>> value = next(iter(d.values()))
>>> value
'b'

Ensure getting a str:

The above methods don't ensure retrieving a string; they return whatever the actual types of the key and value are. You can explicitly convert them to str:

>>> d = {'some_key': 1}
>>> key, value = next((str(k), str(v)) for k, v in d.items())
>>> key
'some_key'
>>> value
'1'
>>> type(key)
<class 'str'>
>>> type(value)
<class 'str'>

Now both key and value are str, although the actual value in the dict was an int.

Disclaimer: These methods pick the first key, value pair of the dictionary if it has multiple pairs, and simply ignore the others. They will NOT work if the dictionary is empty. If you need a solution that simply fails when there are multiple values in the dictionary, @SylvainLeroux's answer is the one you should look for.
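One way to get that stricter behaviour, offered as a sketch and not necessarily what the referenced answer proposes, is sequence unpacking, which demands exactly one pair:

```python
d = {"a": "b"}

# The trailing comma unpacks a one-element iterable; anything else is an error.
(key, value), = d.items()
print(key, value)  # a b

# With zero or several pairs the same statement raises ValueError.
rejected = 0
for bad in ({}, {"a": "b", "c": "d"}):
    try:
        (k, v), = bad.items()
    except ValueError as exc:
        rejected += 1
        print("rejected:", exc)
```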


