Python's json module converts int dictionary keys to strings
This is one of those subtle differences among various mapping collections that can bite you. JSON treats keys as strings; Python supports distinct keys differing only in type.
In Python (and apparently in Lua) the keys to a mapping (dictionary or table, respectively) are object references. In Python they must be hashable: typically immutable built-in types, or objects which implement a __hash__ method. (The Lua docs suggest that it automatically uses the object's ID as a hash/key even for mutable objects and relies on string interning to ensure that equivalent strings map to the same objects.)
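A minimal sketch of that difference, using only the standard library (the variable names are illustrative, not from the original question): a Python dict keeps 1 and "1" apart, while a JSON round trip collapses them into one string key.

```python
import json

# In Python, the int 1 and the str "1" compare unequal,
# so they are two distinct keys in the same dict.
d = {1: "int key", "1": "str key"}
distinct_keys = len(d)

# json.dumps() writes both keys as the string "1" ...
encoded = json.dumps(d)

# ... and json.loads() keeps only the last value seen for a repeated key.
decoded = json.loads(encoded)

print(distinct_keys)
print(encoded)
print(decoded)
```

Note that this only concerns keys that differ in type but look alike as text; 1, 1.0 and True already count as the same key inside a Python dict because they compare equal.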
In Perl, JavaScript, awk and many other languages the keys for hashes, associative arrays or whatever they're called for the given language, are strings (or "scalars" in Perl). In Perl $foo{1}, $foo{1.0}, and $foo{"1"} are all references to the same mapping in %foo: the key is evaluated as a scalar!
JSON started as a JavaScript serialization technology. (JSON stands for JavaScript Object Notation.) Naturally, its mapping notation implements semantics that are consistent with JavaScript's own mapping semantics.
If both ends of your serialization are going to be Python then you'd be better off using pickles. If you really need to convert these back from JSON into native Python objects I guess you have a couple of choices. First, you could try (try: ... except: ...) to convert any key to a number in the event of a dictionary look-up failure. Alternatively, if you add code to the other end (the serializer or generator of this JSON data) then you could have it perform a JSON serialization on each of the key values, providing those as a list of keys. (Then your Python code would first iterate over the list of keys, instantiating/deserializing them into native Python objects, and then use those to access the values out of the mapping.)
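A variation on the first idea, sketched under assumptions: instead of retrying at look-up time, the conversion can be done once while decoding, through the object_pairs_hook parameter of json.loads(). The helper name restore_int_keys is hypothetical, and the approach is only safe if the original data never used integer-looking strings as keys.

```python
import json


def restore_int_keys(pairs):
    """Rebuild a dict from (key, value) pairs, turning integer-looking keys into int."""
    result = {}
    for key, value in pairs:
        try:
            key = int(key)
        except ValueError:
            pass  # not an integer-looking string; keep the str key
        result[key] = value
    return result


original = {1: "one", 2: {"name": "two", 3: "three"}}
encoded = json.dumps(original)

# The hook is called for every JSON object, including nested ones.
decoded = json.loads(encoded, object_pairs_hook=restore_int_keys)

print(decoded == original)
```

The trade-off is that a key that really was the string "3" in the source data would also come back as the int 3, so this is a lossy heuristic rather than a true inverse.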
Why do int keys of a python dict turn into strings when using json.dumps?
The simple reason is that JSON does not allow integer keys.
object
    {}
    { members }
members
    pair
    pair , members
pair
    string : value # Keys *must* be strings.
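As a quick check of that grammar, here is a small sketch using Python's own parser (the sample documents are made up for illustration): an object with a bare integer key is rejected, while the quoted form parses.

```python
import json

# A bare integer key is not valid JSON, so the parser rejects it.
error = None
try:
    json.loads('{1: "a"}')
except json.JSONDecodeError as exc:
    error = exc

# The same member with a string key is accepted.
valid = json.loads('{"1": "a"}')

print(type(error).__name__)
print(valid)
```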
As to how to get around this limitation - you will first need to ensure that the receiving implementation can handle the technically-invalid JSON. Then you can either replace all of the quote marks or use a custom serializer.
python convert all keys to strings
You'll need to recursively convert all keys. Generate a new dictionary with a dict comprehension; that's much easier than altering the keys in place. You can't add string keys and delete the non-string keys in a dictionary you are iterating over, because that mutates the hash table, which can easily alter the order in which the dictionary keys are listed, so this is not permitted.
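A simplified demonstration of that restriction (it only adds keys during the loop, which is enough to trigger the error), followed by the comprehension-based alternative for a flat dictionary. The variable names are illustrative.

```python
# Adding keys while iterating over the same dict makes the iterator fail.
d = {1: "a", 2: "b"}
error = None
try:
    for k in d:
        d[str(k)] = d[k]  # changes the dict's size mid-iteration
except RuntimeError as exc:
    error = exc

# Building a new dict leaves the one being iterated untouched.
source = {1: "a", 2: "b"}
converted = {str(k): v for k, v in source.items()}

print(type(error).__name__)
print(converted)
```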
You should not forget to handle lists; they too can contain further dictionaries.
Whenever I need to transform a nested structure like this, I use the @functools.singledispatch decorator to split out handling for the different container types to different functions:
from functools import singledispatch

@singledispatch
def keys_to_strings(ob):
    return ob

@keys_to_strings.register
def _handle_dict(ob: dict):
    return {str(k): keys_to_strings(v) for k, v in ob.items()}

@keys_to_strings.register
def _handle_list(ob: list):
    return [keys_to_strings(v) for v in ob]
Then JSON encode the result of keys_to_strings():
json.dumps(keys_to_strings(a))
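For a concrete run, here is a self-contained sketch of the same approach applied to made-up sample data with a tuple key, which json.dumps() would reject on its own. It uses the register(type) call form rather than annotations, as an assumption to keep it working on Python versions before 3.7.

```python
import json
from functools import singledispatch


@singledispatch
def keys_to_strings(ob):
    # Fallback: anything that is not a dict or list is returned unchanged.
    return ob


@keys_to_strings.register(dict)
def _handle_dict(ob):
    return {str(k): keys_to_strings(v) for k, v in ob.items()}


@keys_to_strings.register(list)
def _handle_list(ob):
    return [keys_to_strings(v) for v in ob]


# Hypothetical input: int keys, a list of dicts, and a tuple key.
a = {1: [{2: "x"}, {(3, 4): "y"}]}
encoded = json.dumps(keys_to_strings(a))
print(encoded)
```

The tuple key ends up as the string "(3, 4)", because str() is applied to every key regardless of its type.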
Not that all of this is needed. json.dumps() accepts integer keys natively, turning them into strings. Your input example works without transforming:
json.dumps(a)
From the json.dumps() documentation:
Note: Keys in key/value pairs of JSON are always of the type str. When a dictionary is converted into JSON, all the keys of the dictionary are coerced to strings. As a result of this, if a dictionary is converted into JSON and then back into a dictionary, the dictionary may not equal the original one. That is, loads(dumps(x)) != x if x has non-string keys.
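A short sketch of that note in action, with an illustrative dictionary whose keys are an int, a float and None:

```python
import json

x = {1: "int", 2.5: "float", None: "none"}

encoded = json.dumps(x)
round_tripped = json.loads(encoded)

# Every key comes back as a str, so the result no longer equals x.
print(encoded)
print(round_tripped != x)
```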
This only applies to types that JSON could otherwise already handle, so None, booleans, float and int objects. For anything else, you'd still get your exception. You probably have an object whose representation is 0, but it is not a Python int 0:
>>> json.dumps({0: 'works'})
'{"0": "works"}'
>>> import numpy
>>> numpy.int32()
0
>>> json.dumps({numpy.int32(): 'fails'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
TypeError: keys must be a string
I picked a numpy integer type because that's a commonly confused integer value that is not a Python int.
A custom encoder, as you added to your post, won't be used for keys; that only applies to values in dictionaries, so if you have non-standard objects for keys, then you indeed still need to use the above recursive solution.
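To illustrate that last point without depending on numpy, here is a sketch that uses fractions.Fraction as a stand-in for a non-standard type; the fallback function and the calls list are hypothetical names. The default= hook is consulted for the value but never for the key.

```python
import json
from fractions import Fraction

calls = []


def fallback(obj):
    """Record each object the encoder asks us to convert, then stringify it."""
    calls.append(obj)
    return str(obj)


# As a value, the Fraction is passed to the fallback and encoded.
value_result = json.dumps({"v": Fraction(1, 2)}, default=fallback)

# As a key, the fallback is never consulted and encoding fails.
key_error = None
try:
    json.dumps({Fraction(1, 2): "v"}, default=fallback)
except TypeError as exc:
    key_error = exc

print(value_result)
print(len(calls))
print(type(key_error).__name__)
```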
Convert a python dict to a string and back
The json module is a good solution here. It has the advantages over pickle that it only produces plain text output, and is cross-platform and cross-version.
import json
json.dumps(d)  # d is the dictionary to serialize; avoid naming it dict
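A minimal round trip, assuming a dictionary that only contains JSON-compatible values and string keys (the sample data is made up):

```python
import json

d = {"name": "example", "counts": [1, 2, 3], "nested": {"ok": True}}

text = json.dumps(d)        # dict -> str
restored = json.loads(text)  # str -> dict

print(text)
print(restored == d)
```

With non-string keys or values such as tuples, the restored dictionary would not compare equal to the original.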
python 3 dictionary key to a string and value to another string
Use dict.items():
You can use dict.items() (dict.iteritems() for Python 2); it returns pairs of keys and values, and you can simply pick the first one.
>>> d = { 'a': 'b' }
>>> key, value = list(d.items())[0]
>>> key
'a'
>>> value
'b'
I converted d.items() to a list and picked its index 0; you can also convert it into an iterator and pick its first item using next:
>>> key, value = next(iter(d.items()))
>>> key
'a'
>>> value
'b'
Use dict.keys() and dict.values():
You can also use dict.keys() to retrieve all of the dictionary keys and pick the first key, and use dict.values() to retrieve all of the dictionary values:
>>> key = list(d.keys())[0]
>>> key
'a'
>>> value = list(d.values())[0]
>>> value
'b'
Here, you can use next(iter(...)) too:
>>> key = next(iter(d.keys()))
>>> key
'a'
>>> value = next(iter(d.values()))
>>> value
'b'
Ensure getting a str:
The above methods don't ensure retrieving a string; they'll return whatever the actual type of the key and value is. You can explicitly convert them to str:
>>> d = {'some_key': 1}
>>> key, value = next((str(k), str(v)) for k, v in d.items())
>>> key
'some_key'
>>> value
'1'
>>> type(key)
<class 'str'>
>>> type(value)
<class 'str'>
Now both key and value are str, although the actual value in the dict was an int.
Disclaimer: These methods will pick the first key-value pair of the dictionary if it has multiple key-value pairs, and simply ignore the others. They will NOT work if the dictionary is empty. If you need a solution which simply fails if there are multiple values in the dictionary, @SylvainLeroux's answer is the one you should look for.
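One possible way to get that stricter behaviour, offered as a sketch and not necessarily what the referenced answer does: unpack the items into a one-element pattern, which raises ValueError unless the dictionary holds exactly one pair. The helper name only_item is hypothetical.

```python
def only_item(d):
    """Return the single (key, value) pair of d; raise ValueError otherwise."""
    [(key, value)] = d.items()
    return key, value


key, value = only_item({"a": "b"})

# Both an empty dict and a dict with several pairs are rejected.
errors = []
for bad in ({}, {"a": 1, "b": 2}):
    try:
        only_item(bad)
    except ValueError as exc:
        errors.append(type(exc).__name__)

print(key, value)
print(errors)
```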