How to convert a nested Python dict to object?
Update: In Python 2.6 and onwards, consider whether the namedtuple
data structure suits your needs:
>>> from collections import namedtuple
>>> MyStruct = namedtuple('MyStruct', 'a b d')
>>> s = MyStruct(a=1, b={'c': 2}, d=['hi'])
>>> s
MyStruct(a=1, b={'c': 2}, d=['hi'])
>>> s.a
1
>>> s.b
{'c': 2}
>>> s.c
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'MyStruct' object has no attribute 'c'
>>> s.d
['hi']
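As the failed s.c lookup shows, namedtuple does not reach into the nested dict. A minimal sketch of a recursive conversion, assuming every key is a valid Python identifier; the helper name to_namedtuple is hypothetical and not part of collections:

```python
from collections import namedtuple


def to_namedtuple(typename, value):
    """Recursively convert dicts (and dicts inside lists) to namedtuples."""
    if isinstance(value, dict):
        # Convert the values first so nested dicts become namedtuples too.
        fields = {key: to_namedtuple(key, item) for key, item in value.items()}
        return namedtuple(typename, list(fields))(**fields)
    if isinstance(value, list):
        return [to_namedtuple(typename, item) for item in value]
    return value  # leaf value: returned unchanged


s = to_namedtuple('MyStruct', {'a': 1, 'b': {'c': 2}, 'd': ['hi']})
print(s.b.c)  # nested attribute access now works
```

The result is immutable, like any namedtuple, so this only suits data that is read rather than modified.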
The alternative (original answer contents) is:
class Struct:
    def __init__(self, **entries):
        self.__dict__.update(entries)
Then, you can use:
>>> args = {'a': 1, 'b': 2}
>>> s = Struct(**args)
>>> s
<__main__.Struct instance at 0x01D6A738>
>>> s.a
1
>>> s.b
2
What is the most economical way to convert nested Python objects to dictionaries?
I am not sure I understood exactly what you want, but if I did, this function should do it:
It searches an object's attributes recursively and yields a nested dictionary and list structure. The end points are Python objects that have no __dict__ attribute, which in SQLAlchemy's case are likely to be basic Python types such as numbers and strings. (If that fails, replacing the hasattr(obj, "__dict__") test with something more sensible should adapt the code to your needs.)
def my_dict(obj):
    if not hasattr(obj, "__dict__"):
        return obj
    result = {}
    for key, val in obj.__dict__.items():
        if key.startswith("_"):
            continue
        element = []
        if isinstance(val, list):
            for item in val:
                element.append(my_dict(item))
        else:
            element = my_dict(val)
        result[key] = element
    return result
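A short usage sketch of the function above, with the function repeated so the example runs on its own. The User and Address classes are hypothetical stand-ins for mapped objects and are not taken from the question:

```python
def my_dict(obj):
    # Same logic as the function above.
    if not hasattr(obj, "__dict__"):
        return obj
    result = {}
    for key, val in obj.__dict__.items():
        if key.startswith("_"):
            continue
        if isinstance(val, list):
            element = [my_dict(item) for item in val]
        else:
            element = my_dict(val)
        result[key] = element
    return result


class Address:  # hypothetical class for illustration
    def __init__(self, city):
        self.city = city
        self._cache = None  # skipped: the name starts with "_"


class User:  # hypothetical class for illustration
    def __init__(self, name, addresses):
        self.name = name
        self.addresses = addresses


user = User("alice", [Address("Oslo"), Address("Lima")])
print(my_dict(user))
```

Only lists are traversed; objects stored inside tuples, sets or dict values are returned as they are, which may or may not be what you need.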
How to convert a nested python dictionary into a simple namespace?
2022 answer: there is now a tiny, relatively fast library I have published, called dotwiz, which can alternatively be used to provide easy dot access for a Python dict object.
It should, coincidentally, be a little faster than the other options. Below is some quick and dirty benchmark code I put together using the timeit module, timing against both an attrdict and a SimpleNamespace approach, the latter of which actually performs pretty well.
Note that I had to modify the parse function slightly so that it handles nested dicts within a list object, for example.
from timeit import timeit
from types import SimpleNamespace
from attrdict import AttrDict
from dotwiz import DotWiz
example_input = {'key0a': "test", 'key0b': {'key1a': [{'key2a': 'end', 'key2b': "test"}], 'key1b': "test"},
                 "something": "else"}
def parse(d):
    x = SimpleNamespace()
    _ = [setattr(x, k,
                 parse(v) if isinstance(v, dict)
                 else [parse(e) for e in v] if isinstance(v, list)
                 else v) for k, v in d.items()]
    return x
print('-- Create')
print('attrdict: ', round(timeit('AttrDict(example_input)', globals=globals()), 2))
print('dotwiz: ', round(timeit('DotWiz(example_input)', globals=globals()), 2))
print('SimpleNamespace: ', round(timeit('parse(example_input)', globals=globals()), 2))
print()
dw = DotWiz(example_input)
ns = parse(example_input)
ad = AttrDict(example_input)
print('-- Get')
print('attrdict: ', round(timeit('ad.key0b.key1a[0].key2a', globals=globals()), 2))
print('dotwiz: ', round(timeit('dw.key0b.key1a[0].key2a', globals=globals()), 2))
print('SimpleNamespace: ', round(timeit('ns.key0b.key1a[0].key2a', globals=globals()), 2))
print()
print(ad)
print(dw)
print(ns)
assert ad.key0b.key1a[0].key2a \
       == dw.key0b.key1a[0].key2a \
       == ns.key0b.key1a[0].key2a \
       == 'end'
Here are the results, on my M1 Mac Pro laptop:
-- Create
attrdict: 0.69
dotwiz: 1.3
SimpleNamespace: 1.38
-- Get
attrdict: 6.06
dotwiz: 0.06
SimpleNamespace: 0.06
The dotwiz library can be installed with pip:
$ pip install dotwiz
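If the nested data starts out as JSON text, the standard library can build the namespaces while parsing, with no third-party package. A minimal sketch, assuming the input is a JSON string (the raw variable is illustrative). json.loads calls object_hook for every decoded JSON object, so dicts inside lists are covered without a separate parse step:

```python
import json
from types import SimpleNamespace

# Illustrative JSON text with the same shape as example_input above.
raw = '{"key0a": "test", "key0b": {"key1a": [{"key2a": "end"}], "key1b": "test"}}'

# Every JSON object is turned into a SimpleNamespace as it is decoded.
ns = json.loads(raw, object_hook=lambda d: SimpleNamespace(**d))
print(ns.key0b.key1a[0].key2a)  # prints: end
```

This was not part of the benchmark above, so no timing claim is made for it.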
Complex transforming nested dictionaries into objects in python
How about:
class ComboParser(object):
    def __init__(self,data):
        self.data=data
    def __getattr__(self,key):
        try:
            return ComboParser(self.data[key])
        except TypeError:
            result=[]
            for item in self.data:
                if key in item:
                    try:
                        result.append(item[key])
                    except TypeError: pass
            return ComboParser(result)
    def __getitem__(self,key):
        return ComboParser(self.data[key])
    def __iter__(self):
        # basestring exists only in Python 2; use str on Python 3
        if isinstance(self.data,basestring):
            # self.data might be a str or unicode object
            yield self.data
        else:
            # self.data might be a list or tuple
            try:
                for item in self.data:
                    yield item
            except TypeError:
                # self.data might be an int or float
                yield self.data
    def __length_hint__(self):
        return len(self.data)
which yields:
combination = {
    'item1': 3.14,
    'item2': 42,
    'items': [
        'text text text',
        {
            'field1': 'a',
            'field2': 'b',
        },
        {
            'field1': 'c',
            'field2': 'd',
        },
        {
            'field1': 'e',
            'field3': 'f',
        },
    ]
}
print(list(ComboParser(combination).item1))
# [3.1400000000000001]
print(list(ComboParser(combination).items))
# ['text text text', {'field2': 'b', 'field1': 'a'}, {'field2': 'd', 'field1': 'c'}, {'field3': 'f', 'field1': 'e'}]
print(list(ComboParser(combination).items[0]))
# ['text text text']
print(list(ComboParser(combination).items.field1))
# ['a', 'c', 'e']
Converting a nested dict to Python object
Maybe a recursive method like this -
>>> class sample_token:
...     def __init__(self, **response):
...         for k,v in response.items():
...             if isinstance(v,dict):
...                 self.__dict__[k] = sample_token(**v)
...             else:
...                 self.__dict__[k] = v
...
>>> s = sample_token(**response_body)
>>> s.sub
<__main__.sample_token object at 0x02CEA530>
>>> s.sub.cn
'Gandalf Grey'
We go over each key:value pair in the response, and if the value is a dictionary we create a sample_token object for it and put that new object in __dict__.
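The session above relies on a response_body that is not shown. Here is a self-contained sketch of the same idea, extended to lists of dictionaries. The class name SampleToken, the _convert helper and the payload are hypothetical; only the sub and cn keys come from the output above:

```python
class SampleToken:
    def __init__(self, **response):
        for key, value in response.items():
            self.__dict__[key] = self._convert(value)

    @classmethod
    def _convert(cls, value):
        if isinstance(value, dict):
            return cls(**value)  # nested dict becomes a nested object
        if isinstance(value, list):
            return [cls._convert(item) for item in value]
        return value  # leaf value: stored unchanged


# Hypothetical payload; the real response_body is not part of the answer.
response_body = {
    "sub": {"cn": "Gandalf Grey"},
    "groups": [{"name": "wizards"}, {"name": "fellowship"}],
}

token = SampleToken(**response_body)
print(token.sub.cn)
print(token.groups[1].name)
```

Because the dictionary is unpacked with **, all keys must be strings, and a key named self would clash with the constructor's own parameter.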
How do I convert this complex nested Dict into Pandas
Start with the result dictionary:
result = {'mlcSongCode': 'A6457V',
'primaryTitle': 'AIR FORCE ONES',
'membersSongId': '',
'artists': 'TRACK | NELLY, MURPHY LEE, ALI, KYJUAN, TRACK BOYZ',
'propertyId': None,
'akas': [{'akaId': '', 'akaTitle': '', 'akaTitleTypeCode': ''}],
'writers': [{'writerId': '1083561',
'writerLastName': 'SMITH',
'writerFirstName': 'PREMRO VONZELLAIRE',
'writerIPI': '00232478669',
'writerRoleCode': 'ComposerLyricist',
'chainId': 'PSC_337535223',
'chainParentId': ''},
{'writerId': '1858916',
'writerLastName': 'GOODWIN',
'writerFirstName': 'MARLON',
'writerIPI': '',
'writerRoleCode': 'ComposerLyricist',
'chainId': 'PSC_337535224',
'chainParentId': ''},
{'writerId': '1883205',
'writerLastName': 'HAYNES',
'writerFirstName': 'CORNELL',
'writerIPI': '',
'writerRoleCode': 'ComposerLyricist',
'chainId': 'PSC_337535225',
'chainParentId': ''},
{'writerId': '4733138',
'writerLastName': 'LAVELLE',
'writerFirstName': 'CRUMP',
'writerIPI': '',
'writerRoleCode': 'ComposerLyricist',
'chainId': 'PSC_337535226',
'chainParentId': ''}],
'publishers': [{'publisherId': '910354',
'mlcPublisherNumber': None,
'publisherName': 'TENYOR MUSIC',
'publisherIpiNumber': '00263286262',
'publisherRoleCode': 'OriginalPublisher',
'collectionShare': 16.67,
'chainId': 'PSA_311720187',
'chainParentId': 'PSC_311915511',
'administrators': [],
'parentPublishers': [{'publisherId': '377508',
'mlcPublisherNumber': None,
'publisherName': 'ALL MY PUBLISHING LLC',
'publisherIpiNumber': '',
'publisherRoleCode': 'OriginalPublisher',
'collectionShare': 0,
'chainId': 'PSC_311915511',
'chainParentId': 'PSC_337535223|PSC_337535224|PSC_337535225|PSC_337535226',
'administrators': [],
'parentPublishers': []}]},
{'publisherId': '716372',
'mlcPublisherNumber': None,
'publisherName': 'KOBALT MUSIC PUB AMERICA INC',
'publisherIpiNumber': '00503659557',
'publisherRoleCode': 'SubPublisher',
'collectionShare': 50,
'chainId': 'PSA_365023093',
'chainParentId': 'PSC_337535222',
'administrators': [],
'parentPublishers': [{'publisherId': '631204',
'mlcPublisherNumber': None,
'publisherName': 'TARPO MUSIC PUB.',
'publisherIpiNumber': '00419823444',
'publisherRoleCode': 'OriginalPublisher',
'collectionShare': 0,
'chainId': 'PSC_337535222',
'chainParentId': '',
'administrators': [],
'parentPublishers': []}]}],
'iswc': ''}
Load it into a dataframe:
import pandas as pd
df = pd.json_normalize(result)
This gives a dataframe with one column per key of result, each holding that key's value. In this case, the columns are mlcSongCode, primaryTitle, membersSongId, artists, propertyId, akas, writers, publishers and iswc.
Explode the writers column:
df = df.explode('writers').reset_index(drop=True)
This converts each element in the writers array into a row, giving you a dataframe with one row for each writer.
Normalize the writers JSON into a flat table. This takes the JSON for each writer and expands each of its keys into a column, e.g. it generates columns for writerLastName, writerFirstName, etc.:
normalized = pd.json_normalize(df['writers'])
Join the normalized dataframe to the original dataframe, and remove the original 'writers' column:
df = df.join(normalized).drop(columns=['writers'])
Then repeat with the other JSON columns (akas, publishers) as needed.
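The explode, normalize and join steps can be wrapped in one helper so that repeating them is a single call. This is a sketch under assumptions: the helper name explode_and_normalize is hypothetical, every list in the column is assumed to be non-empty, and the small record stands in for the full result dictionary. Expanded columns are prefixed with the source column name because writers and publishers both contain keys such as chainId, and a plain join raises an error on overlapping column names:

```python
import pandas as pd


def explode_and_normalize(df, column):
    """Turn a list-of-dicts column into one row per element, one column per key."""
    exploded = df.explode(column).reset_index(drop=True)
    # Expand each dict into columns; prefix them to avoid name collisions.
    normalized = pd.json_normalize(exploded[column].tolist()).add_prefix(column + ".")
    return exploded.drop(columns=[column]).join(normalized)


# Small stand-in for the result dictionary above.
record = {
    "primaryTitle": "AIR FORCE ONES",
    "writers": [
        {"writerId": "1083561", "writerLastName": "SMITH"},
        {"writerId": "1858916", "writerLastName": "GOODWIN"},
    ],
}

df = pd.json_normalize(record)
flat = explode_and_normalize(df, "writers")
print(flat)
```

Calling the helper again for another list column multiplies the rows, so the final frame has one row per combination of elements.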
How to add another level to a nested dictionary which has a tuple as a key in Python
Judging from the problem description, the structure of the OP's data is fixed. A more general recursive approach has its own limitations, because it must ensure that a deepest value never has to be both a dictionary and a list. So the recursive scheme is abandoned here and the two levels of looping are hard-coded:
def make_nested(mp):
    res = {}
    for k, v in mp.items():
        res[k] = new_val = {}
        for (vk1, vk2), vv in v.items():
            new_val.setdefault(vk1, {})[vk2] = vv
    return res
Test:
>>> mapping
{'a': {('a', 'b'): ['c', 'd'],
       ('a', 'c'): ['d', 'f', 'g'],
       ('c', 'k'): ['f', 'h'],
       ('c', 'j'): [],
       ('h', 'z'): ['w']}}
>>> make_nested(mapping)
{'a': {'a': {'b': ['c', 'd'], 'c': ['d', 'f', 'g']},
       'c': {'k': ['f', 'h'], 'j': []},
       'h': {'z': ['w']}}}
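For comparison, here is a minimal sketch of the more general scheme mentioned above. It assumes every key is a non-empty tuple of any length and that no key is a prefix of another (otherwise one node would have to be both a leaf value and a dictionary). The name nest_tuple_keys is hypothetical:

```python
def nest_tuple_keys(flat):
    """Turn {('a', 'b', 'c'): value} into {'a': {'b': {'c': value}}}."""
    nested = {}
    for key_path, value in flat.items():
        node = nested
        # Walk down, creating one dict level per key part except the last.
        for part in key_path[:-1]:
            node = node.setdefault(part, {})
        node[key_path[-1]] = value
    return nested


flat = {('a', 'b'): ['c', 'd'], ('a', 'c'): ['d', 'f', 'g'], ('h', 'z', 'y'): ['w']}
print(nest_tuple_keys(flat))
```

For the two-element keys in the test above, {k: nest_tuple_keys(v) for k, v in mapping.items()} gives the same result as make_nested(mapping).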