How to Convert a Nested Python Dict to Object

How to convert a nested Python dict to object?

Update: In Python 2.6 and onwards, consider whether the namedtuple data structure suits your needs:

>>> from collections import namedtuple
>>> MyStruct = namedtuple('MyStruct', 'a b d')
>>> s = MyStruct(a=1, b={'c': 2}, d=['hi'])
>>> s
MyStruct(a=1, b={'c': 2}, d=['hi'])
>>> s.a
1
>>> s.b
{'c': 2}
>>> s.c
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'MyStruct' object has no attribute 'c'
>>> s.d
['hi']

The alternative (original answer contents) is:

class Struct:
    def __init__(self, **entries):
        self.__dict__.update(entries)

Then, you can use:

>>> args = {'a': 1, 'b': 2}
>>> s = Struct(**args)
>>> s
<__main__.Struct instance at 0x01D6A738>
>>> s.a
1
>>> s.b
2
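
Note that neither version above descends into nested dicts (s.b in the namedtuple example is still a plain dict). A minimal sketch of a recursive variant follows; the helper name _wrap and the __repr__ are my own additions, not part of the original answer, and it assumes every key is a string that works as an attribute name:

```python
class Struct:
    """Attribute-style view of a dict; nested dicts and lists are converted too."""

    def __init__(self, **entries):
        for key, value in entries.items():
            setattr(self, key, _wrap(value))

    def __repr__(self):
        return f"Struct({self.__dict__!r})"


def _wrap(value):
    # Hypothetical helper: convert dicts to Struct, recurse into lists,
    # and return every other value unchanged.
    if isinstance(value, dict):
        return Struct(**value)
    if isinstance(value, list):
        return [_wrap(item) for item in value]
    return value


args = {'a': 1, 'b': {'c': 2}, 'd': ['hi', {'e': 3}]}
s = Struct(**args)
print(s.b.c)     # 2
print(s.d[1].e)  # 3
```

Tuples, sets and other containers are left untouched here; extend _wrap if your data needs them.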

What is the most economical way to convert nested Python objects to dictionaries?

I am not sure I understood exactly what you want, but if I did, this function should do it.
It searches recursively through an object's attributes, yielding a nested dictionary + list structure whose end points are Python objects without a __dict__ attribute; in SQLAlchemy's case these are likely to be basic Python types such as numbers and strings. (If that fails, replacing the hasattr(obj, "__dict__") test with something more sensible should fix the code for your needs.)

def my_dict(obj):
    if not hasattr(obj,"__dict__"):
        return obj
    result = {}
    for key, val in obj.__dict__.items():
        if key.startswith("_"):
            continue
        element = []
        if isinstance(val, list):
            for item in val:
                element.append(my_dict(item))
        else:
            element = my_dict(val)
        result[key] = element
    return result
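
To see what the function above produces, here is a small self-contained run. The User and Address classes are made up for illustration (they stand in for whatever mapped objects you actually have), and the function body is repeated only so the snippet runs on its own:

```python
class Address:
    def __init__(self, city):
        self.city = city
        self._cache = None  # leading underscore: skipped by my_dict


class User:
    def __init__(self, name, addresses):
        self.name = name
        self.addresses = addresses


def my_dict(obj):
    # Objects without __dict__ (str, int, ...) are returned unchanged.
    if not hasattr(obj, "__dict__"):
        return obj
    result = {}
    for key, val in obj.__dict__.items():
        if key.startswith("_"):
            continue
        if isinstance(val, list):
            element = [my_dict(item) for item in val]
        else:
            element = my_dict(val)
        result[key] = element
    return result


user = User("Ann", [Address("Oslo"), Address("Rome")])
print(my_dict(user))
# {'name': 'Ann', 'addresses': [{'city': 'Oslo'}, {'city': 'Rome'}]}
```

One thing to keep in mind: a plain dict attribute has no __dict__ either, so it is returned as-is and objects stored inside it are not converted.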

How to convert a nested python dictionary into a simple namespace?

2022 answer: I have published a tiny, relatively fast library called dotwiz, which can alternatively be used to provide easy dot access for a Python dict object.

It should, coincidentally, be a little faster than the other options. Below is a quick-and-dirty benchmark I put together with the timeit module, timing it against both an attrdict and a SimpleNamespace approach; the latter actually performs pretty solidly.

Note that I had to modify the parse function slightly, so that it handles nested dicts within a list object, for example.

from timeit import timeit
from types import SimpleNamespace

from attrdict import AttrDict
from dotwiz import DotWiz

example_input = {'key0a': "test", 'key0b': {'key1a': [{'key2a': 'end', 'key2b': "test"}], 'key1b': "test"},
                 "something": "else"}

def parse(d):
    x = SimpleNamespace()
    _ = [setattr(x, k,
                 parse(v) if isinstance(v, dict)
                 else [parse(e) for e in v] if isinstance(v, list)
                 else v) for k, v in d.items()]
    return x

print('-- Create')
print('attrdict: ', round(timeit('AttrDict(example_input)', globals=globals()), 2))
print('dotwiz: ', round(timeit('DotWiz(example_input)', globals=globals()), 2))
print('SimpleNamespace: ', round(timeit('parse(example_input)', globals=globals()), 2))
print()

dw = DotWiz(example_input)
ns = parse(example_input)
ad = AttrDict(example_input)

print('-- Get')
print('attrdict: ', round(timeit('ad.key0b.key1a[0].key2a', globals=globals()), 2))
print('dotwiz: ', round(timeit('dw.key0b.key1a[0].key2a', globals=globals()), 2))
print('SimpleNamespace: ', round(timeit('ns.key0b.key1a[0].key2a', globals=globals()), 2))
print()

print(ad)
print(dw)
print(ns)

assert ad.key0b.key1a[0].key2a \
    == dw.key0b.key1a[0].key2a \
    == ns.key0b.key1a[0].key2a \
    == 'end'

Here are the results, on my M1 Mac Pro laptop:

-- Create
attrdict: 0.69
dotwiz: 1.3
SimpleNamespace: 1.38

-- Get
attrdict: 6.06
dotwiz: 0.06
SimpleNamespace: 0.06

The dotwiz library can be installed with pip:

$ pip install dotwiz
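
If you would rather not add a dependency, one standard-library option is to round-trip the dict through json with an object_hook. This is only a sketch: the function name to_namespace is my own, it was not part of the benchmark above, and it assumes the data is JSON-serializable (tuples come back as lists, and non-string keys are coerced to strings or rejected):

```python
import json
from types import SimpleNamespace

example_input = {'key0a': "test",
                 'key0b': {'key1a': [{'key2a': 'end', 'key2b': "test"}], 'key1b': "test"},
                 "something": "else"}


def to_namespace(d):
    # object_hook is called for every JSON object that gets decoded,
    # innermost first, so nested dicts (including those inside lists)
    # all become SimpleNamespace instances.
    return json.loads(json.dumps(d), object_hook=lambda obj: SimpleNamespace(**obj))


ns = to_namespace(example_input)
print(ns.key0b.key1a[0].key2a)  # end
```

The extra serialization pass costs time, so measure it yourself before using it on a hot path.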

Complex transforming nested dictionaries into objects in python

How about:

class ComboParser(object):
    def __init__(self,data):
        self.data=data
    def __getattr__(self,key):
        try:
            return ComboParser(self.data[key])
        except TypeError:
            result=[]
            for item in self.data:
                if key in item:
                    try:
                        result.append(item[key])
                    except TypeError: pass
            return ComboParser(result)
    def __getitem__(self,key):
        return ComboParser(self.data[key])
    def __iter__(self):
        if isinstance(self.data,basestring):
            # self.data might be a str or unicode object
            yield self.data
        else:
            # self.data might be a list or tuple
            try:
                for item in self.data:
                    yield item
            except TypeError:
                # self.data might be an int or float
                yield self.data
    def __length_hint__(self):
        return len(self.data)

which yields:

combination = {
    'item1': 3.14,
    'item2': 42,
    'items': [
        'text text text',
        {
            'field1': 'a',
            'field2': 'b',
        },
        {
            'field1': 'c',
            'field2': 'd',
        },
        {
            'field1': 'e',
            'field3': 'f',
        },
    ]
}
print(list(ComboParser(combination).item1))
# [3.1400000000000001]
print(list(ComboParser(combination).items))
# ['text text text', {'field2': 'b', 'field1': 'a'}, {'field2': 'd', 'field1': 'c'}, {'field3': 'f', 'field1': 'e'}]
print(list(ComboParser(combination).items[0]))
# ['text text text']
print(list(ComboParser(combination).items.field1))
# ['a', 'c', 'e']
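
The class above is Python 2 code (basestring no longer exists in Python 3, and the float repr shown is the old one). Here is a rough Python 3 port as a sketch. It is my adaptation, not the original author's code: I replaced basestring with (str, bytes), dropped __length_hint__, and only look up the key in list elements that are dicts:

```python
class ComboParser:
    def __init__(self, data):
        self.data = data

    def __getattr__(self, key):
        try:
            return ComboParser(self.data[key])
        except TypeError:
            # self.data is a list: collect `key` from every dict element.
            result = [item[key] for item in self.data
                      if isinstance(item, dict) and key in item]
            return ComboParser(result)

    def __getitem__(self, key):
        return ComboParser(self.data[key])

    def __iter__(self):
        if isinstance(self.data, (str, bytes)):
            # Yield a string as one item instead of character by character.
            yield self.data
        else:
            try:
                yield from self.data
            except TypeError:
                # self.data is a scalar such as an int or float.
                yield self.data


combination = {
    'item1': 3.14,
    'items': [
        'text text text',
        {'field1': 'a', 'field2': 'b'},
        {'field1': 'c', 'field2': 'd'},
        {'field1': 'e', 'field3': 'f'},
    ],
}
print(list(ComboParser(combination).item1))         # [3.14]
print(list(ComboParser(combination).items[0]))      # ['text text text']
print(list(ComboParser(combination).items.field1))  # ['a', 'c', 'e']
```

As in the original, asking for a key that a dict does not contain raises KeyError rather than AttributeError.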

Converting a nested dict to Python object

Maybe a recursive method like this:

>>> class sample_token:
...     def __init__(self, **response):
...         for k,v in response.items():
...             if isinstance(v,dict):
...                 self.__dict__[k] = sample_token(**v)
...             else:
...                 self.__dict__[k] = v
...
>>> s = sample_token(**response_body)
>>> s.sub
<__main__.sample_token object at 0x02CEA530>
>>> s.sub.cn
'Gandalf Grey'

We go over each key:value pair in the response, and if the value is a dictionary we create a sample_token object for it and store that new object in the instance's __dict__.
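
The session above does not show response_body, so here is a runnable version with a made-up input. Only sub.cn = 'Gandalf Grey' comes from the session; the 'groups' entry is hypothetical and is there to show one limitation, namely that dicts sitting inside a list are not converted:

```python
class sample_token:
    def __init__(self, **response):
        for k, v in response.items():
            if isinstance(v, dict):
                # Nested dict: wrap it in another sample_token.
                self.__dict__[k] = sample_token(**v)
            else:
                self.__dict__[k] = v


# Hypothetical input; only the 'sub' entry is taken from the session above.
response_body = {
    'sub': {'cn': 'Gandalf Grey'},
    'groups': [{'name': 'wizards'}],
}

s = sample_token(**response_body)
print(s.sub.cn)           # Gandalf Grey
print(type(s.groups[0]))  # <class 'dict'>
```

If your responses contain lists of dicts, add an isinstance(v, list) branch that converts each element.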

How do I convert this complex nested Dict into Pandas

Starting with the result dictionary:

result = {'mlcSongCode': 'A6457V',
'primaryTitle': 'AIR FORCE ONES',
'membersSongId': '',
'artists': 'TRACK | NELLY, MURPHY LEE, ALI, KYJUAN, TRACK BOYZ',
'propertyId': None,
'akas': [{'akaId': '', 'akaTitle': '', 'akaTitleTypeCode': ''}],
'writers': [{'writerId': '1083561',
'writerLastName': 'SMITH',
'writerFirstName': 'PREMRO VONZELLAIRE',
'writerIPI': '00232478669',
'writerRoleCode': 'ComposerLyricist',
'chainId': 'PSC_337535223',
'chainParentId': ''},
{'writerId': '1858916',
'writerLastName': 'GOODWIN',
'writerFirstName': 'MARLON',
'writerIPI': '',
'writerRoleCode': 'ComposerLyricist',
'chainId': 'PSC_337535224',
'chainParentId': ''},
{'writerId': '1883205',
'writerLastName': 'HAYNES',
'writerFirstName': 'CORNELL',
'writerIPI': '',
'writerRoleCode': 'ComposerLyricist',
'chainId': 'PSC_337535225',
'chainParentId': ''},
{'writerId': '4733138',
'writerLastName': 'LAVELLE',
'writerFirstName': 'CRUMP',
'writerIPI': '',
'writerRoleCode': 'ComposerLyricist',
'chainId': 'PSC_337535226',
'chainParentId': ''}],
'publishers': [{'publisherId': '910354',
'mlcPublisherNumber': None,
'publisherName': 'TENYOR MUSIC',
'publisherIpiNumber': '00263286262',
'publisherRoleCode': 'OriginalPublisher',
'collectionShare': 16.67,
'chainId': 'PSA_311720187',
'chainParentId': 'PSC_311915511',
'administrators': [],
'parentPublishers': [{'publisherId': '377508',
'mlcPublisherNumber': None,
'publisherName': 'ALL MY PUBLISHING LLC',
'publisherIpiNumber': '',
'publisherRoleCode': 'OriginalPublisher',
'collectionShare': 0,
'chainId': 'PSC_311915511',
'chainParentId': 'PSC_337535223|PSC_337535224|PSC_337535225|PSC_337535226',
'administrators': [],
'parentPublishers': []}]},
{'publisherId': '716372',
'mlcPublisherNumber': None,
'publisherName': 'KOBALT MUSIC PUB AMERICA INC',
'publisherIpiNumber': '00503659557',
'publisherRoleCode': 'SubPublisher',
'collectionShare': 50,
'chainId': 'PSA_365023093',
'chainParentId': 'PSC_337535222',
'administrators': [],
'parentPublishers': [{'publisherId': '631204',
'mlcPublisherNumber': None,
'publisherName': 'TARPO MUSIC PUB.',
'publisherIpiNumber': '00419823444',
'publisherRoleCode': 'OriginalPublisher',
'collectionShare': 0,
'chainId': 'PSC_337535222',
'chainParentId': '',
'administrators': [],
'parentPublishers': []}]}],
'iswc': ''}

Load it into a dataframe:

import pandas as pd
df = pd.json_normalize(result)

This gives a dataframe with each key of result as a column and the corresponding value as the column value. In this case, the columns are: mlcSongCode, primaryTitle, membersSongId, artists, propertyId, akas, writers, publishers, iswc.

Explode the writers column:

df = df.explode('writers').reset_index(drop=True)

This converts each element in the writers array into a row, giving you a dataframe with one row for each 'writer'.

Normalize the writers JSON into a flat table. This takes the JSON for each 'writer' and expands each of its keys into a column, e.g. it will generate columns for 'writerLastName', 'writerFirstName', and so on:

normalized = pd.json_normalize(df['writers'])

Join the normalized dataframe to the original dataframe, and remove the original 'writers' column:

df = df.join(normalized).drop(columns=['writers'])

Then repeat with the other JSON columns (akas, publishers) as needed.
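
As a possible shortcut for the explode / normalize / join steps above, json_normalize also accepts record_path and meta arguments, which produce one row per list element and copy selected top-level keys onto each row. A minimal sketch, using a trimmed-down stand-in for the result dictionary (two writers, two top-level keys) so it runs on its own:

```python
import pandas as pd

# Trimmed-down stand-in for the `result` dictionary above.
result = {
    'mlcSongCode': 'A6457V',
    'primaryTitle': 'AIR FORCE ONES',
    'writers': [
        {'writerId': '1083561', 'writerLastName': 'SMITH'},
        {'writerId': '1858916', 'writerLastName': 'GOODWIN'},
    ],
}

# One row per entry in result['writers']; the meta keys are repeated on each row.
writers = pd.json_normalize(
    result,
    record_path='writers',
    meta=['mlcSongCode', 'primaryTitle'],
)
print(writers)
```

This handles one list column per call, so you would still call it once each for akas and publishers and combine the frames yourself.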

How to add another level to a nested dictionary which has a tuple as a key in Python

Judging from the problem description, the structure of the OP's data is fixed. A more general recursive structure has many limitations, because it must be avoided that the deepest value is both a dictionary and a list. So the recursive scheme is abandoned here and the two loop levels are hard-coded:

def make_nested(mp):
    res = {}
    for k, v in mp.items():
        res[k] = new_val = {}
        for (vk1, vk2), vv in v.items():
            new_val.setdefault(vk1, {})[vk2] = vv
    return res

Test:

>>> mapping
{'a': {('a', 'b'): ['c', 'd'],
('a', 'c'): ['d', 'f', 'g'],
('c', 'k'): ['f', 'h'],
('c', 'j'): [],
('h', 'z'): ['w']}}
>>> make_nested(mapping)
{'a': {'a': {'b': ['c', 'd'], 'c': ['d', 'f', 'g']},
'c': {'k': ['f', 'h'], 'j': []},
'h': {'z': ['w']}}}
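
If the tuple keys could vary in length, an iterative variant is one way to go without full recursion. This is a sketch only: the name nest_tuple_keys is hypothetical, it works on a single flat dict of tuple keys (the inner level of mapping above), and it assumes no key path is both a leaf and a prefix of another key; in that case it may raise TypeError or silently overwrite a value:

```python
def nest_tuple_keys(flat):
    """Turn {('a', 'b', 'c'): value} into {'a': {'b': {'c': value}}}."""
    nested = {}
    for key_path, value in flat.items():
        node = nested
        # Walk (and create) one dict level per key part except the last.
        for part in key_path[:-1]:
            node = node.setdefault(part, {})
        node[key_path[-1]] = value
    return nested


flat = {('a', 'b'): ['c', 'd'], ('a', 'c', 'x'): ['d'], ('h', 'z'): ['w']}
print(nest_tuple_keys(flat))
# {'a': {'b': ['c', 'd'], 'c': {'x': ['d']}}, 'h': {'z': ['w']}}
```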

