Construct Pandas Dataframe from Items in Nested Dictionary

Construct pandas DataFrame from items in nested dictionary

A pandas MultiIndex can be built from a list of tuples, so the most natural approach is to reshape your input dict so that its keys are tuples corresponding to the multi-index values you require. You can then construct the dataframe with pd.DataFrame.from_dict, using the option orient='index':

import pandas as pd

user_dict = {12: {'Category 1': {'att_1': 1, 'att_2': 'whatever'},
                  'Category 2': {'att_1': 23, 'att_2': 'another'}},
             15: {'Category 1': {'att_1': 10, 'att_2': 'foo'},
                  'Category 2': {'att_1': 30, 'att_2': 'bar'}}}

pd.DataFrame.from_dict({(i, j): user_dict[i][j]
                           for i in user_dict.keys()
                           for j in user_dict[i].keys()},
                       orient='index')

               att_1     att_2
12 Category 1      1  whatever
   Category 2     23   another
15 Category 1     10       foo
   Category 2     30       bar
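Once the tuple-keyed frame exists, the two index levels can be named and used for selection. The following is a minimal sketch of that idea; the level names 'user_id' and 'category' are assumptions chosen for illustration and do not come from the original question.

```python
import pandas as pd

user_dict = {12: {'Category 1': {'att_1': 1, 'att_2': 'whatever'},
                  'Category 2': {'att_1': 23, 'att_2': 'another'}},
             15: {'Category 1': {'att_1': 10, 'att_2': 'foo'},
                  'Category 2': {'att_1': 30, 'att_2': 'bar'}}}

# Flatten the two outer levels into tuple keys.
flat = {(uid, cat): attrs
        for uid, cats in user_dict.items()
        for cat, attrs in cats.items()}

df = pd.DataFrame.from_dict(flat, orient='index')

# Hypothetical level names, picked only for this example.
df.index.names = ['user_id', 'category']

# All rows for one user (outer level).
user_12 = df.loc[12]

# One category across all users (inner level).
category_1 = df.xs('Category 1', level='category')

print(user_12)
print(category_1)
```

Naming the levels is optional, but it lets xs and reset_index refer to them by name instead of by position.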

An alternative approach would be to build your dataframe up by concatenating the component dataframes:

user_ids = []
frames = []

for user_id, d in user_dict.items():  # .iteritems() only exists in Python 2
    user_ids.append(user_id)
    frames.append(pd.DataFrame.from_dict(d, orient='index'))

pd.concat(frames, keys=user_ids)

               att_1     att_2
12 Category 1      1  whatever
   Category 2     23   another
15 Category 1     10       foo
   Category 2     30       bar
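The concatenation approach can also be written without the explicit loop, because pd.concat accepts a mapping and uses its keys as the outer index level. This is a sketch of that variant, not a replacement for the loop above:

```python
import pandas as pd

user_dict = {12: {'Category 1': {'att_1': 1, 'att_2': 'whatever'},
                  'Category 2': {'att_1': 23, 'att_2': 'another'}},
             15: {'Category 1': {'att_1': 10, 'att_2': 'foo'},
                  'Category 2': {'att_1': 30, 'att_2': 'bar'}}}

# One small frame per user; the dict keys become the outer index level.
frames = {user_id: pd.DataFrame.from_dict(d, orient='index')
          for user_id, d in user_dict.items()}

df = pd.concat(frames)
print(df)
```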

Construct a pandas DataFrame from items in a nested dictionary with lists as inner values

You could do:

df = pd.DataFrame(
    (
        [subkey, key] + value
        for key, records in annot_dict.items()
        for record in records
        for subkey, value in record.items()
    ),
    columns=[
        'subunit_ID', 'gene_ID', 'start_index', 'end_index', 'strand','biotype', 'desc'
    ]
)

Result for the following annot_dict:

annot_dict = {
    'ID_string1': [
        {'ID_string1': ['attr11a', 'attr11b', 'attr11c', 'attr11d', 'attr11e']},
        {'string12'  : ['attr12a', 'attr12b', 'attr12c', 'attr12d', 'attr12e']},
        {'string13'  : ['attr13a', 'attr13b', 'attr13c', 'attr13d', 'attr13e']},
    ],
    'ID_string2': [
        {'ID_string2': ['attr21a', 'attr21b', 'attr21c', 'attr21d', 'attr21e']},
        {'string22'  : ['attr22a', 'attr22b', 'attr22c', 'attr22d', 'attr22e']},
        {'string23'  : ['attr23a', 'attr23b', 'attr23c', 'attr23d', 'attr23e']},
    ]
}

   subunit_ID     gene_ID start_index end_index   strand  biotype     desc
0  ID_string1  ID_string1     attr11a   attr11b  attr11c  attr11d  attr11e
1    string12  ID_string1     attr12a   attr12b  attr12c  attr12d  attr12e
2    string13  ID_string1     attr13a   attr13b  attr13c  attr13d  attr13e
3  ID_string2  ID_string2     attr21a   attr21b  attr21c  attr21d  attr21e
4    string22  ID_string2     attr22a   attr22b  attr22c  attr22d  attr22e
5    string23  ID_string2     attr23a   attr23b  attr23c  attr23d  attr23e
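If some records might carry the wrong number of attributes, the generator can be unrolled into a loop that checks each list before building the frame. This is a sketch under that assumption; the shortened sample data and the final set_index step are illustrative additions, not part of the original answer.

```python
import pandas as pd

# Shortened, made-up sample in the same shape as annot_dict.
annot_dict = {
    'ID_string1': [
        {'ID_string1': ['a1', 'b1', 'c1', 'd1', 'e1']},
        {'string12': ['a2', 'b2', 'c2', 'd2', 'e2']},
    ],
}

columns = ['subunit_ID', 'gene_ID', 'start_index', 'end_index',
           'strand', 'biotype', 'desc']

rows = []
for gene_id, records in annot_dict.items():
    for record in records:
        for subunit_id, attrs in record.items():
            # Two columns come from the keys, the rest from the list.
            if len(attrs) != len(columns) - 2:
                raise ValueError(f'unexpected attribute count for {subunit_id}')
            rows.append([subunit_id, gene_id, *attrs])

# Optional: index by gene and subunit for hierarchical lookups.
df = pd.DataFrame(rows, columns=columns).set_index(['gene_ID', 'subunit_ID'])
print(df)
```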

How to convert a nested dictionary with lists to a dataframe in this format

You can use stack and explode:

import pandas as pd

nested_dict = { 'Girl': {'June': [45, 32], 'Samantha': [14, 34, 65]},
                'Boy': {'Brad': [12, 54, 12], 'Chad': [12]}}

df = pd.DataFrame.from_dict(nested_dict, orient='index')
print(df.stack().explode())

Output:

Girl  June        45
      June        32
      Samantha    14
      Samantha    34
      Samantha    65
Boy   Brad        12
      Brad        54
      Brad        12
      Chad        12
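The stacked Series can be turned into a flat, long-format DataFrame by naming the index levels and resetting the index. The sketch below assumes the column names 'group', 'name' and 'value', which are made up for illustration. The .dropna() call is a defensive assumption: how stack treats missing cells may differ between pandas versions, and dropping them explicitly keeps the result the same either way.

```python
import pandas as pd

nested_dict = {'Girl': {'June': [45, 32], 'Samantha': [14, 34, 65]},
               'Boy': {'Brad': [12, 54, 12], 'Chad': [12]}}

wide = pd.DataFrame.from_dict(nested_dict, orient='index')

long = (
    wide.stack()                           # (group, name) -> list
        .dropna()                          # drop empty cells, if any survived
        .explode()                         # one row per list element
        .rename_axis(['group', 'name'])    # hypothetical level names
        .reset_index(name='value')         # index levels become columns
)
print(long)
```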

How to create a pandas dataframe from a nested dictionary with lists of dictionaries?

One option is to merge each list of dicts into a single dict, then build the DataFrame with DataFrame.from_dict:

import pandas as pd
from collections import ChainMap

dictionary = {'user1': [{'product1': 10}, {'product2': 15}, {'product3': 20}],
              'user2': [{'product1': 13}, {'product2': 8}, {'product3': 50}]}

df = pd.DataFrame.from_dict(
    {k: dict(ChainMap(*v)) for k, v in dictionary.items()},
    orient='index'
)

df:

       product3  product2  product1
user1        20        15        10
user2        50         8        13

Optional alphanumeric sort with natsort:

from natsort import natsorted

df = df.reindex(columns=natsorted(df.columns))

       product1  product2  product3
user1        10        15        20
user2        13         8        50

For reference, the dict comprehension merges each user's list of dicts into one dict:

{k: dict(ChainMap(*v)) for k, v in dictionary.items()}

{'user1': {'product3': 20, 'product2': 15, 'product1': 10},
 'user2': {'product3': 50, 'product2': 8, 'product1': 13}}
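ChainMap lists the later dicts first, which is why the columns come out reversed. If the original order matters and natsort is not available, a plain nested comprehension is one alternative. This is a sketch of that swap, not the answer's original method:

```python
import pandas as pd

dictionary = {'user1': [{'product1': 10}, {'product2': 15}, {'product3': 20}],
              'user2': [{'product1': 13}, {'product2': 8}, {'product3': 50}]}

# Merge each list of single-item dicts, keeping list order.
merged = {user: {product: qty for d in dicts for product, qty in d.items()}
          for user, dicts in dictionary.items()}

df = pd.DataFrame.from_dict(merged, orient='index')
print(df)
```

If two dicts in the same list share a key, the later one wins here, whereas ChainMap keeps the earlier one.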

Python Pandas: Convert nested dictionary to dataframe

Use DataFrame.from_dict() with the keyword argument orient='index'.

Example:

In [20]: d = {1 : {'tp': 26, 'fp': 112},
   ....: 2 : {'tp': 26, 'fp': 91},
   ....: 3 : {'tp': 23, 'fp': 74}}

In [24]: df = pd.DataFrame.from_dict(d, orient='index')

In [25]: df
Out[25]:
   tp   fp
1  26  112
2  26   91
3  23   74

If you also want to name the index, set df.index.name. Example:

In [30]: df.index.name = 't'

In [31]: df
Out[31]:
   tp   fp
t
1  26  112
2  26   91
3  23   74
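The index can also be named in the same chained expression with rename_axis, and turned into a regular column with reset_index. A minimal sketch, reusing the name 't' from the example above:

```python
import pandas as pd

d = {1: {'tp': 26, 'fp': 112},
     2: {'tp': 26, 'fp': 91},
     3: {'tp': 23, 'fp': 74}}

# Name the index while building the frame.
df = pd.DataFrame.from_dict(d, orient='index').rename_axis('t')

# Move the named index into an ordinary column.
flat = df.reset_index()

print(df)
print(flat)
```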

How to convert a nested dict, to a pandas dataframe

Loading a JSON/dict:

Use pd.json_normalize to expand the dict.

import pandas as pd

data = {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}

df = pd.json_normalize(data)

# display(df)
        id data.name data.lastname  data.office.num data.office.department
0  3241234     carol       netflik             3543                  trigy
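The column names and the flattening depth can be adjusted with the sep and max_level parameters of pd.json_normalize. The sketch below shows both on the same data; the choice of '_' as separator is only an example.

```python
import pandas as pd

data = {'id': 3241234,
        'data': {'name': 'carol', 'lastname': 'netflik',
                 'office': {'num': 3543, 'department': 'trigy'}}}

# Join nested keys with an underscore instead of the default dot.
underscored = pd.json_normalize(data, sep='_')

# Flatten only one level; deeper dicts stay as dict values.
shallow = pd.json_normalize(data, max_level=1)

print(list(underscored.columns))
print(list(shallow.columns))
```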

If the dataframe has column of `dicts`

See also the Stack Overflow question "Split / Explode a column of dictionaries into separate columns with pandas".

# dataframe with column of dicts
df = pd.DataFrame({'col2': [1, 2, 3], 'col': [data, data, data]})

# display(df)
   col2                                                                                                                col
0     1  {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}
1     2  {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}
2     3  {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}

# normalize the column of dicts
normalized = pd.json_normalize(df['col'])

# join the normalized column to df
df = df.join(normalized).drop(columns=['col'])

# display(df)
   col2       id data.name data.lastname  data.office.num data.office.department
0     1  3241234     carol       netflik             3543                  trigy
1     2  3241234     carol       netflik             3543                  trigy
2     3  3241234     carol       netflik             3543                  trigy

If the dataframe has a column of `lists` with `dicts`

The dicts first need to be taken out of the lists with .explode, so that each row holds a single dict.

data = [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]

df = pd.DataFrame({'col2': [1, 2, 3], 'col': [data, data, data]})

# display(df)
   col2                                                                                                                  col
0     1  [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]
1     2  [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]
2     3  [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]

# explode the lists
df = df.explode('col', ignore_index=True)

# remove and normalize the column of dicts
normalized = pd.json_normalize(df.pop('col'))

# join the normalized column to df
df = df.join(normalized)
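The same explode-then-normalize steps also handle lists of different lengths, since explode produces one row per list element. This is a small sketch with made-up, shortened records to show the resulting shape:

```python
import pandas as pd

# Shortened, made-up record for illustration.
record = {'id': 1, 'data': {'name': 'carol', 'office': {'num': 3543}}}

df = pd.DataFrame({'col2': [1, 2], 'col': [[record], [record, record]]})

# One row per dict; ignore_index gives a clean 0..n-1 index for the join.
df = df.explode('col', ignore_index=True)

# Remove the dict column and flatten it into separate columns.
normalized = pd.json_normalize(df.pop('col'))

df = df.join(normalized)
print(df)
```

The second input row held two dicts, so its col2 value appears twice in the result.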

Pandas Dataframe from nested dictionary of pandas dataframes

The idea is to create tuples from both levels of keys and pass the resulting dict to concat. The third level of the MultiIndex comes from the index values of the original DataFrames; you can remove it if it is not needed:

import pandas as pd

my_dict = {
           'elem1':{'day1': pd.DataFrame(1, columns=['Col1', 'Col2'], index=[1,2]),
                    'day2': pd.DataFrame(2, columns=['Col1', 'Col2'], index=[1,2])
                   },
           'elem2':{'day1': pd.DataFrame(3, columns=['Col1', 'Col2'], index=[1,2]),
                    'day2': pd.DataFrame(4, columns=['Col1', 'Col2'], index=[1,2]),
                    'day3': pd.DataFrame(5, columns=['Col1', 'Col2'], index=[1,2])
                   }
          }

d = {(k1, k2): v2 for k1, v1 in my_dict.items() for k2, v2 in v1.items()}
print(d)
{('elem1', 'day1'):    Col1  Col2
1     1     1
2     1     1, ('elem1', 'day2'):    Col1  Col2
1     2     2
2     2     2, ('elem2', 'day1'):    Col1  Col2
1     3     3
2     3     3, ('elem2', 'day2'):    Col1  Col2
1     4     4
2     4     4, ('elem2', 'day3'):    Col1  Col2
1     5     5
2     5     5}

df = pd.concat(d, sort=False)
print(df)
              Col1  Col2
elem1 day1 1     1     1
           2     1     1
      day2 1     2     2
           2     2     2
elem2 day1 1     3     3
           2     3     3
      day2 1     4     4
           2     4     4
      day3 1     5     5
           2     5     5

df = pd.concat(d, sort=False).reset_index(level=2, drop=True)
print(df)
            Col1  Col2
elem1 day1     1     1
      day1     1     1
      day2     2     2
      day2     2     2
elem2 day1     3     3
      day1     3     3
      day2     4     4
      day2     4     4
      day3     5     5
      day3     5     5
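The levels created from the tuple keys can be named through the names argument of pd.concat, which makes later selection by level easier. A sketch under the assumption that 'elem' and 'day' are suitable names (they are not in the original):

```python
import pandas as pd

my_dict = {
    'elem1': {'day1': pd.DataFrame(1, columns=['Col1', 'Col2'], index=[1, 2])},
    'elem2': {'day1': pd.DataFrame(3, columns=['Col1', 'Col2'], index=[1, 2]),
              'day3': pd.DataFrame(5, columns=['Col1', 'Col2'], index=[1, 2])},
}

flat = {(k1, k2): v2 for k1, v1 in my_dict.items() for k2, v2 in v1.items()}

# Hypothetical names for the two key levels; the third level stays unnamed.
df = pd.concat(flat, names=['elem', 'day'])

# Select every 'day1' block regardless of element.
day1 = df.xs('day1', level='day')

print(df)
print(day1)
```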

Pandas: transforming dataframe to nested dictionary

You can group your dataframe by all columns except price, then create your dictionaries in a loop:

import json

# if there is more than one price for a product in a chain, take the mean:
grouped_df = df.groupby(['Month_Year', 'City_Name', 'Chain_Name', 'Product_Name']).agg('mean')

result = dict()
nested_dict = dict()

for index, value in grouped_df.itertuples():
    for i, key in enumerate(index):
        if i == 0:
            if key not in result:
                result[key] = {}
            nested_dict = result[key]
        elif i == len(index) - 1:
            nested_dict[key] = value
        else:
            if key not in nested_dict:
                nested_dict[key] = {}
            nested_dict = nested_dict[key]

print(json.dumps(result, indent=4))

To show both the nesting and the mean calculation, suppose your df is changed to:

  Month_Year City_Name Chain_Name Product_Name  Product_Price
0    11-2021    London       Aldi        Pasta           2.33
1    11-2021    London       Aldi        Pasta           2.35
2    11-2021    London       Aldi       Olives           3.99
3    11-2021   Bristol       Spar      Bananas           1.45
4    10-2021    London      Tesco       Olives           4.12
5    10-2021   Cardiff       Spar        Pasta           2.25

You get the output:

{
    "10-2021": {
        "Cardiff": {
            "Spar": {
                "Pasta": 2.25
            }
        },
        "London": {
            "Tesco": {
                "Olives": 4.12
            }
        }
    },
    "11-2021": {
        "Bristol": {
            "Spar": {
                "Bananas": 1.45
            }
        },
        "London": {
            "Aldi": {
                "Olives": 3.99,
                "Pasta": 2.34
            }
        }
    }
}
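The same nesting can be expressed recursively, grouping by one column at a time. The helper name nest and its parameters are hypothetical, introduced only for this sketch; it is an alternative formulation, not the loop from the answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Month_Year': ['11-2021', '11-2021', '11-2021', '11-2021', '10-2021', '10-2021'],
    'City_Name': ['London', 'London', 'London', 'Bristol', 'London', 'Cardiff'],
    'Chain_Name': ['Aldi', 'Aldi', 'Aldi', 'Spar', 'Tesco', 'Spar'],
    'Product_Name': ['Pasta', 'Pasta', 'Olives', 'Bananas', 'Olives', 'Pasta'],
    'Product_Price': [2.33, 2.35, 3.99, 1.45, 4.12, 2.25],
})


def nest(frame, keys, value_col):
    """Nest `frame` by each column in `keys`; leaves hold the mean of `value_col`."""
    if not keys:
        return float(frame[value_col].mean())
    first, rest = keys[0], keys[1:]
    # Group by a single column name, so each group key is a scalar.
    return {key: nest(group, rest, value_col)
            for key, group in frame.groupby(first)}


result = nest(df,
              ['Month_Year', 'City_Name', 'Chain_Name', 'Product_Name'],
              'Product_Price')
print(result)
```

Because the depth is driven by the keys list, adding or removing a grouping column does not require changing the function.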

Create a nested dictionary from a dataframe

The following Python code builds the nested dictionary:

import pandas as pd

d = {"field_name": ["foo", "foo", "foo", "bar", "bar"],
     "values": ["key1", "key2", "key3", "key1", "key5"],
     "description": ["value1", "value2", "value3", "value4", "value6"]}
df = pd.DataFrame(data=d)
print(df.values)

resultant_dict = {}
"""
df.values is like
[['foo' 'key1' 'value1']
 ['foo' 'key2' 'value2']
 ['foo' 'key3' 'value3']
 ['bar' 'key1' 'value4']
 ['bar' 'key5' 'value6']]
"""
for i in df.values:
    if i[0] in resultant_dict:
        resultant_dict[i[0]][i[1]] = i[2]
    else:
        resultant_dict[i[0]] = {i[1]: i[2]}

print(resultant_dict)

# Resultant Dict is {'foo': {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}, 'bar': {'key1': 'value4', 
# 'key5': 'value6'}}
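The same result can be obtained with groupby and zip instead of iterating over df.values by position. This is a sketch of that alternative; note that the column is accessed as g['values'] because df.values would return the underlying array rather than the column.

```python
import pandas as pd

d = {"field_name": ["foo", "foo", "foo", "bar", "bar"],
     "values": ["key1", "key2", "key3", "key1", "key5"],
     "description": ["value1", "value2", "value3", "value4", "value6"]}
df = pd.DataFrame(data=d)

# One inner dict per field_name; sort=False keeps first-seen group order.
resultant_dict = {
    name: dict(zip(g["values"], g["description"]))
    for name, g in df.groupby("field_name", sort=False)
}

print(resultant_dict)
```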

converting a nested dictionary to Pandas DataFrame

You can simply use:

df = pd.DataFrame(d['result']).T

Or:

df = pd.DataFrame.from_dict(d['result'], orient='index')

Output:

             A   B   C  D
2011-12-01  53  28  32  0
2012-01-01  51  35  49  0
2012-02-01  63  32  56  0
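The input dict is not shown above, so the sketch below assumes a shape for d that would produce this output: a 'result' key holding one inner dict per date. That structure is reconstructed from the output and is an assumption, not the asker's actual data.

```python
import pandas as pd

# Assumed input shape, reconstructed from the output above.
d = {'result': {'2011-12-01': {'A': 53, 'B': 28, 'C': 32, 'D': 0},
                '2012-01-01': {'A': 51, 'B': 35, 'C': 49, 'D': 0},
                '2012-02-01': {'A': 63, 'B': 32, 'C': 56, 'D': 0}}}

# Outer keys become columns, so transpose to get one row per date.
df_t = pd.DataFrame(d['result']).T

# Same layout without the transpose.
df_fd = pd.DataFrame.from_dict(d['result'], orient='index')

print(df_fd)
```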
