Grouping Python Dictionary Keys as a List and Creating a New Dictionary with This List as a Value

Using collections.defaultdict for ease:

from collections import defaultdict

v = defaultdict(list)

for key, value in sorted(d.items()):
    v[value].append(key)

but you can do it with a bog-standard dict too, using dict.setdefault():

v = {}

for key, value in sorted(d.items()):
    v.setdefault(value, []).append(key)

The above sorts keys first; sorting the values of the output dictionary later is much more cumbersome and inefficient.
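
As a self-contained sketch of the defaultdict approach: the input dictionary d is not shown in this answer, so the one below is an assumption, reconstructed from the output printed further down.

```python
from collections import defaultdict

# assumed input, reconstructed from the output shown in this answer
d = {1: 6, 2: 1, 3: 1, 4: 9, 5: 9, 6: 1}

groups = defaultdict(list)
# iterating over the sorted items keeps every list of keys in ascending order
for key, value in sorted(d.items()):
    groups[value].append(key)

result = dict(groups)
print(result)
```

Converting back with dict() is optional; it only stops later lookups of missing values from silently creating empty lists.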

If you don't need the output to be sorted, you can drop the sorted() call and use sets (the keys in the input dictionary are guaranteed to be unique, so no information is lost):

v = {}

for key, value in d.items():
    v.setdefault(value, set()).add(key)

to produce:

{6: {1}, 1: {2, 3, 6}, 9: {4, 5}}

(That the set values appear sorted is a coincidence, a side effect of how hash values for integers are implemented; sets are unordered structures.)

Group dictionary keys to list of list based on same values

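ini_dict = {'u1': 0, 'u2': 0, 'u3': 1, 'u4': 2, 'u5': 2, 'u6': 3, 'u7': 4, 'u8': 4, 'u9': 3}
flipped = {}
for key, value in ini_dict.items():
    if value not in flipped:
        flipped[value] = [key]
    else:
        flipped[value].append(key)

# the grouped keys are the values of the flipped dictionary
result = list(flipped.values())
print("Result", result)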

Output will be

Result [['u1', 'u2'], ['u3'], ['u4', 'u5'], ['u6', 'u9'], ['u7', 'u8']]

Flipping the dictionary does the work: each value becomes a key in a new dictionary, mapped to the list of original keys that had that value. Values that occurred more than once in the original dictionary show up as lists with more than one key.
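
If only the values shared by several keys are of interest, the flipped dictionary can be filtered by list length. This is a small sketch of that idea; the name shared is an illustrative choice, not part of the original answer.

```python
ini_dict = {'u1': 0, 'u2': 0, 'u3': 1, 'u4': 2, 'u5': 2, 'u6': 3, 'u7': 4, 'u8': 4, 'u9': 3}

# flip the dictionary: value -> list of keys that had that value
flipped = {}
for key, value in ini_dict.items():
    flipped.setdefault(value, []).append(key)

# keep only the values that more than one key pointed to
shared = {value: keys for value, keys in flipped.items() if len(keys) > 1}
print(shared)
```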

python : group by multiple dictionary keys

You can use groupby() to group dicts in combination with itemgetter():

from itertools import groupby
from operator import itemgetter

list_pts = [
    {'city': 'Madrid', 'year': '2017', 'date': '05/07/2017', 'pts': 7},
    {'city': 'Madrid', 'year': '2017', 'date': '14/11/2017', 'pts': 5},
    {'city': 'Londres', 'year': '2018', 'date': '25/02/2018', 'pts': 5},
    {'city': 'Paris', 'year': '2019', 'date': '17/04/2019', 'pts': 4},
    {'city': 'Londres', 'year': '2019', 'date': '15/06/2019', 'pts': 8},
    {'city': 'Paris', 'year': '2019', 'date': '21/08/2019', 'pts': 8},
    {'city': 'Londres', 'year': '2019', 'date': '04/12/2019', 'pts': 2}
]

city_year_getter = itemgetter('city', 'year')
date_pts_getter = itemgetter('date', 'pts')

result = []
# groupby() only merges adjacent items, so sort by the same key first
for (city, year), objs in groupby(sorted(list_pts, key=city_year_getter),
                                  city_year_getter):
    dates, ptss = zip(*map(date_pts_getter, objs))
    result.append({
        'city': city,
        'year': year,
        'date': list(dates),
        'pts': list(ptss)
    })

How to create a dictionary by grouping key for different values in python?

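Assuming df is a pandas DataFrame with the columns 'Key' and 'Value', you can aggregate the values of each group into a list with groupby, then export the result as a dict: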

df.groupby('Key')['Value'].agg(list).to_dict()

Result:

{'key1': ['value1', 'value4'],
'key2': ['value2', 'value1'],
'key3': ['value3', 'value2'],
'key5': ['value5']}
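
To see what the one-liner computes without pandas installed, here is a rough standard-library equivalent. The rows are hypothetical stand-ins for the DataFrame's 'Key' and 'Value' columns, chosen so that they reproduce the result above.

```python
from collections import defaultdict

# hypothetical rows standing in for the 'Key' and 'Value' columns
rows = [
    ('key1', 'value1'),
    ('key2', 'value2'),
    ('key3', 'value3'),
    ('key1', 'value4'),
    ('key2', 'value1'),
    ('key3', 'value2'),
    ('key5', 'value5'),
]

grouped = defaultdict(list)
for key, value in rows:
    grouped[key].append(value)  # values keep their original row order

result = dict(grouped)
print(result)
```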

Group dictionary values together based on a key

You can use itertools.groupby to accomplish this. Group by the first element of each tuple, then build a list of the second elements in each group. The input is sorted first because groupby only merges adjacent items with the same key.

>>> from itertools import groupby
>>> [(k, [i[1] for i in g]) for k, g in groupby(sorted(values), key=lambda i: i[0])]
[(93646, [2017.0, 2020.0]), (250128, [2020.0]), (304008, [2017.0, 2020.0])]
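
The variable values is not defined in this answer. A runnable sketch, assuming it is a list of (id, year) tuples in arbitrary order, could look like this:

```python
from itertools import groupby
from operator import itemgetter

# assumed input: (id, year) tuples in no particular order
values = [
    (304008, 2020.0),
    (93646, 2017.0),
    (250128, 2020.0),
    (304008, 2017.0),
    (93646, 2020.0),
]

first = itemgetter(0)
# sorting the tuples puts equal ids next to each other and orders the years
grouped = [(k, [pair[1] for pair in g]) for k, g in groupby(sorted(values), key=first)]
print(grouped)
```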

list to dict with a group by on first value

Loop over the list and check whether the first element is already a key in the dict. If it is, add the entry to that key's inner dict; if it is not, create the inner dict for that key:

a = [
    ('A', 'B', 8),
    ('A', 'D', 10),
    ('A', 'E', 12),
    ('B', 'C', 6),
    ('B', 'F', 12),
    ('C', 'F', 8),
    ('D', 'E', 10),
    ('D', 'G', 30),
    ('E', 'F', 10),
    ('F', 'G', 12)
]

final_list = {}
for item in a:
    if item[0] in final_list:
        final_list[item[0]][item[1]] = item[2]
    else:
        final_list[item[0]] = {item[1]: item[2]}
print(final_list)
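
The if/else can also be folded into a single dict.setdefault() call. The sketch below uses a shortened list, and the names source, target and weight are only a guess at what the tuple fields mean.

```python
# shortened sample in the same shape as the list above
edges = [
    ('A', 'B', 8),
    ('A', 'D', 10),
    ('B', 'C', 6),
]

nested = {}
for source, target, weight in edges:
    # setdefault returns the existing inner dict, or inserts and returns a new one
    nested.setdefault(source, {})[target] = weight

print(nested)
```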

List dictionary in Python to group by in key's value

What you did

You managed to group the dictionaries by a key (date's value).
Then, inside the iteration, you printed the key and the resulting grouped list of dictionaries.
Good start!

What is missing

Inside the group-by iteration:

  1. Add the key to a new dictionary, e.g. as the entry 'date': 'new Date (2017,1,1)'.
  2. Loop through each dictionary in the grouper (the iterator that itertools.groupby yields for each group). For each dictionary:
  • exclude the entry with the date key (because it has already been added to the new dictionary)
  • but add every accident summary, with its variable <accident-category> key and a number as its value. The following shows which key-value pairs need to be added:
{.. , 'Fall': 5}
{.. , 'Vehicular Accident': 127}
{.. , 'Medical': 129}
{.. , 'OB': 10}
{.. , 'Mauling': 9}

  3. At the end of each group-by iteration, add the new dictionary to a list (the list must be defined as empty before the group-by block).

Solution

Warning: lambda used instead of operator.itemgetter

The key extractor passed as the key argument of groupby can also be a lambda.
This may be a more advanced topic, but for simple key extraction it removes the need for the additional itemgetter import. Read more about lambdas in the tutorial listed under "See also" below.

from itertools import groupby

accidents_stats = []
# groupby() only merges adjacent items, so accidents_dict must already be ordered by date
for js_date, grouper in groupby(accidents_dict, key=lambda element: element['date']):
    accidents_per_date = {'date': js_date}  # this starts the new dict
    for d in grouper:  # iterate over each accident in the group, as a dict
        for k, v in d.items():  # for each (key, value) pair in the dict
            if k != 'date':  # add only non-date entries
                accidents_per_date[k] = v
    accidents_stats.append(accidents_per_date)  # then add the new dict to a list

print(accidents_stats)

Prints:

[{'date': 'new Date (2017,1,1)', 'Fall': 5, 'Vehicular Accident': 127, 'Medical': 129, 'OB': 10, 'Mauling': 9}, {'date': 'new Date (2017,2,1)', 'Fall': 7, 'Vehicular Accident': 113, 'Mauling': 5, 'OB': 9, 'Medical': 79}, {'date': 'new Date (2017,3,1)', 'Medical': 112, 'Mauling': 5, 'OB': 11, 'Vehicular Accident': 119, 'Fall': 8}]
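
The input accidents_dict comes from the question and is not reproduced here. As a self-contained sketch, assume it is a list of dicts that each hold a date and a single category count; the sample and the helper name by_date below are illustrative. Sorting before grouping guards against input in which equal dates are not adjacent.

```python
from itertools import groupby

# hypothetical sample: one accident category per dict, dates not adjacent
accidents = [
    {'date': 'new Date (2017,1,1)', 'Fall': 5},
    {'date': 'new Date (2017,2,1)', 'Fall': 7},
    {'date': 'new Date (2017,1,1)', 'Medical': 129},
    {'date': 'new Date (2017,2,1)', 'OB': 9},
]

def by_date(element):
    return element['date']

accidents_stats = []
# sorting first makes equal dates adjacent, which groupby() requires
for js_date, group in groupby(sorted(accidents, key=by_date), key=by_date):
    merged = {'date': js_date}
    for entry in group:
        # copy every entry except the date into the merged dict
        merged.update((k, v) for k, v in entry.items() if k != 'date')
    accidents_stats.append(merged)

print(accidents_stats)
```

Note that sorting by the raw string only brings equal dates together; it does not order them chronologically.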

See also:

  • itertools — Functions creating iterators for efficient looping — Python 3.10.1 documentation
  • How to Use Python Lambda Functions – Real Python

Bonus: a universal way to represent time in JSON

JSON is an acronym for "JavaScript Object Notation". As such JSON can be evaluated by JavaScript.

To represent timestamps, date or time in JSON we often use

  • Strings that follow a standard-format like ISO and are human-readable, but not automatically readable by all languages
  • Numbers that represent the milliseconds in (UNIX-based) epoch format

Since a number can be read easily by JavaScript and other languages, we can parse the JS constructor statement and convert it to an epoch integer, using:

  • a regular expression to extract the numbers (3 capture groups within the parentheses)
  • datetime.datetime to construct a date in Python, and its timestamp() method to convert it to epoch milliseconds (as required by JavaScript)

import re
import datetime

# JavaScript Date objects contain a Number that represents milliseconds since 1 January 1970 UTC, also called (UNIX) epoch format.
# Note: JavaScript's Date constructor takes a zero-based month index; this function reads the month as written (1-based).
def js_code_to_millis(js_date_constructor):
    result = re.search(r"new Date\s*\((\d{4}),(\d{1,2}),(\d{1,2})\)", js_date_constructor)
    (year, month, day) = result.groups()
    # timestamp() interprets a naive datetime in the local time zone
    epoch_seconds = datetime.datetime(int(year), int(month), int(day)).timestamp()
    return int(epoch_seconds * 1000)
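
One possible variant, sketched under the assumption that each date should be read as midnight UTC, so the result does not depend on the machine's time zone. The name js_code_to_millis_utc and the explicit error for non-matching input are additions for illustration, not part of the original answer.

```python
import re
from datetime import datetime, timezone

JS_DATE_PATTERN = re.compile(r"new Date\s*\((\d{4}),(\d{1,2}),(\d{1,2})\)")

def js_code_to_millis_utc(js_date_constructor):
    """Convert e.g. 'new Date (2017,1,1)' to epoch milliseconds at midnight UTC."""
    match = JS_DATE_PATTERN.search(js_date_constructor)
    if match is None:
        raise ValueError(f"not a recognised date constructor: {js_date_constructor!r}")
    # the month is read as written (1-based), as in the function above
    year, month, day = map(int, match.groups())
    moment = datetime(year, month, day, tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)

millis = js_code_to_millis_utc('new Date (2017,1,1)')
print(millis)
```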

See also:

  • Date - JavaScript | MDN
  • Python Regex Capturing Groups – PYnative
  • How can I convert a datetime object to milliseconds since epoch (unix time) in Python?
  • Python datetime to epoch

Want valid JSON as output?

Then make these 4 adjustments:

  1. Add the js_code_to_millis function (including its imports).
  2. Add the JSON-compliant timestamp to the new dictionary as the entry
    'date': js_code_to_millis(js_date).
  3. Import the built-in json module with import json.
  4. After or instead of printing the list, dump it as a JSON array using
    json.dumps(accidents_stats).

Grouping and sorting dictionary and adding a key value

IIUC, you could do the following:

import pprint
from collections import defaultdict
from operator import itemgetter

players = [{'Name': 'Player 1', 'Pos': 'RB', 'Team': 'MIN', 'Salary': '9400'},
           {'Name': 'Player 2', 'Pos': 'RB', 'Team': 'MIN', 'Salary': '8400'},
           {'Name': 'Player 3', 'Pos': 'RB', 'Team': 'MIN', 'Salary': '7400'},
           {'Name': 'Player 2', 'Pos': 'RB', 'Team': 'NYG', 'Salary': '8400'},
           {'Name': 'Player 3', 'Pos': 'RB', 'Team': 'NYG', 'Salary': '7400'}]

team_and_position = itemgetter('Team', 'Pos')
salary = itemgetter('Salary')

# create groups of positions within a given team
groups = defaultdict(list)
for player in players:
    groups[team_and_position(player)].append(player)

# sort each group by salary in descending order
groups = {k: sorted(group, key=salary, reverse=True) for k, group in groups.items()}

# add depth value to each player
res = {k: [{**player, "Depth": depth} for depth, player in enumerate(group, 1)] for k, group in groups.items()}

# fetch MIN RB
vikings_rb = res[('MIN', 'RB')]

pprint.pprint(vikings_rb)

Output

[{'Depth': 1, 'Name': 'Player 1', 'Pos': 'RB', 'Salary': '9400', 'Team': 'MIN'},
{'Depth': 2, 'Name': 'Player 2', 'Pos': 'RB', 'Salary': '8400', 'Team': 'MIN'},
{'Depth': 3, 'Name': 'Player 3', 'Pos': 'RB', 'Salary': '7400', 'Team': 'MIN'}]

The first step is to group the elements of the list by Team and Pos. For this you can use a defaultdict, with itemgetter extracting the values of the two keys:

# create groups of positions within a given team
groups = defaultdict(list)
for player in players:
    groups[team_and_position(player)].append(player)

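The second step is to sort within each group by salary in descending order. Note that the salaries are strings, so they are compared lexicographically; this gives the right order here only because every salary has four digits (use key=lambda p: int(p['Salary']) for salaries of mixed length):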

# sort each group by salary in descending order
groups = {k: sorted(group, key=salary, reverse=True) for k, group in groups.items()}

The second step could be done in-place but I prefer a dictionary comprehension. Finally use enumerate (starting from 1) to add the depth value to each dictionary:

# add depth value to each player
res = {k: [{**player, "Depth": depth} for depth, player in enumerate(group, 1)] for k, group in groups.items()}

Again this could be done in-place, doing the following:

for group in groups.values():
    for depth, player in enumerate(group, 1):
        player['Depth'] = depth

UPDATE

If you want to fetch all players, just flatten the values of the dictionary:

# fetch ALL players
all_players = [player for group in res.values() for player in group]
pprint.pprint(all_players)

Output

[{'Depth': 1, 'Name': 'Player 1', 'Pos': 'RB', 'Salary': '9400', 'Team': 'MIN'},
{'Depth': 2, 'Name': 'Player 2', 'Pos': 'RB', 'Salary': '8400', 'Team': 'MIN'},
{'Depth': 3, 'Name': 'Player 3', 'Pos': 'RB', 'Salary': '7400', 'Team': 'MIN'},
{'Depth': 1, 'Name': 'Player 2', 'Pos': 'RB', 'Salary': '8400', 'Team': 'NYG'},
{'Depth': 2, 'Name': 'Player 3', 'Pos': 'RB', 'Salary': '7400', 'Team': 'NYG'}]

After this, sort the all_players list by any given criteria.
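
For instance, a couple of possible sort criteria are sketched below. The shortened all_players list is illustrative, and the choice of criteria is an assumption about what might be wanted.

```python
from operator import itemgetter

# shortened, illustrative version of the flattened list
all_players = [
    {'Name': 'Player 1', 'Team': 'MIN', 'Salary': '9400', 'Depth': 1},
    {'Name': 'Player 2', 'Team': 'MIN', 'Salary': '8400', 'Depth': 2},
    {'Name': 'Player 2', 'Team': 'NYG', 'Salary': '8400', 'Depth': 1},
    {'Name': 'Player 3', 'Team': 'NYG', 'Salary': '7400', 'Depth': 2},
]

# sort by depth first, then by team name
by_depth_then_team = sorted(all_players, key=itemgetter('Depth', 'Team'))

# sort by salary, highest first; int() makes the comparison numeric
by_salary = sorted(all_players, key=lambda p: int(p['Salary']), reverse=True)

print([(p['Name'], p['Team']) for p in by_depth_then_team])
print([p['Salary'] for p in by_salary])
```

Because sorted() is stable, players with equal salaries keep their original relative order.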

python dictionary group by , order by and create a new key based on rank

First make a list of dictionaries sorted by location and time. This will put the groups together and within the groups they will be sorted by time:

l = [
    {"name":"Alex","location":"US","time":"2020-05-20 10:36:20"},
    {"name":"Bob","location":"India","time":"2017-05-20 12:36:20"},
    {"name":"Jon","location":"US","time":"2017-05-20 05:36:20"},
    {"name":"Kerry","location":"India","time":"2014-05-20 05:36:20"},
    {"name":"Mat","location":"US","time":"2013-01-20 05:36:20"},
    {"name":"Sazen","location":"India","time":"2013-01-20 05:36:20"}
]

l_sort = sorted(l, key=lambda d: (d['location'], d['time']))

Now you have a list l_sort that looks like:

[{'name': 'Sazen', 'location': 'India', 'time': '2013-01-20 05:36:20'},
{'name': 'Kerry', 'location': 'India', 'time': '2014-05-20 05:36:20'},
{'name': 'Bob', 'location': 'India', 'time': '2017-05-20 12:36:20'},
{'name': 'Mat', 'location': 'US', 'time': '2013-01-20 05:36:20'},
{'name': 'Jon', 'location': 'US', 'time': '2017-05-20 05:36:20'},
{'name': 'Alex', 'location': 'US', 'time': '2020-05-20 10:36:20'}]

Now that everything is in the correct place you can use itertools.groupby from the standard library to make groups based on location, then for each dict in each group update the dictionary:

from itertools import groupby

# group by location
groups = groupby(l_sort, key=lambda d: d['location'])

# for each location
for k, group in groups:
    # update the dicts with the correct index starting at 1
    for i, d in enumerate(group, 1):
        d['new_name'] = f"{d['name']}_{i}"

This will update the dicts in place, so your original list will now have dicts like:

[{'name': 'Alex','location': 'US','time': '2020-05-20 10:36:20','new_name': 'Alex_3'},
{'name': 'Bob','location': 'India','time': '2017-05-20 12:36:20','new_name': 'Bob_3'},
{'name': 'Jon','location': 'US','time': '2017-05-20 05:36:20','new_name': 'Jon_2'},
{'name': 'Kerry','location': 'India','time': '2014-05-20 05:36:20','new_name': 'Kerry_2'},
{'name': 'Mat','location': 'US','time': '2013-01-20 05:36:20','new_name': 'Mat_1'},
{'name': 'Sazen','location': 'India','time': '2013-01-20 05:36:20','new_name': 'Sazen_1'}]

