Grouping Python dictionary keys as a list and create a new dictionary with this list as a value
Using collections.defaultdict for ease:
from collections import defaultdict
v = defaultdict(list)
for key, value in sorted(d.items()):
    v[value].append(key)
but you can do it with a bog-standard dict too, using dict.setdefault():
v = {}
for key, value in sorted(d.items()):
    v.setdefault(value, []).append(key)
The above sorts the items by key first, so every list in the output is already in key order; sorting each list of the output dictionary afterwards is more cumbersome and less efficient.
If you do not need the output to be sorted, you can drop the sorted() call and use sets (the keys in the input dictionary are guaranteed to be unique, so no information is lost):
v = {}
for key, value in d.items():
    v.setdefault(value, set()).add(key)
to produce:
{6: {1}, 1: {2, 3, 6}, 9: {4, 5}}
(that the output of the set values is sorted is a coincidence, a side-effect of how hash values for integers are implemented; sets are unordered structures).
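For a concrete run of the sorted-list variant, here is a minimal sketch. The input dictionary d is an assumption: it is reconstructed from the output shown above, since the original input is not given.

```python
from collections import defaultdict

# Assumed sample input, reconstructed from the output above (not part of the original).
d = {1: 6, 2: 1, 3: 1, 4: 9, 5: 9, 6: 1}

v = defaultdict(list)
for key, value in sorted(d.items()):
    # Items arrive in key order, so each list ends up sorted.
    v[value].append(key)

print(dict(v))  # {6: [1], 1: [2, 3, 6], 9: [4, 5]}
```

With lists instead of sets, the order inside each group is guaranteed by the sort rather than being a side effect of hashing.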
Group dictionary keys to list of list based on same values
ini_dict = {'u1': 0, 'u2': 0, 'u3': 1, 'u4': 2, 'u5': 2, 'u6': 3, 'u7': 4, 'u8': 4, 'u9': 3}
flipped = {}
for key, value in ini_dict.items():
    if value not in flipped:
        flipped[value] = [key]
    else:
        flipped[value].append(key)
Taking list(flipped.values()) gives the grouped keys:
Result [['u1', 'u2'], ['u3'], ['u4', 'u5'], ['u6', 'u9'], ['u7', 'u8']]
Flipping the dictionary, so that each value maps to the list of keys that carry it, does the job. Values shared by several keys show up in the flipped dictionary as lists with more than one element.
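To pick out only the duplicated values, one option is to filter the flipped dictionary by group size. This is a sketch that uses defaultdict in place of the explicit if/else; the name duplicates is made up for illustration.

```python
from collections import defaultdict

ini_dict = {'u1': 0, 'u2': 0, 'u3': 1, 'u4': 2, 'u5': 2,
            'u6': 3, 'u7': 4, 'u8': 4, 'u9': 3}

flipped = defaultdict(list)
for key, value in ini_dict.items():
    flipped[value].append(key)

# Keep only the groups whose value is shared by more than one key.
duplicates = {value: keys for value, keys in flipped.items() if len(keys) > 1}
print(duplicates)
# {0: ['u1', 'u2'], 2: ['u4', 'u5'], 3: ['u6', 'u9'], 4: ['u7', 'u8']}
```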
python : group by multiple dictionary keys
You can use groupby() to group dicts in combination with itemgetter():
from itertools import groupby
from operator import itemgetter
list_pts = [
    {'city': 'Madrid', 'year': '2017', 'date': '05/07/2017', 'pts': 7},
    {'city': 'Madrid', 'year': '2017', 'date': '14/11/2017', 'pts': 5},
    {'city': 'Londres', 'year': '2018', 'date': '25/02/2018', 'pts': 5},
    {'city': 'Paris', 'year': '2019', 'date': '17/04/2019', 'pts': 4},
    {'city': 'Londres', 'year': '2019', 'date': '15/06/2019', 'pts': 8},
    {'city': 'Paris', 'year': '2019', 'date': '21/08/2019', 'pts': 8},
    {'city': 'Londres', 'year': '2019', 'date': '04/12/2019', 'pts': 2}
]
city_year_getter = itemgetter('city', 'year')
date_pts_getter = itemgetter('date', 'pts')
result = []
# groupby() only merges consecutive items, so sort by the same key first
for (city, year), objs in groupby(sorted(list_pts, key=city_year_getter),
                                  city_year_getter):
    # split each group's (date, pts) pairs into two parallel tuples
    dates, ptss = zip(*map(date_pts_getter, objs))
    result.append({
        'city': city,
        'year': year,
        'date': list(dates),
        'pts': list(ptss)
    })
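The sorted() call in the loop above matters because groupby() only merges runs of consecutive items with equal keys. A small sketch with a hypothetical list of city names shows the difference:

```python
from itertools import groupby

cities = ['Madrid', 'Paris', 'Madrid']  # hypothetical sample data

# Without sorting, the two 'Madrid' entries are not adjacent and form separate groups.
unsorted_groups = [(key, len(list(group))) for key, group in groupby(cities)]
# After sorting, equal keys are adjacent and end up in one group.
sorted_groups = [(key, len(list(group))) for key, group in groupby(sorted(cities))]

print(unsorted_groups)  # [('Madrid', 1), ('Paris', 1), ('Madrid', 1)]
print(sorted_groups)    # [('Madrid', 2), ('Paris', 1)]
```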
How to create a dictionary by grouping key for different values in python?
You can aggregate the values as a list on groupby, then export as dict:
df.groupby('Key')['Value'].agg(list).to_dict()
Result:
{'key1': ['value1', 'value4'],
'key2': ['value2', 'value1'],
'key3': ['value3', 'value2'],
'key5': ['value5']}
Group dictionary values together based on a key
You can use itertools.groupby to accomplish this. Basically, group by the first element of each tuple, then make a list of the second elements in each group.
>>> from itertools import groupby
>>> [(k, [i[1] for i in g]) for k, g in groupby(sorted(values), key=lambda i: i[0])]
[(93646, [2017.0, 2020.0]), (250128, [2020.0]), (304008, [2017.0, 2020.0])]
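If a dictionary is wanted rather than a list of tuples, the same grouping can feed a dict comprehension. In this sketch the values list is an assumption, reconstructed from the output above because the original input is not shown.

```python
from itertools import groupby
from operator import itemgetter

# Assumed input, reconstructed from the output above (deliberately unsorted).
values = [(304008, 2020.0), (93646, 2017.0), (250128, 2020.0),
          (93646, 2020.0), (304008, 2017.0)]

# Sort so equal first elements are adjacent, then collect the second elements.
grouped = {k: [year for _, year in g]
           for k, g in groupby(sorted(values), key=itemgetter(0))}
print(grouped)
# {93646: [2017.0, 2020.0], 250128: [2020.0], 304008: [2017.0, 2020.0]}
```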
list to dict with a group by on first value
Loop over the list and check whether the key is already in the dict. If it is, add the entry to that key's dict; if it is not, create the dict for that key:
a = [
    ('A', 'B', 8),
    ('A', 'D', 10),
    ('A', 'E', 12),
    ('B', 'C', 6),
    ('B', 'F', 12),
    ('C', 'F', 8),
    ('D', 'E', 10),
    ('D', 'G', 30),
    ('E', 'F', 10),
    ('F', 'G', 12)
]
final_list = {}
for item in a:
    if item[0] in final_list:
        # key already present: add to its inner dict
        final_list[item[0]][item[1]] = item[2]
    else:
        # first time this key is seen: create its inner dict
        final_list[item[0]] = {item[1]: item[2]}
print(final_list)
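The membership check can be avoided with defaultdict(dict), which creates the inner dict on first access. A minimal sketch on a subset of the rows above; the names graph, source, target and weight are made up for illustration.

```python
from collections import defaultdict

a = [('A', 'B', 8), ('A', 'D', 10), ('B', 'C', 6)]  # subset of the rows above

graph = defaultdict(dict)  # missing keys get an empty dict automatically
for source, target, weight in a:
    graph[source][target] = weight

print(dict(graph))  # {'A': {'B': 8, 'D': 10}, 'B': {'C': 6}}
```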
List dictionary in Python to group by in key's value
What you did
You managed to group the dictionaries by a key (date's value).
Then, inside the iteration, you printed the key and the resulting grouped list of dictionaries.
Good start!
What is missing
Inside the group-by iteration:
- Add the key to a new dictionary, e.g. as entry 'date': 'new Date (2017,1,1)'.
- Loop through each dictionary in the grouper (the per-group iterator that itertools.groupby yields). For each dictionary:
  - exclude the entry with the date key (because it has already been added to the new dictionary)
  - but add all accident summaries, each with its variable <accident-category> key and a number as value. The following illustration shows what needs to be added as key-value pairs:
{.. , 'Fall': 5}
{.. , 'Vehicular Accident': 127}
{.. , 'Medical': 129}
{.. , 'OB': 10}
{.. , 'Mauling': 9}
- At the end, in the group-by iteration, add the new dictionary to a list (of course the list needs to be defined empty before the group-by block).
Solution
Warning: a lambda is used instead of operator.itemgetter
The key extractor passed as the key argument of groupby can also be a lambda.
This may be an advanced topic to get familiar with, but for a simple key extraction, and to avoid the extra itemgetter import, it can be useful. Read more about lambdas in the tutorial listed below.
from itertools import groupby
accidents_stats = []
# groupby only merges consecutive items, so accidents_dict must already be ordered by date
for js_date, grouper in groupby(accidents_dict, key=lambda element: element['date']):
    accidents_per_date = {'date': js_date}  # this starts the new dict
    for d in grouper:  # here we iterate over each accident in group, as dict
        for (k, v) in d.items():  # for each (key,value) pair in dict entries
            if k != 'date':  # add only non-date entries
                accidents_per_date[k] = v
    accidents_stats.append(accidents_per_date)  # then add the new dict to a list
print(accidents_stats)
Prints:
[{'date': 'new Date (2017,1,1)', 'Fall': 5, 'Vehicular Accident': 127, 'Medical': 129, 'OB': 10, 'Mauling': 9}, {'date': 'new Date (2017,2,1)', 'Fall': 7, 'Vehicular Accident': 113, 'Mauling': 5, 'OB': 9, 'Medical': 79}, {'date': 'new Date (2017,3,1)', 'Medical': 112, 'Mauling': 5, 'OB': 11, 'Vehicular Accident': 119, 'Fall': 8}]
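Because every dictionary in a group carries the same date entry, the inner loop can also be written with dict.update(). The sketch below uses operator.itemgetter as the key extractor; the sample input is hypothetical and only mimics the shape described above.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample input: one accident category per dict, already ordered by date.
accidents = [
    {'date': 'new Date (2017,1,1)', 'Fall': 5},
    {'date': 'new Date (2017,1,1)', 'Medical': 129},
    {'date': 'new Date (2017,2,1)', 'Fall': 7},
]

stats = []
for _, group in groupby(accidents, key=itemgetter('date')):
    merged = {}
    for entry in group:
        # 'date' is identical within a group, so overwriting it is harmless.
        merged.update(entry)
    stats.append(merged)

print(stats)
# [{'date': 'new Date (2017,1,1)', 'Fall': 5, 'Medical': 129}, {'date': 'new Date (2017,2,1)', 'Fall': 7}]
```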
See also:
- itertools — Functions creating iterators for efficient looping — Python 3.10.1 documentation
- How to Use Python Lambda Functions – Real Python
Bonus: a universal way to represent time in JSON
JSON is an acronym for "JavaScript Object Notation". As such JSON can be evaluated by JavaScript.
To represent timestamps, date or time in JSON we often use
- Strings that follow a standard-format like ISO and are human-readable, but not automatically readable by all languages
- Numbers that represent the milliseconds in (UNIX-based) epoch format
Since the number can be easily read by JavaScript and other languages, we can parse the JS constructor statement and convert it to an epoch integer, using:
- a regular expression to extract the numbers (3 capture groups within parentheses)
- the datetime constructor and its timestamp() method to build a date in Python and convert it to epoch milliseconds (as required by JavaScript)
import re
import datetime
# JavaScript Date objects contain a Number that represents milliseconds since 1 January 1970 UTC. Also called (UNIX-) epoch format.
def js_code_to_millis(js_date_constructor):
    result = re.search(r"new Date\s+\((\d{4}),(\d{1,2}),(\d{1,2})\)", js_date_constructor)
    (year, month, day) = result.groups()
    # Caveat: JavaScript's Date constructor counts months from 0; add 1 to the month if the source follows that convention.
    # A naive datetime is interpreted in the local time zone by timestamp().
    epoch_seconds = datetime.datetime(int(year), int(month), int(day)).timestamp()
    return int(epoch_seconds * 1000)
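The function above depends on the machine's local time zone. If a reproducible value is preferred, one option is to pin the date to UTC. This is only a sketch: the helper name js_date_to_utc_millis is hypothetical, interpreting the date as UTC is an assumption, and the month is still treated as 1-based, as in the function above.

```python
import re
from datetime import datetime, timezone

def js_date_to_utc_millis(js_date_constructor):  # hypothetical helper name
    """Parse 'new Date (Y,M,D)' and return epoch milliseconds at midnight UTC."""
    match = re.search(r"new Date\s*\((\d{4}),(\d{1,2}),(\d{1,2})\)", js_date_constructor)
    if match is None:
        raise ValueError(f"not a recognised Date constructor: {js_date_constructor!r}")
    year, month, day = map(int, match.groups())
    # An aware datetime makes timestamp() independent of the local time zone.
    moment = datetime(year, month, day, tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)

print(js_date_to_utc_millis("new Date (2017,1,1)"))  # 1483228800000
```

Raising ValueError on a failed match avoids the AttributeError that calling .groups() on None would otherwise produce.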
See also:
- Date - JavaScript | MDN
- Python Regex Capturing Groups – PYnative
- How can I convert a datetime object to milliseconds since epoch (unix time) in Python?
- Python datetime to epoch
Want valid JSON as output?
Then make these 4 adjustments:
- Add the js_code_to_millis function (including imports).
- Include the JSON-compliant timestamp in the new dictionary as entry 'date': js_code_to_millis(js_date).
- Import the built-in JSON module with import json.
- After or instead of printing the list, dump it as a JSON array using json.dumps(accidents_stats).
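The last adjustment can be sketched as follows. The list here is a hypothetical one-element result whose date has already been converted to epoch milliseconds.

```python
import json

# Hypothetical result list; the 'date' value is an epoch timestamp in milliseconds.
accidents_stats = [{'date': 1483228800000, 'Fall': 5, 'Medical': 129}]

as_json = json.dumps(accidents_stats)
print(as_json)  # [{"date": 1483228800000, "Fall": 5, "Medical": 129}]
```

Unlike the printed Python list, this string uses double quotes and can be parsed by any JSON reader.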
Grouping and sorting dictionary and adding a key value
IIUC, you could do the following:
import pprint
from collections import defaultdict
from operator import itemgetter
players = [{'Name': 'Player 1', 'Pos': 'RB', 'Team': 'MIN', 'Salary': '9400'},
{'Name': 'Player 2', 'Pos': 'RB', 'Team': 'MIN', 'Salary': '8400'},
{'Name': 'Player 3', 'Pos': 'RB', 'Team': 'MIN', 'Salary': '7400'},
{'Name': 'Player 2', 'Pos': 'RB', 'Team': 'NYG', 'Salary': '8400'},
{'Name': 'Player 3', 'Pos': 'RB', 'Team': 'NYG', 'Salary': '7400'}]
team_and_position = itemgetter('Team', 'Pos')
salary = itemgetter('Salary')
# create groups of positions within a given team
groups = defaultdict(list)
for player in players:
    groups[team_and_position(player)].append(player)
# sort each group by salary in descending order
groups = {k: sorted(group, key=salary, reverse=True) for k, group in groups.items()}
# add depth value to each player
res = {k: [{**player, "Depth": depth} for depth, player in enumerate(group, 1)] for k, group in groups.items()}
# fetch MIN RB
vikings_rb = res[('MIN', 'RB')]
pprint.pprint(vikings_rb)
Output
[{'Depth': 1, 'Name': 'Player 1', 'Pos': 'RB', 'Salary': '9400', 'Team': 'MIN'},
{'Depth': 2, 'Name': 'Player 2', 'Pos': 'RB', 'Salary': '8400', 'Team': 'MIN'},
{'Depth': 3, 'Name': 'Player 3', 'Pos': 'RB', 'Salary': '7400', 'Team': 'MIN'}]
The first step is to group the elements of the list by Team and Pos; for this you could use defaultdict, with itemgetter to extract the values of the keys:
# create groups of positions within a given team
groups = defaultdict(list)
for player in players:
    groups[team_and_position(player)].append(player)
The second step is to sort within each group by salary in descending order:
# sort each group by salary in descending order
groups = {k: sorted(group, key=salary, reverse=True) for k, group in groups.items()}
The second step could be done in-place but I prefer a dictionary comprehension. Finally use enumerate (starting from 1) to add the depth value to each dictionary:
# add depth value to each player
res = {k: [{**player, "Depth": depth} for depth, player in enumerate(group, 1)] for k, group in groups.items()}
Again, this could be done in-place as follows:
for group in groups.values():
    for depth, player in enumerate(group, 1):
        player['Depth'] = depth
UPDATE
If you want to fetch all players, just flatten the values of the dictionary:
# fetch ALL players
all_players = [player for group in res.values() for player in group]
pprint.pprint(all_players)
Output
[{'Depth': 1, 'Name': 'Player 1', 'Pos': 'RB', 'Salary': '9400', 'Team': 'MIN'},
{'Depth': 2, 'Name': 'Player 2', 'Pos': 'RB', 'Salary': '8400', 'Team': 'MIN'},
{'Depth': 3, 'Name': 'Player 3', 'Pos': 'RB', 'Salary': '7400', 'Team': 'MIN'},
{'Depth': 1, 'Name': 'Player 2', 'Pos': 'RB', 'Salary': '8400', 'Team': 'NYG'},
{'Depth': 2, 'Name': 'Player 3', 'Pos': 'RB', 'Salary': '7400', 'Team': 'NYG'}]
After this, sort the all_players list by any given criteria.
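As one possible criterion, the sketch below sorts by depth and then by salary from highest to lowest. The sample list is a shortened, hypothetical version of all_players. Since Salary is stored as a string, it is converted with int() so the comparison is numeric rather than alphabetical.

```python
# Small hypothetical sample in the shape of all_players above.
all_players = [
    {'Name': 'Player 1', 'Team': 'MIN', 'Salary': '9400', 'Depth': 1},
    {'Name': 'Player 3', 'Team': 'NYG', 'Salary': '7400', 'Depth': 2},
    {'Name': 'Player 2', 'Team': 'NYG', 'Salary': '8400', 'Depth': 1},
]

# Sort by depth first, then by numeric salary in descending order.
by_depth_then_salary = sorted(all_players,
                              key=lambda p: (p['Depth'], -int(p['Salary'])))
print([p['Name'] for p in by_depth_then_salary])
# ['Player 1', 'Player 2', 'Player 3']
```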
python dictionary group by , order by and create a new key based on rank
First make a list of dictionaries sorted by location and time. This will put the groups together and within the groups they will be sorted by time:
l = [
{"name":"Alex","location":"US","time":"2020-05-20 10:36:20"},
{"name":"Bob","location":"India","time":"2017-05-20 12:36:20"},
{"name":"Jon","location":"US","time":"2017-05-20 05:36:20"},
{"name":"Kerry","location":"India","time":"2014-05-20 05:36:20"},
{"name":"Mat","location":"US","time":"2013-01-20 05:36:20"},
{"name":"Sazen","location":"India","time":"2013-01-20 05:36:20"}
]
l_sort = sorted(l, key=lambda d: (d['location'], d['time']))
Now you have a list l_sort that looks like:
[{'name': 'Sazen', 'location': 'India', 'time': '2013-01-20 05:36:20'},
{'name': 'Kerry', 'location': 'India', 'time': '2014-05-20 05:36:20'},
{'name': 'Bob', 'location': 'India', 'time': '2017-05-20 12:36:20'},
{'name': 'Mat', 'location': 'US', 'time': '2013-01-20 05:36:20'},
{'name': 'Jon', 'location': 'US', 'time': '2017-05-20 05:36:20'},
{'name': 'Alex', 'location': 'US', 'time': '2020-05-20 10:36:20'}]
Now that everything is in the correct place you can use itertools.groupby from the standard library to make groups based on location, then for each dict in each group update the dictionary:
from itertools import groupby
# group by location
groups = groupby(l_sort, key=lambda d: d['location'])
# for each location
for k, group in groups:
    # update the dicts with the correct index starting at 1
    for i, d in enumerate(group, 1):
        d['new_name'] = f"{d['name']}_{i}"
This will update the dicts in place, so your original list will now have dicts like:
[{'name': 'Alex','location': 'US','time': '2020-05-20 10:36:20','new_name': 'Alex_3'},
{'name': 'Bob','location': 'India','time': '2017-05-20 12:36:20','new_name': 'Bob_3'},
{'name': 'Jon','location': 'US','time': '2017-05-20 05:36:20','new_name': 'Jon_2'},
{'name': 'Kerry','location': 'India','time': '2014-05-20 05:36:20','new_name': 'Kerry_2'},
{'name': 'Mat','location': 'US','time': '2013-01-20 05:36:20','new_name': 'Mat_1'},
{'name': 'Sazen','location': 'India','time': '2013-01-20 05:36:20','new_name': 'Sazen_1'}]
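If the original dicts should stay untouched, a variant is to build new dicts instead of updating in place. This is a sketch on a shortened sample of the list above; the names people, ordered and ranked are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

people = [  # shortened sample of the list above
    {"name": "Alex", "location": "US", "time": "2020-05-20 10:36:20"},
    {"name": "Bob", "location": "India", "time": "2017-05-20 12:36:20"},
    {"name": "Jon", "location": "US", "time": "2017-05-20 05:36:20"},
]

# Sort by location, then time, so groups are contiguous and ordered by time.
ordered = sorted(people, key=itemgetter("location", "time"))

ranked = []
for _, group in groupby(ordered, key=itemgetter("location")):
    for rank, person in enumerate(group, 1):
        # {**person, ...} creates a new dict, leaving the input unchanged.
        ranked.append({**person, "new_name": f"{person['name']}_{rank}"})

print([p["new_name"] for p in ranked])  # ['Bob_1', 'Jon_1', 'Alex_2']
```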