Join Two Lists of Dictionaries on a Single Key

Join two lists of dictionaries on a single key

from collections import defaultdict

l1 = [{"index":1, "b":2}, {"index":2, "b":3}, {"index":3, "green":"eggs"}]
l2 = [{"index":1, "c":4}, {"index":2, "c":5}]

d = defaultdict(dict)
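for l in (l1, l2):
    for elem in l:
        # merge every dict that shares the same 'index' value
        d[elem['index']].update(elem)
l3 = list(d.values())  # list() is needed on Python 3, where .values() is a view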

# l3 is now:

[{'b': 2, 'c': 4, 'index': 1},
 {'b': 3, 'c': 5, 'index': 2},
 {'green': 'eggs', 'index': 3}]

EDIT: l3 is not guaranteed to be sorted by index (.values() follows insertion order on Python 3.7+, and no particular order on older versions), so you can sort it as @user560833 suggests:

from operator import itemgetter

...

l3 = sorted(d.values(), key=itemgetter("index"))
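
The two steps above can be folded into one helper. This is only a minimal sketch: the name join_on_key and its signature are made up for illustration and are not part of the original answer.

```python
from collections import defaultdict
from operator import itemgetter


def join_on_key(key, *lists):
    """Merge dicts that share the same value for `key`, sorted by that value."""
    merged = defaultdict(dict)
    for lst in lists:
        for elem in lst:
            merged[elem[key]].update(elem)
    return sorted(merged.values(), key=itemgetter(key))


l1 = [{"index": 1, "b": 2}, {"index": 2, "b": 3}, {"index": 3, "green": "eggs"}]
l2 = [{"index": 1, "c": 4}, {"index": 2, "c": 5}]

l3 = join_on_key("index", l1, l2)
print(l3)
```

Because it takes *lists, the same helper should also work for three or more lists, as long as every dict contains the key.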

Left join two lists of dictionaries on a single key

For efficiency, you can start by building a dict with the indices as keys, and the corresponding dicts of l2 as values, so that you don't have to go through l2 each time you look for a matching dict in it.

You can then build a new list of dicts: for each dict in l1, we make a copy of it in order to leave the original unchanged, and update it with the matching dict from l2.

l1 = [{"index":1, "b":2}, {"index":2, "b":3}, {"index":3, "b":"10"}, {"index":4, "c":"7"}]

l2 = [{"index":1, "c":4}, {"index":2, "c":5}, {"index":6, "c":8}, {"index":7, "c":9}]

dict2 = {dct['index']:dct for dct in l2}

out = []
for d1 in l1:
    d = dict(d1)  # shallow copy, so the dicts in l1 stay unchanged
    d.update(dict2.get(d1['index'], {}))  # no match in l2: nothing is added
    out.append(d)

print(out)
# [{'index': 1, 'b': 2, 'c': 4}, {'index': 2, 'b': 3, 'c': 5}, {'index': 3, 'b': '10'}, {'index': 4, 'c': '7'}]
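
If you need this in more than one place, the same idea can be wrapped in a function. A rough sketch, assuming the key is present in every dict; left_join is a hypothetical name chosen for this example:

```python
def left_join(left, right, key):
    """Return copies of the dicts in `left`, each updated with its match from `right`."""
    lookup = {dct[key]: dct for dct in right}  # built once, one pass over `right`
    out = []
    for row in left:
        merged = dict(row)  # shallow copy keeps `left` untouched
        merged.update(lookup.get(row[key], {}))
        out.append(merged)
    return out


l1 = [{"index": 1, "b": 2}, {"index": 3, "b": "10"}]
l2 = [{"index": 1, "c": 4}, {"index": 6, "c": 8}]

out = left_join(l1, l2, "index")
print(out)  # [{'index': 1, 'b': 2, 'c': 4}, {'index': 3, 'b': '10'}]
print(l1)   # [{'index': 1, 'b': 2}, {'index': 3, 'b': '10'}]
```

Rows of the right-hand list with no partner on the left (index 6 here) are dropped, which is what makes it a left join.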

How to merge two lists of dictionaries based on a value

You can keep track of the ids with another dict (or defaultdict to make things simpler). Then update the items in that dict as you iterate. In the end the dict's values will have your list.

from collections import defaultdict
d = defaultdict(dict)

a = [{'id': 1, 'name': 'a'}, {'id': 3, 'name': 'a'}]
b = [{'id': 1, 'city': 'b'}, {'id': 2, 'city': 'c'}, {'id': 3, 'city': 'd'}]

for item in a + b:
    d[item['id']].update(item)
list(d.values())

# [{'id': 1, 'name': 'a', 'city': 'b'},
# {'id': 3, 'name': 'a', 'city': 'd'},
# {'id': 2, 'city': 'c'}]

Note that this overwrites the values of any duplicate keys other than id: if two items share id 1 but have different cities, you will only get the last city.
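
If losing the earlier value is a problem, one possible workaround is to collect the values into lists instead of overwriting them. This is a sketch under that assumption, not part of the original answer; the sample data and the names plain, collected and result are invented for the example.

```python
from collections import defaultdict

a = [{'id': 1, 'name': 'a'}]
b = [{'id': 1, 'city': 'b'}, {'id': 1, 'city': 'c'}]

# Plain update: the second city silently replaces the first one.
plain = defaultdict(dict)
for item in a + b:
    plain[item['id']].update(item)
print(dict(plain))  # {1: {'id': 1, 'name': 'a', 'city': 'c'}}

# Collecting variant: every non-id field becomes a list of all values seen.
collected = defaultdict(lambda: defaultdict(list))
for item in a + b:
    for k, v in item.items():
        if k != 'id':
            collected[item['id']][k].append(v)

result = [{'id': i, **fields} for i, fields in collected.items()]
print(result)  # [{'id': 1, 'name': ['a'], 'city': ['b', 'c']}]
```

The price is that every field is now a list, even when there is only one value, so pick this only if duplicates are actually expected.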

Merge two lists of dicts of different lengths using a single key in Python

Problem

The problem with your approach is that the data is grouped in a dict keyed by 'pcd_sector', but l2 contains several dicts with the same 'pcd_sector', so they overwrite one another. You could key l2 on the tuple ('pcd_sector', 'asset'), but that would no longer work for l1. So you need to do the processing in two steps instead of iterating over l1 + l2 directly.
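
To make the problem concrete, here is a small reproduction. It assumes the original attempt grouped l1 + l2 in a defaultdict keyed by 'pcd_sector' (the question's code is not shown here, so that is a guess), and it uses a shortened version of the data:

```python
from collections import defaultdict

l1 = [{'pcd_sector': 'ABDC', 'coverage_2014': '100'}]
l2 = [{'pcd_sector': 'ABDC', 'asset': '3G', 'asset_id': '2gs'},
      {'pcd_sector': 'ABDC', 'asset': '4G', 'asset_id': '7jd'}]

grouped = defaultdict(dict)
for item in l1 + l2:
    grouped[item['pcd_sector']].update(item)

collapsed = list(grouped.values())
print(collapsed)
# [{'pcd_sector': 'ABDC', 'coverage_2014': '100', 'asset': '4G', 'asset_id': '7jd'}]
```

Both assets end up in a single dict and the 3G entry is overwritten by the 4G one.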

Theory

If pcd_sector keys are unique in l1, you can create a big dict instead of a list of small dicts:

>>> d1 = {d['pcd_sector']:d for d in l1}
>>> d1
{'ABDC': {'pcd_sector': 'ABDC', 'coverage_2014': '100'}, 'DEFG': {'pcd_sector': 'DEFG', 'coverage_2014': '0'}}

Then, you simply need to merge the dicts that have the same pcd_sector keys:

>>> [dict(d, **d1.get(d['pcd_sector'], {})) for d in l2]
[{'asset_id': '2gs', 'coverage_2014': '100', 'pcd_sector': 'ABDC', 'asset': '3G'}, {'asset_id': '7jd', 'coverage_2014': '100', 'pcd_sector': 'ABDC', 'asset': '4G'}, {'asset_id': '3je', 'coverage_2014': '0', 'pcd_sector': 'DEFG', 'asset': '3G'}, {'asset_id': '8js', 'coverage_2014': '0', 'pcd_sector': 'DEFG', 'asset': '4G'}, {'asset_id': '4jd', 'pcd_sector': 'CDEF', 'asset': '3G'}]

Complete code

Putting it all together, the code becomes:

l1 = [{'pcd_sector': 'ABDC', 'coverage_2014': '100'},
      {'pcd_sector': 'DEFG', 'coverage_2014': '0'}]

l2 = [{'pcd_sector': 'ABDC', 'asset': '3G', 'asset_id': '2gs'},
      {'pcd_sector': 'ABDC', 'asset': '4G', 'asset_id': '7jd'},
      {'pcd_sector': 'DEFG', 'asset': '3G', 'asset_id': '3je'},
      {'pcd_sector': 'DEFG', 'asset': '4G', 'asset_id': '8js'},
      {'pcd_sector': 'CDEF', 'asset': '3G', 'asset_id': '4jd'}]

d1 = {d['pcd_sector']:d for d in l1}
result = [dict(d, **d1.get(d['pcd_sector'], {})) for d in l2]

import pprint
pprint.pprint(result)
# [{'asset': '3G',
#   'asset_id': '2gs',
#   'coverage_2014': '100',
#   'pcd_sector': 'ABDC'},
#  {'asset': '4G',
#   'asset_id': '7jd',
#   'coverage_2014': '100',
#   'pcd_sector': 'ABDC'},
#  {'asset': '3G',
#   'asset_id': '3je',
#   'coverage_2014': '0',
#   'pcd_sector': 'DEFG'},
#  {'asset': '4G',
#   'asset_id': '8js',
#   'coverage_2014': '0',
#   'pcd_sector': 'DEFG'},
#  {'asset': '3G', 'asset_id': '4jd', 'pcd_sector': 'CDEF'}]

Merge dictionaries with same key from two lists of dicts in python

Here is one approach:

a = {
    "name": "harry",
    "properties": [
        {
            "id": "N3",
            "status": "OPEN",
            "type": "energetic"
        },
        {
            "id": "N5",
            "status": "OPEN",
            "type": "hot"
        }
    ]
}
b = {
    "name": "harry",
    "properties": [
        {
            "id": "N3",
            "type": "energetic",
            "language": "english"
        },
        {
            "id": "N6",
            "status": "OPEN",
            "type": "cool"
        }
    ]
}

# Map each id to its index in the corresponding 'properties' list
a_ids = {item['id']: index for index, item in enumerate(a['properties'])}  # {'N3': 0, 'N5': 1}
b_ids = {item['id']: index for index, item in enumerate(b['properties'])}  # {'N3': 0, 'N6': 1}

# Loop through the ids found in a
for item_id in a_ids:
    if item_id in b_ids:
        # Same id exists in b: update b's dict in place with a's key/value pairs
        b['properties'][b_ids[item_id]].update(a['properties'][a_ids[item_id]])
    else:
        # id is missing from b: append a's dict (the same object, not a copy)
        b['properties'].append(a['properties'][a_ids[item_id]])

print(b)

Output:

{'name': 'harry', 'properties': [{'id': 'N3', 'type': 'energetic', 'language': 'english', 'status': 'OPEN'}, {'id': 'N6', 'status': 'OPEN', 'type': 'cool'}, {'id': 'N5', 'status': 'OPEN', 'type': 'hot'}]}
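
Note that the loop above modifies b in place. If you would rather keep both inputs untouched, a variant along these lines might work. It is a sketch only: merge_properties is a hypothetical helper, and it assumes the property dicts are flat, so a shallow copy is enough.

```python
def merge_properties(a, b, key="id"):
    """Return a new dict shaped like `b`, with `a`'s properties merged in by `key`."""
    # Shallow copies of b's items, indexed by id (insertion order is kept).
    by_id = {item[key]: dict(item) for item in b["properties"]}
    for item in a["properties"]:
        # Existing id: update the copy. New id: start from an empty dict.
        by_id.setdefault(item[key], {}).update(item)
    merged = dict(b)
    merged["properties"] = list(by_id.values())
    return merged


a = {"name": "harry", "properties": [
    {"id": "N3", "status": "OPEN", "type": "energetic"},
    {"id": "N5", "status": "OPEN", "type": "hot"}]}
b = {"name": "harry", "properties": [
    {"id": "N3", "type": "energetic", "language": "english"},
    {"id": "N6", "status": "OPEN", "type": "cool"}]}

merged = merge_properties(a, b)
print(merged)
print(b)  # still has only the two original properties
```

As in the in-place version, values from a win when both sides define the same field.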

Merge two (or more) lists of dictionaries pairing using a specific key

I'm not sure if this is more efficient than your solution:

from operator import itemgetter
from itertools import chain, groupby

a = [{'idx': 1, 'foo': 'xx1', 'bar': 'yy1'},
     {'idx': 0, 'foo': 'xx0', 'bar': 'yy0'},
     {'idx': 2, 'foo': 'xx2', 'bar': 'yy2'}]
b = [{'idx': 0, 'fie': 'zz0', 'fom': 'kk0'},
     {'idx': 3, 'fie': 'zz3', 'fom': 'kk3'},
     {'idx': 1, 'fie': 'zz1', 'fom': 'kk1'}]

c = sorted(a + b, key=itemgetter('idx'))
c = [
    dict(chain(*(record.items() for record in group)))
    for _, group in groupby(c, key=itemgetter('idx'))
]

Result:

[{'idx': 0, 'foo': 'xx0', 'bar': 'yy0', 'fie': 'zz0', 'fom': 'kk0'},
 {'idx': 1, 'foo': 'xx1', 'bar': 'yy1', 'fie': 'zz1', 'fom': 'kk1'},
 {'idx': 2, 'foo': 'xx2', 'bar': 'yy2'},
 {'idx': 3, 'fie': 'zz3', 'fom': 'kk3'}]
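
One thing worth knowing about this approach: the sort is not optional, because groupby only groups consecutive items with equal keys. The sketch below shows the difference on a trimmed-down data set; merge_groups is a name made up for this example.

```python
from itertools import chain, groupby
from operator import itemgetter

a = [{'idx': 1, 'foo': 'xx1'}, {'idx': 0, 'foo': 'xx0'}]
b = [{'idx': 0, 'fie': 'zz0'}, {'idx': 1, 'fie': 'zz1'}]


def merge_groups(records):
    """Merge each run of consecutive records that share the same 'idx'."""
    return [dict(chain.from_iterable(r.items() for r in group))
            for _, group in groupby(records, key=itemgetter('idx'))]


# Without sorting, idx 1 appears in two separate runs and is never merged.
unsorted_result = merge_groups(a + b)
print(unsorted_result)
# [{'idx': 1, 'foo': 'xx1'}, {'idx': 0, 'foo': 'xx0', 'fie': 'zz0'}, {'idx': 1, 'fie': 'zz1'}]

sorted_result = merge_groups(sorted(a + b, key=itemgetter('idx')))
print(sorted_result)
# [{'idx': 0, 'foo': 'xx0', 'fie': 'zz0'}, {'idx': 1, 'foo': 'xx1', 'fie': 'zz1'}]
```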

Merge two lists into one dict that may contain several values for each key

Python has no built-in multidict, so you have two options:

  1. Use a multidict from an existing library, e.g. werkzeug's MultiDict. When initialised with a list of pairs, it associates multiple values with keys that appear more than once (unlike dict, which only keeps the last value):

    system_data_dict = werkzeug.datastructures.MultiDict(zip(system, instrument))
    system_data_dict.getlist('System A') # => ['Instrument 1', 'Instrument 2']
  2. Alternatively, do it by hand with a regular loop: use a defaultdict or the dict.setdefault method to build a dict mapping keys to lists, and append a value every time. Even if this is feasible with a comprehension (which I'm not sure about), it's going to look dreadful.

    system_data_dict = {}
    for k, v in zip(system, instrument):
        system_data_dict.setdefault(k, []).append(v)
    system_data_dict['System A'] # => ['Instrument 1', 'Instrument 2']
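
For completeness, the defaultdict variant mentioned in option 2 could look like this. The contents of system and instrument are made-up sample data, since the original lists are not shown here.

```python
from collections import defaultdict

# Hypothetical sample data standing in for the question's two lists.
system = ['System A', 'System A', 'System B']
instrument = ['Instrument 1', 'Instrument 2', 'Instrument 3']

system_data_dict = defaultdict(list)
for k, v in zip(system, instrument):
    system_data_dict[k].append(v)  # a missing key starts as an empty list

print(system_data_dict['System A'])  # ['Instrument 1', 'Instrument 2']
print(dict(system_data_dict))
# {'System A': ['Instrument 1', 'Instrument 2'], 'System B': ['Instrument 3']}
```

Keep in mind that merely looking up a missing key on a defaultdict inserts it, so convert to a plain dict with dict(system_data_dict) if that matters.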

