How to Uniqify a List of Dict in Python

How to uniqify a list of dict in Python

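If every value in your dicts is hashable, this will work (the order of the result is arbitrary, because it passes through a set):

>>> d = [{'x':1, 'y':2}, {'x':3, 'y':4}, {'x':1, 'y':2}]
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]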
[{'y': 4, 'x': 3}, {'y': 2, 'x': 1}]

EDIT:

I tried it with no duplicates and it seemed to work fine:

>>> d = [{'x':1, 'y':2}, {'x':3, 'y':4}]
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 4, 'x': 3}, {'y': 2, 'x': 1}]

and

>>> d = [{'x':1,'y':2}]
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 2, 'x': 1}]
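
One caveat with tuple(x.items()): on Python 3.7+ dicts keep insertion order, so two equal dicts built in a different key order give different tuples and both survive. A minimal sketch that avoids this by using a frozenset of the items, and that also keeps the first occurrence of each dict in its original position, could look like the following. The name unique_flat_dicts is made up for illustration, and it still assumes every value is hashable.

```python
def unique_flat_dicts(dicts):
    """Return the distinct dicts, keeping first occurrences in input order.

    Assumes every value is hashable; a nested dict or list raises TypeError.
    (Hypothetical helper, not part of the original answer.)
    """
    seen = set()
    result = []
    for item in dicts:
        key = frozenset(item.items())  # order-insensitive fingerprint
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


# The second dict equals the first but was built in a different key order.
d = [{'x': 1, 'y': 2}, {'y': 2, 'x': 1}, {'x': 3, 'y': 4}]
print(unique_flat_dicts(d))  # [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}]
```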

List of unique dictionaries

Make a temporary dict keyed by the id. Because dict keys are unique, this filters out the duplicates (the last dict seen for each id wins).
The values() of that dict are then the deduplicated list.

In Python 2.7:

>>> L=[
... {'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30},
... ]
>>> {v['id']:v for v in L}.values()
[{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]

In Python 3:

>>> L=[
... {'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30},
... ]
>>> list({v['id']:v for v in L}.values())
[{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]

In Python 2.5/2.6:

>>> L=[
... {'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30},
... ]
>>> dict((v['id'],v) for v in L).values()
[{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]
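
Note that the dict-comprehension approach keeps the last dict for each id. If two entries share an id but differ elsewhere, that may not be what you want. Here is a rough sketch (Python 3.7+) of a first-wins variant using setdefault; unique_by_key is a hypothetical helper name, and the second record's age is changed on purpose to make the difference visible.

```python
def unique_by_key(records, key):
    """Keep the first record seen for each distinct records[i][key].

    Hypothetical helper for illustration; relies on dicts preserving
    insertion order (Python 3.7+).
    """
    first_seen = {}
    for record in records:
        # setdefault only stores the record if the key is not present yet
        first_seen.setdefault(record[key], record)
    return list(first_seen.values())


L = [
    {'id': 1, 'name': 'john', 'age': 34},
    {'id': 1, 'name': 'john', 'age': 35},   # same id, different age
    {'id': 2, 'name': 'hanna', 'age': 30},
]

last_wins = list({v['id']: v for v in L}.values())
first_wins = unique_by_key(L, 'id')

print(last_wins[0]['age'])   # 35
print(first_wins[0]['age'])  # 34
```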

getting unique values from a list of dict

import ast

l = [
{'x':'1','y':'1'},{'x':'2','y':'2'},{'x':'1','y':'1'}
]

[ast.literal_eval(el1) for el1 in set([str(el2) for el2 in l])]

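Usually an easy solution for keeping unique elements is to add them to a set. However, since a dict is unhashable (it can't be put in a set), I provided a workaround. First, the dicts are converted to strings, then placed in a set (to keep the unique ones), and finally converted back to dicts using ast.literal_eval. This only works when every key and value is a Python literal (strings, numbers, tuples, lists, dicts, sets, booleans, None), because ast.literal_eval accepts nothing else.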
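
A variation on the same idea, sketched here as an alternative rather than as the method above: serialize with json.dumps(..., sort_keys=True) so that equal dicts with a different key order still produce the same string, and keep the original dict objects instead of parsing the strings back. This assumes all keys are strings and all values are JSON-serializable; unique_via_json is a made-up name.

```python
import json


def unique_via_json(dicts):
    """Deduplicate dicts using a canonical JSON string as the fingerprint.

    Assumes string keys and JSON-serializable values (hypothetical helper).
    """
    seen = set()
    result = []
    for item in dicts:
        key = json.dumps(item, sort_keys=True)  # key order no longer matters
        if key not in seen:
            seen.add(key)
            result.append(item)  # keep the original object, no re-parsing
    return result


l = [{'x': '1', 'y': '1'}, {'x': '2', 'y': '2'}, {'y': '1', 'x': '1'}]
print(unique_via_json(l))  # [{'x': '1', 'y': '1'}, {'x': '2', 'y': '2'}]
```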

uniqify a list of dictionaries

Use Mac Address, group-addr, vlan and ver as the key to group common elements. Ideally you would do this when you first build the dict, but here is an example using the data from your question:

foo = """Port  Mac Address       group-addr      vlan    ver
s2p2 0100.5e00.0004 239.0.0.4 1 1
s2p0 0100.5e00.0005 239.0.0.8 1 1
s2p1 0100.5e00.0004 239.0.0.4 1 1"""

from collections import defaultdict
d = defaultdict(set)
lines = foo.splitlines()

for line in lines[1:]:
    prt, mc, gp, vl, vr = line.split()
    d[(mc, gp, vl, vr)].add(prt)
print(d)
defaultdict(<type 'set'>, {('0100.5e00.0004', '239.0.0.4', '1', '1'): set(['s2p2', 's2p1']), ('0100.5e00.0005', '239.0.0.8', '1', '1'): set(['s2p0'])})


print("%s %10s %14s %15s" % ("Vlan", "Group", "Version", "Port List"))
print("---------------------------------------------------------")
for mc, gp, vl, vr in d:
    # build the whole line first, then print it (works on Python 2 and 3)
    print("{:<10} {:<14} {:<15}".format(vl, gp, vr) + ",".join(d[mc, gp, vl, vr]))

Vlan Group Version Port List
---------------------------------------------------------
1 239.0.0.4 1 s2p2,s2p1
1 239.0.0.8 1 s2p0
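
If you want the grouped result back as a list of dicts rather than printed text, something along these lines might do. It is only a sketch: group_ports and the output key names ('mac', 'group', 'vlan', 'ver', 'ports') are invented for illustration, and it assumes whitespace-separated columns with a single header line, as in the sample above. A list is used instead of a set so the ports keep their input order.

```python
from collections import defaultdict


def group_ports(table_text):
    """Group port names by (mac, group, vlan, version).

    Hypothetical helper; assumes one header line followed by rows of
    five whitespace-separated fields.
    """
    groups = defaultdict(list)
    for line in table_text.splitlines()[1:]:  # skip the header line
        port, mac, group, vlan, ver = line.split()
        groups[(mac, group, vlan, ver)].append(port)
    return [
        {'mac': mac, 'group': group, 'vlan': vlan, 'ver': ver, 'ports': ports}
        for (mac, group, vlan, ver), ports in groups.items()
    ]


sample = """Port  Mac Address       group-addr      vlan    ver
s2p2 0100.5e00.0004 239.0.0.4 1 1
s2p0 0100.5e00.0005 239.0.0.8 1 1
s2p1 0100.5e00.0004 239.0.0.4 1 1"""

rows = group_ports(sample)
for row in rows:
    print(row['vlan'], row['group'], row['ver'], ",".join(row['ports']))
```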

How to make a list of nested dictionaries unique in Python

You'd need to track if you have seen a dictionary already. Unfortunately, dictionaries are not hashable, and do not track order, so you need to convert dictionaries to something that is hashable. A frozenset() of the key-value pairs (as tuples) would do, but then you need to flatten recursively:

def set_from_dict(d):
    return frozenset(
        (k, set_from_dict(v) if isinstance(v, dict) else v)
        for k, v in d.iteritems())  # Python 2; use d.items() on Python 3

These frozenset() objects represent the dictionary values enough to track unique items:

seen = set()
result = []
for d in inputlist:
    representation = set_from_dict(d)
    if representation in seen:
        continue
    result.append(d)
    seen.add(representation)

This preserves the original order of your input list, minus duplicates. If you are using Python 2.7 and up, an OrderedDict would have been helpful here, but you are using Python 2.6, so we need to do it slightly more verbosely.

The above approach takes O(N) time, one step per input dictionary, as testing against a set takes only O(1) constant time.

Demo:

>>> inputlist = [{'permission': 'full',
...               'permission_type': 'allow',
...               'trustee': {'id': 'SID:S-1-5-32-545',
...                           'name': 'Users',
...                           'type': 'group'}},
...              {'permission': 'full',
...               'permission_type': 'allow',
...               'trustee': {'id': 'SID:S-1-5-32-545',
...                           'name': 'Users',
...                           'type': 'group'}},
...              {'permission': 'full',
...               'permission_type': 'allow',
...               'trustee': {'id': 'SID:S-1-5-32-544',
...                           'name': 'Administrators',
...                           'type': 'group'}}]
>>> def set_from_dict(d):
...     return frozenset(
...         (k, set_from_dict(v) if isinstance(v, dict) else v)
...         for k, v in d.iteritems())
...
>>> seen = set()
>>> result = []
>>> for d in inputlist:
...     representation = set_from_dict(d)
...     if representation in seen:
...         continue
...     result.append(d)
...     seen.add(representation)
...
>>> from pprint import pprint
>>> pprint(result)
[{'permission': 'full',
  'permission_type': 'allow',
  'trustee': {'id': 'SID:S-1-5-32-545', 'name': 'Users', 'type': 'group'}},
 {'permission': 'full',
  'permission_type': 'allow',
  'trustee': {'id': 'SID:S-1-5-32-544',
              'name': 'Administrators',
              'type': 'group'}}]
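
For Python 3, a rough port of the same idea might look like this. It swaps iteritems() for items() and, as an extension that is not in the answer above, also converts nested lists to tuples so they become hashable. The names freeze and unique_nested are made up here. One known limitation of this sketch: a list and a tuple with the same elements are treated as equal.

```python
def freeze(value):
    """Recursively turn dicts and lists into hashable stand-ins.

    Hypothetical helper: dicts become frozensets of (key, value) pairs,
    lists and tuples become tuples, everything else is returned as is.
    """
    if isinstance(value, dict):
        return frozenset((k, freeze(v)) for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return tuple(freeze(v) for v in value)
    return value


def unique_nested(dicts):
    """Drop duplicate (possibly nested) dicts, preserving input order."""
    seen = set()
    result = []
    for d in dicts:
        representation = freeze(d)
        if representation in seen:
            continue
        result.append(d)
        seen.add(representation)
    return result


users = {'id': 'SID:S-1-5-32-545', 'name': 'Users', 'type': 'group'}
admins = {'id': 'SID:S-1-5-32-544', 'name': 'Administrators', 'type': 'group'}
inputlist = [
    {'permission': 'full', 'permission_type': 'allow', 'trustee': dict(users)},
    {'permission': 'full', 'permission_type': 'allow', 'trustee': dict(users)},
    {'permission': 'full', 'permission_type': 'allow', 'trustee': dict(admins)},
]
print(len(unique_nested(inputlist)))  # 2
```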

Group and aggregate a list of dictionaries by multiple keys

Using pure Python, you can insert into an OrderedDict to retain insertion order:

from collections import OrderedDict

d = OrderedDict()
for l in lst:
    d.setdefault((l['number'], l['favorite']), set()).add(l['color'])

[{'number': k[0], 'favorite': k[1], 'color': v.pop() if len(v) == 1 else v}
 for k, v in d.items()]
# [{'color': {'green', 'red'}, 'favorite': False, 'number': 1},
# {'color': 'red', 'favorite': True, 'number': 1},
# {'color': 'red', 'favorite': False, 'number': 2}]
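
The snippet above uses lst without defining it. As a self-contained sketch, here is the same grouping wrapped in a function, with sample data that is only an assumption (chosen to be consistent with the output shown, not taken from the question). It uses next(iter(colors)) instead of pop() so the grouped sets are not mutated; group_colors is a made-up name.

```python
from collections import OrderedDict

# Assumed sample input, picked to reproduce the output shown above.
lst = [
    {'number': 1, 'favorite': False, 'color': 'red'},
    {'number': 1, 'favorite': False, 'color': 'green'},
    {'number': 1, 'favorite': True, 'color': 'red'},
    {'number': 2, 'favorite': False, 'color': 'red'},
]


def group_colors(rows):
    """Group rows by (number, favorite) and collect their colors.

    Hypothetical helper: a group with one color yields a plain string,
    a group with several colors yields a set.
    """
    groups = OrderedDict()
    for row in rows:
        groups.setdefault((row['number'], row['favorite']), set()).add(row['color'])
    return [
        {'number': number, 'favorite': favorite,
         'color': next(iter(colors)) if len(colors) == 1 else colors}
        for (number, favorite), colors in groups.items()
    ]


grouped = group_colors(lst)
```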

This can also be done quite easily using the pandas GroupBy API:

import pandas as pd

d = (pd.DataFrame(lst)
       .groupby(['number', 'favorite'])
       .color
       .agg(set)
       .reset_index()
       .to_dict('records'))  # the short alias 'r' is not accepted by pandas 2.x
d
# [{'color': {'green', 'red'}, 'favorite': False, 'number': 1},
# {'color': {'red'}, 'favorite': True, 'number': 1},
# {'color': {'red'}, 'favorite': False, 'number': 2}]

If a single-element set should be collapsed to a plain string, you can use:

[{'color': (lambda v: v.pop() if len(v) == 1 else v)(d_.pop('color')), **d_}
 for d_ in d]
# [{'color': {'green', 'red'}, 'favorite': False, 'number': 1},
# {'color': 'red', 'favorite': True, 'number': 1},
# {'color': 'red', 'favorite': False, 'number': 2}]

Fastest way to uniqify a list in Python

set([a, b, c, a])

Leave the result as a set if possible.
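
A set does not remember the order of the input. If you need a list back with the first occurrences in their original order, one common idiom on Python 3.7+ (where dicts preserve insertion order) is dict.fromkeys. A small sketch, with string items standing in for a, b, c:

```python
items = ['a', 'b', 'c', 'a']

# dict keys are unique and keep insertion order on Python 3.7+
unique_in_order = list(dict.fromkeys(items))

print(unique_in_order)  # ['a', 'b', 'c']
```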


