how to uniqify a list of dict in python
If all the values in your dicts are hashable, this will work:
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 4, 'x': 3}, {'y': 2, 'x': 1}]
EDIT:
I tried it with no duplicates and it seemed to work fine
>>> d = [{'x':1, 'y':2}, {'x':3, 'y':4}]
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 4, 'x': 3}, {'y': 2, 'x': 1}]
and
>>> d = [{'x':1,'y':2}]
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 2, 'x': 1}]
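One caveat: tuple(x.items()) depends on key order, so two dicts with equal content but different insertion order produce different tuples, and the set does not keep the order of the list. A minimal sketch that avoids both issues, assuming all values are hashable and Python 3.7+ for the printed key order (unique_dicts is only an illustrative name):

```python
def unique_dicts(dicts):
    """Return the dicts without duplicates, keeping first occurrences in order."""
    seen = set()
    result = []
    for d in dicts:
        key = frozenset(d.items())  # order-insensitive; requires hashable values
        if key not in seen:
            seen.add(key)
            result.append(d)
    return result


d = [{'x': 1, 'y': 2}, {'y': 2, 'x': 1}, {'x': 3, 'y': 4}]
print(unique_dicts(d))  # [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}]
```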
List of unique dictionaries
So make a temporary dict with the key being the id. This filters out the duplicates. The values() of the dict will be the list.
In Python 2.7
>>> L=[
... {'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30},
... ]
>>> {v['id']:v for v in L}.values()
[{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]
In Python 3
>>> L=[
... {'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30},
... ]
>>> list({v['id']:v for v in L}.values())
[{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]
In Python 2.5/2.6
>>> L=[
... {'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30},
... ]
>>> dict((v['id'],v) for v in L).values()
[{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]
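Note that with this approach a later entry overwrites an earlier one that has the same id. If the first occurrence should win instead, here is a small sketch, assuming Python 3.7+ (where plain dicts keep insertion order). The helper name unique_by is hypothetical, and the sample data is altered slightly so the difference is visible:

```python
def unique_by(items, key):
    """Keep the first dict seen for each value of `key`."""
    first_seen = {}
    for item in items:
        # setdefault only stores the item if this key value is new
        first_seen.setdefault(item[key], item)
    return list(first_seen.values())


L = [
    {'id': 1, 'name': 'john', 'age': 34},
    {'id': 1, 'name': 'john', 'age': 35},  # same id, different age
    {'id': 2, 'name': 'hanna', 'age': 30},
]
print(unique_by(L, 'id'))
# [{'id': 1, 'name': 'john', 'age': 34}, {'id': 2, 'name': 'hanna', 'age': 30}]
```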
getting unique values from a list of dict
import ast
l = [
{'x':'1','y':'1'},{'x':'2','y':'2'},{'x':'1','y':'1'}
]
[ast.literal_eval(el1) for el1 in set([str(el2) for el2 in l])]
Usually an easy solution for keeping unique elements is to add them to a set. However, since a dict is unhashable (can't be put in a set), I provided a workaround. First, the dicts are converted to strings, placed in a set (to keep the unique ones), and then converted back to dicts using ast.literal_eval.
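The str() round trip treats dicts with the same content but a different key order as distinct, because their string forms differ. A possible variant, assuming the dicts hold only JSON-serializable values with string keys, serializes with json.dumps(..., sort_keys=True) instead:

```python
import json

l = [
    {'x': '1', 'y': '1'}, {'x': '2', 'y': '2'}, {'y': '1', 'x': '1'}
]

# sort_keys=True gives equal dicts the same string regardless of key order
unique_json = {json.dumps(el, sort_keys=True) for el in l}
unique = [json.loads(s) for s in unique_json]

# A set does not keep order, so sort for a stable display
print(sorted(unique, key=lambda el: el['x']))
# [{'x': '1', 'y': '1'}, {'x': '2', 'y': '2'}]
```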
uniqify a list of dictionaries
Use Mac Address, group-addr, vlan and ver as the key to group common elements. Ideally you would do this when you first create the dict, but here is an example using the data from your question:
foo = """Port Mac Address group-addr vlan ver
s2p2 0100.5e00.0004 239.0.0.4 1 1
s2p0 0100.5e00.0005 239.0.0.8 1 1
s2p1 0100.5e00.0004 239.0.0.4 1 1"""
from collections import defaultdict
d = defaultdict(set)
lines = foo.splitlines()
for line in lines[1:]:  # skip the header row
    prt,mc,gp,vl,vr = line.split()
    d[(mc,gp,vl,vr)].add(prt)
print(d)
defaultdict(<type 'set'>, {('0100.5e00.0004', '239.0.0.4', '1', '1'): set(['s2p2', 's2p1']), ('0100.5e00.0005', '239.0.0.8', '1', '1'): set(['s2p0'])})
print("%s %10s %14s %15s" % ("Vlan", "Group", "Version", "Port List"))
print("---------------------------------------------------------")
# each key is a (mac, group, vlan, version) tuple; join the ports collected for it
for mc, gp, vl, vr in d:
    print("{:<10} {:<14} {:<15}".format(vl, gp, vr) + ",".join(d[mc, gp, vl, vr]))
Vlan Group Version Port List
---------------------------------------------------------
1 239.0.0.4 1 s2p2,s2p1
1 239.0.0.8 1 s2p0
How make unique a list of nested dictionaries in python
You'd need to track whether you have seen a dictionary already. Unfortunately, dictionaries are not hashable, and do not track order, so you need to convert dictionaries to something that is hashable. A frozenset() of the key-value pairs (as tuples) would do, but then you need to flatten recursively:
def set_from_dict(d):
    return frozenset(
        (k, set_from_dict(v) if isinstance(v, dict) else v)
        for k, v in d.iteritems())
These frozenset() objects represent the dictionary values well enough to track unique items:
seen = set()
result = []
for d in inputlist:
    representation = set_from_dict(d)
    if representation in seen:
        continue
    result.append(d)
    seen.add(representation)
This preserves the original order of your input list, minus duplicates. If you are using Python 2.7 and up, an OrderedDict would have been helpful here, but you are using Python 2.6, so we need to do it slightly more verbosely.
The above approach takes O(N) time, one step per input dictionary, as testing against a set takes only O(1) constant time.
Demo:
>>> inputlist = [{'permission': 'full',
... 'permission_type': 'allow',
... 'trustee': {'id': 'SID:S-1-5-32-545',
... 'name': 'Users',
... 'type': 'group'}},
... {'permission': 'full',
... 'permission_type': 'allow',
... 'trustee': {'id': 'SID:S-1-5-32-545',
... 'name': 'Users',
... 'type': 'group'}},
... {'permission': 'full',
... 'permission_type': 'allow',
... 'trustee': {'id': 'SID:S-1-5-32-544',
... 'name': 'Administrators',
... 'type': 'group'}}]
>>> def set_from_dict(d):
...     return frozenset(
...         (k, set_from_dict(v) if isinstance(v, dict) else v)
...         for k, v in d.iteritems())
...
>>> seen = set()
>>> result = []
>>> for d in inputlist:
...     representation = set_from_dict(d)
...     if representation in seen:
...         continue
...     result.append(d)
...     seen.add(representation)
...
>>> from pprint import pprint
>>> pprint(result)
[{'permission': 'full',
'permission_type': 'allow',
'trustee': {'id': 'SID:S-1-5-32-545', 'name': 'Users', 'type': 'group'}},
{'permission': 'full',
'permission_type': 'allow',
'trustee': {'id': 'SID:S-1-5-32-544',
'name': 'Administrators',
'type': 'group'}}]
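On Python 3, iteritems() no longer exists and items() is used instead. Here is a minimal sketch of the same idea as a reusable function. It assumes nested values are either dicts or hashable, so lists inside the dicts would still need separate handling. The name unique_nested and the shortened sample data are illustrative only:

```python
def set_from_dict(d):
    """Build a hashable, order-insensitive representation of a nested dict."""
    return frozenset(
        (k, set_from_dict(v) if isinstance(v, dict) else v)
        for k, v in d.items())


def unique_nested(dicts):
    """Drop duplicate dicts, preserving the order of first occurrences."""
    seen = set()
    result = []
    for d in dicts:
        representation = set_from_dict(d)
        if representation not in seen:
            seen.add(representation)
            result.append(d)
    return result


data = [
    {'permission': 'full', 'trustee': {'name': 'Users', 'type': 'group'}},
    {'trustee': {'type': 'group', 'name': 'Users'}, 'permission': 'full'},
    {'permission': 'full', 'trustee': {'name': 'Administrators', 'type': 'group'}},
]
print(len(unique_nested(data)))  # 2
```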
Group and aggregate a list of dictionaries by multiple keys
Using pure Python, you can insert into an OrderedDict to retain insertion order:
from collections import OrderedDict
d = OrderedDict()
for l in lst:
    d.setdefault((l['number'], l['favorite']), set()).add(l['color'])
[{'number': k[0], 'favorite': k[1], 'color': v.pop() if len(v) == 1 else v}
 for k, v in d.items()]
# [{'color': {'green', 'red'}, 'favorite': False, 'number': 1},
# {'color': 'red', 'favorite': True, 'number': 1},
# {'color': 'red', 'favorite': False, 'number': 2}]
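The snippet above assumes an existing list lst. For a self-contained run, here is a sketch with a hypothetical input chosen to be consistent with the output shown. On Python 3.7+ a plain dict also preserves insertion order, so OrderedDict is optional there:

```python
# Hypothetical input, chosen to be consistent with the output shown above
lst = [
    {'number': 1, 'favorite': False, 'color': 'red'},
    {'number': 1, 'favorite': False, 'color': 'green'},
    {'number': 1, 'favorite': True, 'color': 'red'},
    {'number': 2, 'favorite': False, 'color': 'red'},
]

groups = {}  # plain dicts keep insertion order on Python 3.7+
for row in lst:
    groups.setdefault((row['number'], row['favorite']), set()).add(row['color'])

result = [
    # unpack a single-element set into its only member, otherwise keep the set
    {'number': number, 'favorite': favorite,
     'color': next(iter(colors)) if len(colors) == 1 else colors}
    for (number, favorite), colors in groups.items()
]
print(result)
```

Using next(iter(colors)) instead of pop() leaves the grouped sets unmodified, which matters if groups is reused afterwards.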
This can also be done quite easily using the pandas GroupBy API:
import pandas as pd
d = (pd.DataFrame(lst)
       .groupby(['number', 'favorite'])
       .color
       .agg(set)
       .reset_index()
       .to_dict('records'))  # the short form 'r' is not accepted by pandas 2.0+
d
# [{'color': {'green', 'red'}, 'favorite': False, 'number': 1},
# {'color': {'red'}, 'favorite': True, 'number': 1},
# {'color': {'red'}, 'favorite': False, 'number': 2}]
If a single-element set should be collapsed to a plain string, you can use
[{'color': (lambda v: v.pop() if len(v) == 1 else v)(d_.pop('color')), **d_}
 for d_ in d]
# [{'color': {'green', 'red'}, 'favorite': False, 'number': 1},
# {'color': 'red', 'favorite': True, 'number': 1},
# {'color': 'red', 'favorite': False, 'number': 2}]
Fastest way to uniqify a list in Python
set([a, b, c, a])
Leave it in that form if possible.
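A set does not keep the original order. If order matters, one common alternative on Python 3.7+ (where dicts preserve insertion order) is dict.fromkeys. A minimal sketch with made-up values:

```python
items = ['a', 'b', 'c', 'a']

# dict keys are unique and keep first-seen order on Python 3.7+
unique_in_order = list(dict.fromkeys(items))
print(unique_in_order)  # ['a', 'b', 'c']
```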