List of Unique Dictionaries

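Build a temporary dict keyed by the id. Because dict keys are unique, this filters out the duplicates; when two entries share an id, the later one overwrites the earlier one.
The values() of that dict are then the list of unique dictionaries.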

In Python 2.7

>>> L=[
... {'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30},
... ]
>>> {v['id']:v for v in L}.values()
[{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]

In Python 3

>>> L=[
... {'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30},
... ]
>>> list({v['id']:v for v in L}.values())
[{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]

In Python 2.5/2.6

>>> L=[
... {'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30},
... ]
>>> dict((v['id'],v) for v in L).values()
[{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]
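The same idea can be wrapped in a small helper. This is only a sketch: the function name unique_by_key and its key parameter are made up for illustration, and the ordering note assumes Python 3.7 or later, where a plain dict keeps insertion order.

```python
def unique_by_key(dicts, key="id"):
    """Return one dict per distinct value of `key`.

    If several dicts share the same key value, the last one wins,
    just as with the dict comprehension shown above.
    """
    return list({d[key]: d for d in dicts}.values())


L = [
    {"id": 1, "name": "john", "age": 34},
    {"id": 1, "name": "john", "age": 34},
    {"id": 2, "name": "hanna", "age": 30},
]
unique = unique_by_key(L)
print(unique)
```

Every dictionary must contain the chosen key; a missing key raises KeyError.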

How to uniqify a list of dicts in Python

If your values are hashable, this will work (d is the list of dictionaries):

>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 4, 'x': 3}, {'y': 2, 'x': 1}]

EDIT:

I tried it with no duplicates and it seemed to work fine

>>> d = [{'x':1, 'y':2}, {'x':3, 'y':4}]
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 4, 'x': 3}, {'y': 2, 'x': 1}]

and

>>> d = [{'x':1,'y':2}]
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 2, 'x': 1}]
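One caveat: tuple(x.items()) depends on key order, so on Python 3.7+ two equal dictionaries whose keys were inserted in a different order produce different tuples and are both kept. A possible variation, sketched here with a hypothetical helper name (unique_dicts), uses frozenset instead and also keeps the first occurrence of each dictionary in its original position:

```python
def unique_dicts(dicts):
    """Drop duplicate dicts, keeping the first occurrence of each."""
    seen = set()
    result = []
    for d in dicts:
        # frozenset ignores key order; every value must be hashable.
        marker = frozenset(d.items())
        if marker not in seen:
            seen.add(marker)
            result.append(d)
    return result


# The first two dicts are equal but list their keys in a different order.
d = [{"x": 1, "y": 2}, {"y": 2, "x": 1}, {"x": 3, "y": 4}]
deduped = unique_dicts(d)
print(deduped)
```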

Get list of unique dictionaries while accumulating the count of their attributes

Here is some code that does not use any Python library, at the cost of being longer.

duplicate_array = [
    {'id': 1, 'name': 'john', 'count': 1},
    {'id': 1, 'name': 'john', 'count': 2},
    {'id': 2, 'name': 'peter', 'count': 1},
]
final = []

for i, x in enumerate(duplicate_array):
    count = 0

    for d in duplicate_array.copy():
        if d != 0 and d["id"] == x["id"] and d["name"] == x["name"]:
            count += d["count"]
            duplicate_array.remove(d)

    duplicate_array.insert(i, 0)
    x["count"] = count
    final.append(x)

In the first block of code, we define the original list and initialise our output list.

Then we have the for loop.

First, we initialise count to 0. Then we loop through the list again to find all dictionaries that have the same id and name as the current dictionary. For each match, we add its count value to count and remove it from the list. We also check that the item is not 0, because we insert zeros into the list later on; without this check, indexing a zero would crash the program.

We insert a zero at the current position in the list to prevent Python from skipping the next item. A for loop over a list keeps an internal index of the item it is at. When we delete the current item (which we did in the nested for loop), all following items shift one position to the left, so that index no longer points at the correct item. By inserting a zero into the original list, we shift the items back and make the index correct again.

Finally, we set the original dictionary's count to the value we just calculated and we append the unique dictionary to our final list.

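After this code, duplicate_array will be filled with zeros. If you don't want this, you can copy the list with duplicate_array.copy() first and run the loop on the copy. Note that the copy is shallow, so the count values of the original dictionaries are still updated.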
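For comparison, here is a shorter single-pass sketch that also avoids any library. It is a different technique from the loop described above: it groups by an (id, name) tuple in a temporary dict, and the names merged and group are chosen only for this illustration. It assumes id and name are hashable.

```python
duplicate_array = [
    {"id": 1, "name": "john", "count": 1},
    {"id": 1, "name": "john", "count": 2},
    {"id": 2, "name": "peter", "count": 1},
]

merged = {}
for d in duplicate_array:
    group = (d["id"], d["name"])
    if group in merged:
        # Same id and name seen before: accumulate the count.
        merged[group]["count"] += d["count"]
    else:
        # Copy so the input dictionaries are left untouched.
        merged[group] = dict(d)

final = list(merged.values())
print(final)
```

Unlike the loop above, this leaves duplicate_array and its dictionaries unchanged.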

Unique dictionaries out of a list of lists?

You could use a list comprehension, but depending on your Python version, using a collections.OrderedDict object with a generator expression to flatten the matrix would actually be more efficient.

When your values are not hashable and thus can't be stored in a set or used as dictionary keys, you'll first have to create an immutable representation, which can then be stored in a set or dictionary to efficiently track uniqueness.

For dictionaries that are flat structures with all keys and values immutable, just use tuple(sorted(d.items())). This produces a tuple of all (key, value) pairs (also tuples), in sorted order to avoid dictionary order issues.

On Python 3.5 and up, use an OrderedDict() that maps the immutable keys to original dictionaries:

from collections import OrderedDict

key = lambda d: tuple(sorted(d.items()))

dictionaries = list(OrderedDict((key(v), v) for row in matrix for v in row).values())

On Python 3.4 and earlier, OrderedDict is slow and you'd be better off using a separate set:

key = lambda d: tuple(sorted(d.items()))
seen = set()
seen_add = seen.add
dictionaries = [
    v for row in matrix
    for k, v in ((key(v), v) for v in row)
    if not (k in seen or seen_add(k))]

Quick demo using your input data and an OrderedDict:

>>> from collections import OrderedDict
>>> row1 = [{'NODE':1}, {'NODE':2}, {'NODE':3}]
>>> row2 = [{'NODE':3}, {'NODE':4}, {'NODE':5}]
>>> row3 = [{'NODE':4}, {'NODE':6}, {'NODE':7}]
>>> matrix = [row1, row2, row3]
>>> key = lambda d: tuple(sorted(d.items()))
>>> list(OrderedDict((key(v), v) for row in matrix for v in row).values())
[{'NODE': 1}, {'NODE': 2}, {'NODE': 3}, {'NODE': 4}, {'NODE': 5}, {'NODE': 6}, {'NODE': 7}]
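On Python 3.7 and later a plain dict also preserves insertion order, so the same approach should work without OrderedDict. A minimal sketch, reusing the demo data from above:

```python
row1 = [{"NODE": 1}, {"NODE": 2}, {"NODE": 3}]
row2 = [{"NODE": 3}, {"NODE": 4}, {"NODE": 5}]
row3 = [{"NODE": 4}, {"NODE": 6}, {"NODE": 7}]
matrix = [row1, row2, row3]


def key(d):
    # Immutable, order-independent stand-in for a flat dictionary.
    return tuple(sorted(d.items()))


# A plain dict keeps first-insertion order on Python 3.7+.
dictionaries = list({key(v): v for row in matrix for v in row}.values())
print(dictionaries)
```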

Get unique keys and their unique values in a list of nested dictionaries

Another solution:

def solution(data: list[dict]):
    result = dict()
    for d in data:
        collect_keys_and_values(d, result)
    return result

def collect_keys_and_values(data: dict, result: dict):
    for key, value in data.items():
        coll = result.setdefault(key, [])
        if value not in coll:
            coll.append(value)
        if isinstance(value, dict):
            collect_keys_and_values(value, result)

def main():
    print(solution([{
        "key1": {"subkey1": "subvalue1", "subkey2": "subvalue2"},
        "key2": "value2",
        "key3": {"subkey3": "subvalue2"}
    }, {
        "key4": "value4",
        "subkey1": "other_value",
        "key2": "value2"
    }]))

if __name__ == '__main__':
    main()

How to get unique values from a list of dictionaries?

Like this:

unique_years = set([x['year'] for x in dicts])
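A set does not keep the order in which the years first appear. If that order matters, dict.fromkeys is one option on Python 3.7+. The sample data below is hypothetical and only there to make the sketch runnable:

```python
# Hypothetical sample data; only the 'year' key matters here.
dicts = [
    {"title": "a", "year": 2013},
    {"title": "b", "year": 2012},
    {"title": "c", "year": 2013},
]

# Set comprehension: unique values, arbitrary order.
unique_years = {x["year"] for x in dicts}

# dict.fromkeys keeps first-seen order (Python 3.7+).
ordered_years = list(dict.fromkeys(x["year"] for x in dicts))
print(unique_years, ordered_years)
```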

Ansible get unique list of dictionaries based on a key

You could use rejectattr to reject every item of list1 whose id is contained in a list of ids built from list2. That latter list can be created by using map to extract only the id from each item of list2.

All this together gives this simple task:

- set_fact:
    list3: "{{ list1 | rejectattr('id', 'in', list2 | map(attribute='id')) }}"

Given the playbook:

- hosts: localhost
  gather_facts: no

  tasks:
    - set_fact:
        list3: "{{ list1 | rejectattr('id', 'in', list2 | map(attribute='id')) }}"
      vars:
        list1: [ {'a': 'name1', 'id': 'ABC'}, {'ax': 'name2', 'id': 'DEF'} ]
        list2: [ {'a': 'nameX', 'id': 'XYZ'}, {'ab': 'nameY', 'id': 'DEF'} ]

    - debug:
        var: list3

This yields:

TASK [set_fact] ********************************************************
ok: [localhost]

TASK [debug] ***********************************************************
ok: [localhost] => {
    "list3": [
        {
            "a": "name1",
            "id": "ABC"
        }
    ]
}
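For readers more at home in Python, the filter chain above roughly corresponds to the following sketch. It is only an illustration of the logic, not how Ansible or Jinja2 implement it, and the variable name ids_in_list2 is made up here.

```python
list1 = [{"a": "name1", "id": "ABC"}, {"ax": "name2", "id": "DEF"}]
list2 = [{"a": "nameX", "id": "XYZ"}, {"ab": "nameY", "id": "DEF"}]

# Counterpart of `list2 | map(attribute='id')`.
ids_in_list2 = {item["id"] for item in list2}

# Counterpart of `rejectattr('id', 'in', ...)`.
list3 = [item for item in list1 if item["id"] not in ids_in_list2]
print(list3)
```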

