Remove Duplicate Dict in List in Python

Remove duplicate dict in list in Python

Try this:

[dict(t) for t in {tuple(d.items()) for d in l}]

The strategy is to convert each dictionary in the list to a tuple containing that dictionary's items. Since tuples can be hashed (as long as the values are hashable), you can remove duplicates with a set (a set comprehension here; the older Python alternative would be set(tuple(d.items()) for d in l)) and then re-create the dictionaries from the tuples with dict.

where:

  • l is the original list
  • d is one of the dictionaries in the list
  • t is one of the tuples created from a dictionary

Edit: If you want to preserve ordering, the one-liner above won't work, because a set does not keep its elements in order. However, a few more lines of code handle that:

l = [{'a': 123, 'b': 1234},
     {'a': 3222, 'b': 1234},
     {'a': 123, 'b': 1234}]

seen = set()
new_l = []
for d in l:
    t = tuple(d.items())
    if t not in seen:
        seen.add(t)
        new_l.append(d)

print(new_l)

Example output:

[{'a': 123, 'b': 1234}, {'a': 3222, 'b': 1234}]

Note: As pointed out by @alexis, two dictionaries with the same keys and values might not produce the same tuple. That can happen if their keys were added or removed in a different order. If that's the case for your problem, consider sorting d.items() as he suggests.
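
The sorted-items variant from the note could look roughly like this. It is a minimal sketch that assumes the keys are mutually comparable and the values hashable; the function name dedupe_sorted is only illustrative.

```python
def dedupe_sorted(dicts):
    """Drop duplicate dicts, keeping the first occurrence of each."""
    seen = set()
    result = []
    for d in dicts:
        # Sorting the items makes the tuple independent of key insertion order.
        key = tuple(sorted(d.items()))
        if key not in seen:
            seen.add(key)
            result.append(d)
    return result


l = [{'a': 123, 'b': 1234},
     {'b': 1234, 'a': 123},   # same content, keys inserted in a different order
     {'a': 3222, 'b': 1234}]
print(dedupe_sorted(l))
```

Because dictionary keys are unique, the sort only ever compares keys, so the values do not need to be orderable.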

How to remove duplicate dict from list in python

As @mugiseyebrows said, use the 'Name' of each dictionary (loosely speaking) as the key and the dictionary itself as the value to build a new dictionary. This ensures that a dictionary with a given 'Name' appears only once; if a 'Name' occurs several times, the last dictionary with that name is the one that is kept. Then use the values to create a new list:

>>> new_dict = {dct['Name']: dct for dct in objList}
>>> new_list = list(new_dict.values())
>>> print('},\n'.join(str(new_list).split('},')))
[{'Name': 'plate', 'StartTime': '2022-05-17T10:26:05.738101'},
{'Name': 'bezel', 'StartTime': '2022-05-17T10:26:09.922667'},
{'Name': 'chrome', 'StartTime': '2022-05-17T10:26:23.283304'},
{'Name': 'plate placement', 'StartTime': '2022-05-17T10:26:39.3390'}]
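
If you would rather keep the first dictionary seen for each 'Name' instead of the last, dict.setdefault is one way to do it. This is only a sketch; the input below is hypothetical, since the original objList is not shown in the answer.

```python
# Hypothetical input; the third entry repeats the name 'plate'.
obj_list = [
    {'Name': 'plate', 'StartTime': '2022-05-17T10:26:05.738101'},
    {'Name': 'bezel', 'StartTime': '2022-05-17T10:26:09.922667'},
    {'Name': 'plate', 'StartTime': '2022-05-17T10:27:00.000000'},
]

first_by_name = {}
for dct in obj_list:
    # setdefault stores dct only if this Name has not been seen yet.
    first_by_name.setdefault(dct['Name'], dct)

first_list = list(first_by_name.values())
print(first_list)
```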

How to remove duplicate values from list of dicts and keep original order?

You can create a dictionary where the key is the string representation of the items in your list, and the value is the actual item.

time_array_final = [{'day': 15, 'month': 5},{'day': 29, 'month': 5}, {'day': 10, 'month': 6}, {'day': 10, 'month': 6}, {'day': 10, 'month': 6}, {'day': 10, 'month': 6}, {'day': 12, 'month': 6}, {'day': 12, 'month': 6}, {'day': 12, 'month': 6}, {'day': 12, 'month': 6}, {'day': 12, 'month': 6}, {'day': 12, 'month': 6}, {'day': 12, 'month': 6}, {'day': 12, 'month': 6}, {'day': 12, 'month': 6}, {'day': 12, 'month': 6}, {'day': 12, 'month': 6}, {'day': 12, 'month': 6}, {'day': 12, 'month': 6}, {'day': 12, 'month': 6}, {'day': 12, 'month': 6}, {'day': 12, 'month': 6}, {'day': 14, 'month': 6},{'day': 15, 'month': 6}, {'day': 15, 'month': 6}, {'day': 15, 'month': 6}]

dedupe_dict = {str(item): item for item in time_array_final}

Upon encountering a duplicate item, the dict comprehension will overwrite the previous item with the duplicate one, but that doesn't make any material difference because both items are identical.

Since Python 3.7 (and in CPython 3.6 as an implementation detail), dictionaries keep insertion order, so dict.values() should give you the output you need.

deduped_list = list(dedupe_dict.values())

Which gives:

[{'day': 15, 'month': 5},
{'day': 29, 'month': 5},
{'day': 10, 'month': 6},
{'day': 12, 'month': 6},
{'day': 14, 'month': 6},
{'day': 15, 'month': 6}]

As noted by @Copperfield in their comments on another answer, str(dict) is not the most reliable way of stringifying dicts for comparison, because the order of keys matters.

d1 = {'day': 1, 'month': 2}
d2 = {'month': 2, 'day': 1}

d1 == d2 # True
str(d1) == str(d2) # False

To get around this, you could create a frozenset of the dict.items(), and use that as your key (provided all the values in your dict are hashable) like so:

dedupe_dict = {frozenset(d.items()): d for d in time_array_final}
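
To see the difference between the two kinds of key, here is a small self-contained comparison. The records list is made up for illustration and is not part of the original question.

```python
records = [{'day': 1, 'month': 2},
           {'month': 2, 'day': 1},   # equal to the first dict, keys in a different order
           {'day': 3, 'month': 2}]

# str() keys treat the first two dicts as different entries.
by_str = {str(d): d for d in records}

# frozenset keys ignore key order, so the first two dicts collapse into one.
by_items = {frozenset(d.items()): d for d in records}

print(len(by_str))
print(len(by_items))
print(list(by_items.values()))
```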

How to remove duplicate values in a dictionary?

Dictionaries cannot contain duplicate keys. Look closely: what you have there is a dict[str, list[dict[...]]], so what you actually want is to filter duplicate dictionaries out of each list.

Answer:
If order doesn't matter, and you want to remove only exact duplicates, just go this way:

structure = ...  # your dict; avoid naming a variable `dict`, since that shadows the built-in
# note: each list now holds tuples of items rather than dicts
structure = {key: list(frozenset(tuple(d.items()) for d in value)) for key, value in structure.items()}

If order matters, replace list with sorted and define a key function that sorts the items the way you need.

@Edit: If performance also matters, this task is probably premature: get the basics down first, because introducing a proper data structure here would be overkill.

I would strongly discourage any nested list comprehension here, as it will most likely perform even worse than my example.

@Edit2:
Of course, if you want reverse it to proper structure, you can play with that further.
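
One possible way to get back to a list of dicts while also keeping the original order is sketched below. The function name dedupe_lists and the sample data are hypothetical, and the approach assumes all values in the inner dicts are hashable.

```python
def dedupe_lists(structure):
    """Return a new dict whose list values contain no repeated dicts."""
    result = {}
    for key, dicts in structure.items():
        # dict.fromkeys keeps only the first occurrence of each tuple, in order.
        unique = dict.fromkeys(tuple(d.items()) for d in dicts)
        # Turn each tuple of items back into a dict.
        result[key] = [dict(t) for t in unique]
    return result


structure = {'group': [{'id': 1}, {'id': 2}, {'id': 1}]}
print(dedupe_lists(structure))
```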


Removing duplicate keys from list of dictionary, keep only that key-value where value is maximum

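You can use a list comprehension with itertools.groupby (this assumes that dicts with the same 'x' are adjacent in the list, since groupby only groups consecutive items):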

from itertools import groupby

mylist = [{'x': 2020, 'y': 20}, {'x': 2020, 'y': 30}, {'x': 2021, 'y': 10}, {'x': 2021, 'y': 5}]

mylist_unique = [{'x': key, 'y': max(item['y'] for item in values)}
                 for key, values in groupby(mylist, lambda dct: dct['x'])]
print(mylist_unique)

This yields

[{'x': 2020, 'y': 30}, {'x': 2021, 'y': 10}]
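
When the input is not grouped by 'x', a plain dictionary that tracks the best entry per key avoids the need to sort first. This is a sketch of that alternative, not the answer's own method; the unsorted mylist below is made up to show the case.

```python
mylist = [{'x': 2021, 'y': 10}, {'x': 2020, 'y': 20},
          {'x': 2021, 'y': 5}, {'x': 2020, 'y': 30}]

best = {}
for item in mylist:
    x = item['x']
    # Keep the dict with the largest y seen so far for this x.
    if x not in best or item['y'] > best[x]['y']:
        best[x] = item

mylist_unique = list(best.values())
print(mylist_unique)
```

The result follows the order in which each 'x' first appears in the input.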

python remove duplicate dictionaries from a list

Something like this should do the job:

result = [dict(tupleized) for tupleized in set(tuple(item.items()) for item in l)]

First, I transform each dict into a tuple of its items, then I put the tuples into a set (which removes duplicate entries), and finally I turn each tuple back into a dict.

remove duplicate dictionary python

We could create a dictionary that uses the ("questionid", "text", "score") tuple as the key and the dicts as values, and use it to check for duplicate values in data:

from operator import itemgetter
out = {}
for d in data:
    key = itemgetter("questionid", "text", "score")(d)
    if key not in out:
        out[key] = d
out = list(out.values())

Output:

[{'id': 34, 'questionid': 5, 'text': 'yes', 'score': 1},
{'id': 10, 'questionid': 5, 'text': 'test answer updated', 'score': 2},
{'id': 20, 'questionid': 5, 'text': 'no', 'score': 0}]
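
The same idea can be wrapped in a small reusable helper. This is a sketch under the assumption that the chosen field values are hashable; dedupe_by and the sample data are hypothetical names introduced here for illustration.

```python
from operator import itemgetter


def dedupe_by(dicts, *fields):
    """Keep the first dict for each distinct combination of the given fields."""
    get_key = itemgetter(*fields)
    unique = {}
    for d in dicts:
        # setdefault only stores d when this key has not been seen before.
        unique.setdefault(get_key(d), d)
    return list(unique.values())


data = [
    {'id': 34, 'questionid': 5, 'text': 'yes', 'score': 1},
    {'id': 35, 'questionid': 5, 'text': 'yes', 'score': 1},  # duplicate apart from id
    {'id': 20, 'questionid': 5, 'text': 'no', 'score': 0},
]
print(dedupe_by(data, 'questionid', 'text', 'score'))
```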

remove the duplicate key and values in list of dictionaries and append unique list of values to a key

@U12-Forward's answer works only if the input is pre-sorted, with records of the same f_ids already grouped together.

A more robust approach that works regardless of the order of the input is to build a dict that maps each f_id to its dict, and to convert the record_id value to a list when there are multiple records with the same f_id:

mapping = {}
for d in list_of_dict:
    try:
        entry = mapping[d['f_id']]  # may raise KeyError
        entry['record_id'].append(d['record_id'])  # may raise AttributeError
    except KeyError:
        mapping[d['f_id']] = d
    except AttributeError:
        entry['record_id'] = [entry['record_id'], d['record_id']]
print(list(mapping.values()))

This outputs:

[{'f_text': 'sample', 'symbol': '*', 'f_id': 246, 'record_id': ['4679', '4680'], 'flag': 'N'}, {'f_text': 'other text', 'symbol': '!#', 'f_id': 247, 'record_id': '4678', 'flag': 'N'}]
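
Note that the loop above stores the input dicts themselves in mapping, so the originals are modified, and record_id stays a plain string when an f_id occurs only once. If you would rather leave the input untouched and always get a list, a variant along these lines may help. It is a sketch with shortened, hypothetical sample records.

```python
list_of_dict = [
    {'f_id': 246, 'record_id': '4679', 'flag': 'N'},
    {'f_id': 247, 'record_id': '4678', 'flag': 'N'},
    {'f_id': 246, 'record_id': '4680', 'flag': 'N'},
]

mapping = {}
for d in list_of_dict:
    if d['f_id'] not in mapping:
        # Copy the record so the input is left untouched; start with an empty list.
        mapping[d['f_id']] = {**d, 'record_id': []}
    mapping[d['f_id']]['record_id'].append(d['record_id'])

merged = list(mapping.values())
print(merged)
```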

How to remove duplicate dictionary Objects from list

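Use pandas, if the dictionaries don't contain nested dicts:

import pandas as pd
# `records` is your list of dicts; avoid naming it `list`, which shadows the built-in
records = pd.DataFrame(records).drop_duplicates().to_dict(orient='records')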
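
For lists that do contain nested dicts, one standard-library option is to use a canonical JSON string as the lookup key. This is only a sketch and assumes every value is JSON-serializable; the records below are made up for illustration.

```python
import json

records = [
    {'name': 'a', 'meta': {'x': 1, 'y': [1, 2]}},
    {'meta': {'y': [1, 2], 'x': 1}, 'name': 'a'},  # same content, different key order
    {'name': 'b', 'meta': {'x': 2, 'y': []}},
]

unique = {}
for record in records:
    # sort_keys=True gives equal dicts the same string at every nesting level.
    key = json.dumps(record, sort_keys=True)
    unique.setdefault(key, record)

deduped = list(unique.values())
print(deduped)
```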

