Removing Duplicates from Dictionary

You could go through each item (key-value pair) in the dictionary and add it to a result dictionary if its value is not already there.

input_raw = {
    112762853378: {
        'dst': ['10.121.4.136'],
        'src': ['1.2.3.4'],
        'alias': ['www.example.com'],
    },
    112762853385: {
        'dst': ['10.121.4.136'],
        'src': ['1.2.3.4'],
        'alias': ['www.example.com'],
    },
    112760496444: {
        'dst': ['10.121.4.136'],
        'src': ['1.2.3.4'],
    },
    112760496502: {
        'dst': ['10.122.195.34'],
        'src': ['4.3.2.1'],
    },
}

result = {}

# Keep only the first key seen for each distinct value.
for key, value in input_raw.items():
    if value not in result.values():
        result[key] = value

print(result)
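
The check `value not in result.values()` rescans the whole result for every item, which gets slow on large inputs. The following is only a sketch of one alternative, assuming every inner value is a list of hashable items (as in the data above). The helper names `fingerprint` and `drop_duplicate_values` and the sample data are made up for illustration.

```python
def fingerprint(record):
    """Return a hashable summary of a dict whose values are lists."""
    # frozenset ignores the order of the fields; tuple() makes each list hashable.
    return frozenset((field, tuple(items)) for field, items in record.items())


def drop_duplicate_values(mapping):
    """Keep the first key seen for each distinct value."""
    seen = set()
    result = {}
    for key, value in mapping.items():
        mark = fingerprint(value)
        if mark not in seen:
            seen.add(mark)
            result[key] = value
    return result


# Hypothetical sample: keys 1 and 2 hold equal values.
sample = {
    1: {'dst': ['10.121.4.136'], 'src': ['1.2.3.4']},
    2: {'src': ['1.2.3.4'], 'dst': ['10.121.4.136']},
    3: {'dst': ['10.122.195.34'], 'src': ['4.3.2.1']},
}
deduped = drop_duplicate_values(sample)
print(deduped)
```

The set lookup replaces the linear scan, at the cost of building one fingerprint per item.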

How to remove duplicate values in a dictionary?

Dictionary keys are unique by definition, so a dict cannot hold duplicate keys. Look closely: what you have is a dict[str, list[dict[...]]]. So what you actually want is to filter duplicate dictionaries out of each list.

Answer:
If order doesn't matter, and you want to remove only exact duplicates, just go this way:

structure = ...  # your dict; don't use the built-in name `dict` as a variable name
# Each list is replaced by a list of unique tuples of (key, value) pairs.
structure = {key: list(frozenset(tuple(d.items()) for d in value)) for key, value in structure.items()}

If order matters, replace list with sorted and pass a key function that produces the order you need.

@Edit: If performance also matters, get the basics working first; introducing a dedicated data structure is probably overkill for this task.

I would strongly discourage nested list comprehensions here, as they will most likely perform even worse than the example above.

@Edit2:
Of course, if you want to convert the result back to the original structure (lists of dicts), you can process it further, for example by calling dict on each tuple.
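
As a rough sketch of that round trip, assuming all values are hashable and the resulting tuples can be compared with each other (the sample data and the names `unique_items` and `restored` are invented for illustration):

```python
# Hypothetical input: one key holding a list with an exact duplicate.
structure = {
    'records': [{'a': 1, 'b': 2}, {'a': 1, 'b': 2}, {'a': 3, 'b': 4}],
}

# Step 1: collapse exact duplicates by turning each dict into a hashable
# tuple of (key, value) pairs and collecting those tuples in a frozenset.
unique_items = {
    key: frozenset(tuple(d.items()) for d in value)
    for key, value in structure.items()
}

# Step 2: rebuild the dicts; sorted() makes the output order deterministic.
restored = {
    key: [dict(t) for t in sorted(items)]
    for key, items in unique_items.items()
}
print(restored)  # {'records': [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]}
```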

Remove duplicate dict in list in Python

Try this:

[dict(t) for t in {tuple(d.items()) for d in l}]

The strategy is to convert the list of dictionaries to a collection of tuples, where each tuple contains the items of one dictionary. Since tuples can be hashed (as long as the values they contain are hashable), you can remove duplicates using a set (a set comprehension here; in older Python versions the alternative would be set(tuple(d.items()) for d in l)) and then re-create the dictionaries from the tuples with dict.

where:

  • l is the original list
  • d is one of the dictionaries in the list
  • t is one of the tuples created from a dictionary

Edit: If you want to preserve ordering, the one-liner above won't work since set won't do that. However, with a few lines of code, you can also do that:

l = [{'a': 123, 'b': 1234},
     {'a': 3222, 'b': 1234},
     {'a': 123, 'b': 1234}]

seen = set()
new_l = []
for d in l:
    t = tuple(d.items())
    # Keep the dictionary only the first time its items are seen.
    if t not in seen:
        seen.add(t)
        new_l.append(d)

print(new_l)

Example output:

[{'a': 123, 'b': 1234}, {'a': 3222, 'b': 1234}]

Note: As pointed out by @alexis, two dictionaries with the same keys and values might not produce the same tuple. That can happen if their keys were added or removed in a different order. If that applies to your problem, consider sorting d.items() as he suggests.
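
A minimal sketch of that variant, assuming the keys can be sorted (for example, all strings) and the values are hashable. The function name `unique_dicts` and the sample list are made up for illustration.

```python
def unique_dicts(dicts):
    """Return the dicts in their original order, without duplicates."""
    seen = set()
    unique = []
    for d in dicts:
        # Sorting the items makes the tuple independent of key insertion order.
        t = tuple(sorted(d.items()))
        if t not in seen:
            seen.add(t)
            unique.append(d)
    return unique


# The second dict equals the first but was built in a different key order.
records = [{'a': 123, 'b': 1234}, {'b': 1234, 'a': 123}, {'a': 3222, 'b': 1234}]
print(unique_dicts(records))  # [{'a': 123, 'b': 1234}, {'a': 3222, 'b': 1234}]
```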

Removing duplicate keys from list of dictionary, keep only that key-value where value is maximum

You can use a list comprehension with itertools.groupby. This works here because the list is already ordered by 'x':

from itertools import groupby

mylist = [{'x': 2020, 'y': 20}, {'x': 2020, 'y': 30}, {'x': 2021, 'y': 10}, {'x': 2021, 'y': 5}]

mylist_unique = [{'x': key, 'y': max(item['y'] for item in values)}
                 for key, values in groupby(mylist, lambda dct: dct['x'])]
print(mylist_unique)

This yields

[{'x': 2020, 'y': 30}, {'x': 2021, 'y': 10}]
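
itertools.groupby only groups consecutive items with equal keys, so if the input is not already ordered by 'x', it needs to be sorted first. A small sketch under that assumption (the shuffled sample list and the name `by_x` are invented for illustration):

```python
from itertools import groupby
from operator import itemgetter

# Same records as above, but deliberately out of order.
mylist = [{'x': 2021, 'y': 10}, {'x': 2020, 'y': 20},
          {'x': 2021, 'y': 5}, {'x': 2020, 'y': 30}]

by_x = itemgetter('x')

# Sort by 'x' so equal keys are adjacent, then keep the largest 'y' per group.
mylist_unique = [
    {'x': key, 'y': max(item['y'] for item in group)}
    for key, group in groupby(sorted(mylist, key=by_x), key=by_x)
]
print(mylist_unique)  # [{'x': 2020, 'y': 30}, {'x': 2021, 'y': 10}]
```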

remove duplicate values from items in a dictionary in Python

This problem essentially boils down to removing duplicates from a list of unhashable items, for which converting to a set is not possible.

One possible method is to check for membership in the current value while building up a new list value.

d = {'word': [('769817', [6]), ('769819', [4, 10]), ('769819', [4, 10])]}
for k, v in d.items():
    new_list = []
    for item in v:
        if item not in new_list:
            new_list.append(item)
    d[k] = new_list

Alternatively, use groupby() for a more concise answer. It is potentially slower because the list must be sorted first; if the list is already sorted, it is faster than doing a membership check.

import itertools

d = {'word': [('769817', [6]), ('769819', [4, 10]), ('769819', [4, 10])]}
for k, v in d.items():
    v.sort()
    d[k] = [item for item, _ in itertools.groupby(v)]

Output -> {'word': [('769817', [6]), ('769819', [4, 10])]}

Remove duplicate from dictionary in python

You can make a set to store items that have been seen, and then sequentially update the dict according to the set:

d = {1:[0,1,2,3], 2:[1,4,5], 3:[0,4,2,5,6], 4:[0,2,7,8], 5:[9]}

seen = set()
for k, v in d.items():
    d[k] = [x for x in v if x not in seen]
    seen.update(d[k])

print(d) # {1: [0, 1, 2, 3], 2: [4, 5], 3: [6], 4: [7, 8], 5: [9]}

remove duplicate dictionary python

We could create a dictionary that uses the ("questionid", "text", "score") tuple as its key and the dicts as its values, and use this dictionary to check for duplicates in data:

from operator import itemgetter

# data is the input list of dicts.
out = {}
for d in data:
    key = itemgetter("questionid", "text", "score")(d)
    # Keep only the first dict seen for each key tuple.
    if key not in out:
        out[key] = d
out = list(out.values())

Output:

[{'id': 34, 'questionid': 5, 'text': 'yes', 'score': 1},
{'id': 10, 'questionid': 5, 'text': 'test answer updated', 'score': 2},
{'id': 20, 'questionid': 5, 'text': 'no', 'score': 0}]

