Find Similar List Value Inside Dictionary

Iterate through a list of dictionaries and identify similar values in Python

orders = [{
    'name': 'User_ORDERS1234',
    'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}],
    'users': ['User_2']
},{
    'name': 'User_ORDERS1235',
    'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}],
    'users': ['User_1']
},{
    'name': 'User_ORDERS1236',
    'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}],
    'users': ['User_3']
}]

for i, order in enumerate(orders):                # loop through the orders:
    exp1 = order['expressions']                   # 'expressions' list of the order

    for next_order in orders[i+1:]:               # loop through the following orders:
        exp2 = next_order['expressions']          # 'expressions' list of a following order

        if exp1 == exp2:                          # if the expressions are the same:
            order['users'] += next_order['users'] # add its 'users' to the order's 'users'
            next_order['users'] = []              # remove users from the following order

orders = [o for o in orders if o['users']]        # leave only the orders that have 'users'

print(orders)

Output

[{
    'name': 'User_ORDERS1234',
    'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}],
    'users': ['User_2', 'User_3']
},{
    'name': 'User_ORDERS1235',
    'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}],
    'users': ['User_1']
}]
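
The nested loops above compare every pair of orders, which gets slow for long lists. As a rough sketch of a single-pass alternative, the orders can be grouped in a dict keyed by their expressions. The helper name merge_by_expression and the sample data are made up for illustration, and the sketch assumes every expression dict is identified by its 'exp' string alone. Unlike the loop above, it leaves the input untouched and does not drop orders that start out with no users.

```python
def merge_by_expression(orders):
    """Merge the 'users' of orders whose 'exp' strings are equal.

    Keeps the first order of each group and does not modify the input.
    """
    merged = {}
    for order in orders:
        # Lists and dicts are unhashable, so build a hashable key
        # from the 'exp' strings (assumption: 'exp' is the only relevant key).
        key = tuple(e['exp'] for e in order['expressions'])
        if key in merged:
            merged[key]['users'].extend(order['users'])
        else:
            # Shallow copy with a fresh 'users' list, so the original stays as-is.
            merged[key] = {**order, 'users': list(order['users'])}
    return list(merged.values())


# Hypothetical sample data with shortened expressions
sample = [
    {'name': 'A', 'expressions': [{'exp': 'x = 1'}], 'users': ['User_2']},
    {'name': 'B', 'expressions': [{'exp': 'x = 2'}], 'users': ['User_1']},
    {'name': 'C', 'expressions': [{'exp': 'x = 1'}], 'users': ['User_3']},
]
result = merge_by_expression(sample)
print(result)
```

The result order follows the first occurrence of each expression, because dicts preserve insertion order in Python 3.7 and later.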

Find duplicate values in list of dictionaries

Some solutions and a benchmark.

Solutions

Fun with a dict: the forward pass fixes the order of the keys, and the backward pass overwrites each key so that its first dict wins.

lst_out = list({d['Second']: d
                for s in [1, -1]
                for d in lst_in[::s]}.values())

Or using setdefault to keep track of each value's first dict:

tmp = {}
for d in lst_in:
    tmp.setdefault(d['Second'], d)
lst_out = list(tmp.values())

Fun and potentially faster version:

add = {}.setdefault
for d in lst_in:
    add(d['Second'], d)
lst_out = list(add.__self__.values())
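
The snippets above assume that lst_in already exists. As a small sketch with made-up sample data (the values 'a', 'b' and 'c' are purely illustrative), this shows that the forward/backward trick and setdefault both keep the first dict for each 'Second' value, in order of first appearance:

```python
# Hypothetical input: 'a' and 'b' each occur twice
lst_in = [
    {'First': 0, 'Second': 'a'},
    {'First': 1, 'Second': 'b'},
    {'First': 2, 'Second': 'a'},
    {'First': 3, 'Second': 'c'},
    {'First': 4, 'Second': 'b'},
]

# Forward pass sets the key order; backward pass lets the first dict win.
forward_backward = list({d['Second']: d
                         for s in [1, -1]
                         for d in lst_in[::s]}.values())

# setdefault only stores a dict if the key is not present yet.
tmp = {}
for d in lst_in:
    tmp.setdefault(d['Second'], d)
with_setdefault = list(tmp.values())

print(with_setdefault)
# [{'First': 0, 'Second': 'a'}, {'First': 1, 'Second': 'b'}, {'First': 3, 'Second': 'c'}]
print(forward_backward == with_setdefault)
# True
```

Both variants rely on dicts preserving insertion order, which is guaranteed from Python 3.7 on.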

Benchmark

Times for a list of 1000 dicts with 100 different Second values (using Python 3.10.0):

 361 μs   362 μs   364 μs  dict_forward_backward
 295 μs   297 μs   297 μs  dict_setdefault
 231 μs   231 μs   232 μs  dict_setdefault_optimized
 196 μs   196 μs   197 μs  set_in_list_comprehension
 190 μs   190 μs   190 μs  set_in_list_comprehension_optimized
 191 μs   191 μs   191 μs  set_in_list_comprehension_optimized_2
 201 μs   201 μs   201 μs  set_with_loop
1747 μs  1751 μs  1774 μs  with_lists

Benchmark code:

from timeit import repeat
from random import choices

lst_in = [{'First': i, 'Second': v}
          for i, v in enumerate(choices(range(100), k=1000))]

def dict_forward_backward(lst_in):
    return list({d['Second']: d
                 for s in [1, -1]
                 for d in lst_in[::s]}.values())

def dict_setdefault(lst_in):
    tmp = {}
    for d in lst_in:
        tmp.setdefault(d['Second'], d)
    return list(tmp.values())

def dict_setdefault_optimized(lst_in):
    add = {}.setdefault
    for d in lst_in:
        add(d['Second'], d)
    return list(add.__self__.values())

def set_in_list_comprehension(lst_in):
    return [s.add(v) or d
            for s in [set()]
            for d in lst_in
            for v in [d['Second']]
            if v not in s]

def set_in_list_comprehension_optimized(lst_in):
    return [add(v) or d
            for s in [set()]
            for add in [s.add]
            for d in lst_in
            for v in [d['Second']]
            if v not in s]

def set_in_list_comprehension_optimized_2(lst_in):
    s = set()
    add = s.add
    return [add(v) or d
            for d in lst_in
            for v in [d['Second']]
            if v not in s]

def set_with_loop(lst_in):
    found = set()
    lst_out = []
    for dct in lst_in:
        if dct['Second'] not in found:
            lst_out.append(dct)
            found.add(dct['Second'])
    return lst_out

def with_lists(lst_in):
    out = {'keep': [], 'counter': []}
    for dct in lst_in:
        if dct['Second'] not in out['counter']:
            out['keep'].append(dct)
            out['counter'].append(dct['Second'])
    return out['keep']

funcs = [
    dict_forward_backward,
    dict_setdefault,
    dict_setdefault_optimized,
    set_in_list_comprehension,
    set_in_list_comprehension_optimized,
    set_in_list_comprehension_optimized_2,
    set_with_loop,
    with_lists,
]

# Correctness
expect = funcs[0](lst_in)
for func in funcs[1:]:
    result = func(lst_in)
    print(result == expect, func.__name__)
print()

# Speed
for _ in range(3):
    for func in funcs:
        ts = sorted(repeat(lambda: func(lst_in), 'gc.enable(); gc.collect()', number=1000))[:3]
        print(*('%4d μs ' % (t * 1e3) for t in ts), func.__name__)
    print()

Filter a dictionary of lists

I solved it with this:

from typing import Dict, List, Any, Set

d = {"level":[1,2,3], "conf":[-1,1,2], "text":["-1", "hel", "llo"]}

# First, we create a set that stores the indices which should be kept.
# I chose a set instead of a list because it has an O(1) lookup time.
# We only want to keep the items on indices where the value in d["conf"] is greater than 0
filtered_indexes = {i for i, value in enumerate(d.get('conf', [])) if value > 0}

def filter_dictionary(d: Dict[str, List[Any]], filtered_indexes: Set[int]) -> Dict[str, List[Any]]:
    filtered_dictionary = d.copy()  # We'll return a modified copy of the original dictionary
    for key, list_values in d.items():
        # In the next line the actual filtering for each key/value pair takes place. 
        # The original lists get overwritten with the filtered lists.
        filtered_dictionary[key] = [value for i, value in enumerate(list_values) if i in filtered_indexes]
    return filtered_dictionary

print(filter_dictionary(d, filtered_indexes))

Output:

{'level': [2, 3], 'conf': [1, 2], 'text': ['hel', 'llo']}
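
As a possible alternative, the same filtering can be sketched with a boolean mask and itertools.compress from the standard library, which avoids the index set entirely. This assumes that every list in the dictionary has the same length as d["conf"]; the names mask and filtered are only illustrative.

```python
from itertools import compress

d = {"level": [1, 2, 3], "conf": [-1, 1, 2], "text": ["-1", "hel", "llo"]}

# One boolean per position: True where the "conf" value is greater than 0.
mask = [value > 0 for value in d["conf"]]

# compress() yields only the items whose mask entry is truthy.
filtered = {key: list(compress(values, mask)) for key, values in d.items()}

print(filtered)
# {'level': [2, 3], 'conf': [1, 2], 'text': ['hel', 'llo']}
```

Note that compress stops at the shorter of its two inputs, so lists of unequal length would be truncated silently rather than raising an error.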

How to get values from a list of dictionaries, which themselves contain lists of dictionaries in Python

First: dicts require a key-value association for every element in the dictionary. Your second-level data structure, however, does not include keys: ({[{'tag': 'tag 1'}]}) This is a set. Unlike dicts, sets do not have keys associated with their elements. So your data structure looks like List[Set[List[Dict[str, str]]]].

Second: when I try to run

# python 3.8.8
player_info = [{[{'tag': 'tag 1'}]},
               {[{'tag': 'tag 2'}]}]

I receive the error TypeError: unhashable type: 'list'. That's because your code attempts to put a list inside a set. Set membership in Python requires the members to be hashable, and list objects do not provide a usable __hash__() method. Even if you resolve this by replacing the list with a tuple, you will find that dict objects are not hashable either. Potential solutions include using immutable objects like frozendict or tuple, but that is another post.
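
To make the hashability point concrete, here is a minimal sketch using only built-in types. It reproduces the error and then shows one way (an assumption about what you might want, not the only option) to get a hashable equivalent by turning the dict into a tuple of its items:

```python
# Putting a list into a set fails because lists are unhashable.
try:
    {[{'tag': 'tag 1'}]}
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)  # the message mentions: unhashable type: 'list'

# A dict inside a tuple is still unhashable, so the tuple cannot be hashed either.
try:
    hash(({'tag': 'tag 1'},))
    dict_in_tuple_hashable = True
except TypeError:
    dict_in_tuple_hashable = False

# Converting the dict to a tuple of (key, value) pairs makes everything hashable.
player = (tuple({'tag': 'tag 1'}.items()),)
players = {player}
print(players)
```

The exact wording of the error message can differ between Python versions, but it is a TypeError in all of them.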

To answer your question, I have reformulated your problem as

player_info = [[[{'tag': 'tag 1'}]],
               [[{'tag': 'tag 2'}]]]

and compared the performance of A) an explicit loop:

for i in range(len(player_info)):
  print(player_info[i][0][0]['tag'])

against B) a list comprehension:

[
  print(single_player_info[0][0]['tag']) 
  for single_player_info in player_info
]

Running the above code blocks in Jupyter with the %%timeit cell magic, I got:
A) 154 µs ± 14.6 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each) and
B) 120 µs ± 11 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

Note: This experiment is highly skewed for at least two reasons:

1. I tested both variants using only the data you provided (N=2). With larger inputs, the scaling behavior may well differ from what this tiny sample suggests.
2. print consumes a lot of time and makes the measurement heavily dependent on the state of the kernel.
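
A fairer comparison could look roughly like the following sketch, which addresses both caveats: it collects the tags into a list instead of printing them, and it uses a larger input. The input size of 1000, the repetition count and the function names are arbitrary choices for illustration, and no timings are claimed here since they depend on the machine.

```python
from timeit import timeit

# Hypothetical larger input in the reformulated list-of-lists shape
player_info = [[[{'tag': f'tag {i}'}]] for i in range(1000)]


def with_index_loop(data):
    """Variant A: explicit loop over indices, collecting instead of printing."""
    tags = []
    for i in range(len(data)):
        tags.append(data[i][0][0]['tag'])
    return tags


def with_comprehension(data):
    """Variant B: list comprehension that builds the list of tags."""
    return [single_player_info[0][0]['tag'] for single_player_info in data]


runs = 200
for func in (with_index_loop, with_comprehension):
    seconds = timeit(lambda: func(player_info), number=runs)
    print(f'{func.__name__}: {seconds / runs * 1e6:.1f} µs per call')
```

Both functions return the same list, so the timing difference reflects only the looping style.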

I hope this answers your question.