Python - Intersection of Multiple Lists

Creating multiple lists

What you can do is use a dictionary:

>>> obj = {}
>>> for i in range(1, 21):
...     obj['l'+str(i)] = []
...
>>> obj
{'l18': [], 'l19': [], 'l20': [], 'l14': [], 'l15': [], 'l16': [], 'l17': [], 'l10': [], 'l11': [], 'l12': [], 'l13': [], 'l6': [], 'l7': [], 'l4': [], 'l5': [], 'l2': [], 'l3': [], 'l1': [], 'l8': [], 'l9': []}
>>>

You can also create a list of lists using list comprehension:

>>> obj = [[] for i in range(20)]
>>> obj
[[], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], []]
>>>
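One reason to use the comprehension rather than the shorter `[[]] * 20`: multiplying a list repeats a reference to one single inner list, so a change to one "slot" shows up in all of them. A minimal sketch (the names `shared` and `independent` are only for illustration):

```python
# [[]] * 3 repeats a reference to ONE inner list three times.
shared = [[]] * 3
shared[0].append("x")

# The comprehension evaluates [] once per iteration, giving three distinct lists.
independent = [[] for _ in range(3)]
independent[0].append("x")

print(shared)       # [['x'], ['x'], ['x']]
print(independent)  # [['x'], [], []]
```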

Combinations of multiple lists

As an alternative to regenerating the list of combinations, compute the product of the combinations up front; this also saves you from nesting for loops.

from itertools import combinations, product

list1 = list("abcdefgh")
list2 = list("ijk")
list3 = list("lmnop")

l1 = combinations(list1, 5)
l2 = combinations(list2, 2)
l3 = combinations(list3, 3)
for c1, c2, c3 in product(l1, l2, l3):
    sample = c1 + c2 + c3
    print(sample)
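As a sanity check, the number of samples should equal the product of the three binomial coefficients. The sketch below collects the samples in a list only to count them (the names `samples` and `expected` are mine); in real use you would probably keep iterating lazily as above.

```python
from itertools import combinations, product
from math import comb  # Python 3.8+

list1 = list("abcdefgh")
list2 = list("ijk")
list3 = list("lmnop")

# Materialize the product only so that it can be counted.
samples = [c1 + c2 + c3
           for c1, c2, c3 in product(combinations(list1, 5),
                                     combinations(list2, 2),
                                     combinations(list3, 3))]

# C(8, 5) * C(3, 2) * C(5, 3) = 56 * 3 * 10
expected = comb(8, 5) * comb(3, 2) * comb(5, 3)
print(len(samples), expected)  # 1680 1680
```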

How to cross-identify elements from multiple lists

Here is an answer in Python 3, since this seems to be the language you are using.

for i in a iterates over the items in a. What you are looking for is to iterate over the indices of its items (from 0 to len(a) - 1):

>>> for i in range(len(a)):
...     if a[i] == b[i]: print(i, "=>", a[i])
...
4 => 5
5 => 6

Python also has some functional programming abilities. Here is a variation with a list comprehension:

>>> common_indices = [ i for i in range(len(a)) if a[i] == b[i] ]
>>> for i in common_indices: print(i, "=>", a[i])
4 => 5
5 => 6

Similarly, Python has dict comprehensions. Here is a variation with a dict comprehension:

>>> common_values = { i: a[i] for i in range(len(a)) if a[i] == b[i] }
>>> for i, v in common_values.items(): print(i, "=>", v)
4 => 5
5 => 6
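If you don't need the index for anything other than the lookup, enumerate() and zip() avoid indexing altogether. This is only a sketch: the contents of a and b below are hypothetical, chosen so that the result matches the output above. Note that zip() stops at the shorter list, whereas the range(len(a)) versions raise an IndexError if b is shorter than a.

```python
# Hypothetical sample data, chosen to reproduce the output shown above.
a = [1, 2, 3, 4, 5, 6]
b = [6, 5, 4, 3, 5, 6]

# enumerate() supplies the index, zip() pairs up the items of both lists.
common_values = {i: x for i, (x, y) in enumerate(zip(a, b)) if x == y}

for i, v in common_values.items():
    print(i, "=>", v)
# 4 => 5
# 5 => 6
```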

Apply if statement on multiple lists with multiple conditions

Here you go:

I changed the list names to something more descriptive.

output = []
area_counts = [4, 4, 4, 4, 1, 6, 7, 8, 9, 6, 10, 11]
area_numbers = [1, 1, 1, 4, 5, 6, 7, 8, 9, 10, 10, 10]
ids = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
distances = [2, 2, 2, 4, 5, 6, 7.2, 5, 5, 5, 8.5, 9.1]

temp_numbers, temp_ids = [], []
# Keep only the entries that satisfy both conditions.
for count, number, id, distance in zip(area_counts, area_numbers, ids, distances):
    if count >= 5 and distance >= 3:
        temp_numbers.append(number)
        temp_ids.append(id)

# Of those, keep the ids whose area number occurs exactly three times.
for (number, id) in zip(temp_numbers, temp_ids):
    if temp_numbers.count(number) == 3:
        output.append(id)

output will be:

[10, 11, 12]
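For long lists, calling list.count() once per element gets slow, because every call scans the whole list. A possible variation, sketched here with collections.Counter (the names `kept`, `occurrences` and `id_` are mine, not from the question):

```python
from collections import Counter

area_counts = [4, 4, 4, 4, 1, 6, 7, 8, 9, 6, 10, 11]
area_numbers = [1, 1, 1, 4, 5, 6, 7, 8, 9, 10, 10, 10]
ids = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
distances = [2, 2, 2, 4, 5, 6, 7.2, 5, 5, 5, 8.5, 9.1]

# Keep the (number, id) pairs that pass both conditions.
kept = [(number, id_)
        for count, number, id_, distance
        in zip(area_counts, area_numbers, ids, distances)
        if count >= 5 and distance >= 3]

# Count every area number in one pass instead of once per element.
occurrences = Counter(number for number, _ in kept)
output = [id_ for number, id_ in kept if occurrences[number] == 3]

print(output)  # [10, 11, 12]
```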

Sorting multiple lists together in place

I think "without creating temporary objects" is impossible, especially since "everything is an object" in Python.

You could get O(1) space / number of objects if you implement some sorting algorithm yourself, though if you want O(n log n) time and stability, it's difficult. If you don't care about stability (seems likely, since you say you want to sort by a but then actually sort by a, b and c), heapsort is reasonably easy:

def sort_together_heapsort(a, b, c):
    n = len(a)
    def swap(i, j):
        a[i], a[j] = a[j], a[i]
        b[i], b[j] = b[j], b[i]
        c[i], c[j] = c[j], c[i]
    def siftdown(i):
        while (kid := 2*i+1) < n:
            imax = kid if a[kid] > a[i] else i
            kid += 1
            if kid < n and a[kid] > a[imax]:
                imax = kid
            if imax == i:
                return
            swap(i, imax)
            i = imax
    for i in range(n // 2)[::-1]:
        siftdown(i)
    while n := n - 1:
        swap(0, n)
        siftdown(0)

Anyway, if someone's interested in just saving some amount of memory, that can be done by decorating in-place (building tuples and storing them in a):

def sort_together_decorate_in_a(a, b, c):
    for i, a[i] in enumerate(zip(a, b, c)):
        pass
    a.sort()
    for i, [a[i], b[i], c[i]] in enumerate(a):
        pass

Or if you trust that list.sort will ask for keys for the elements in order (at least in CPython it does, already did so when the key parameter was introduced 18 years ago, and I suspect will keep doing so):

def sort_together_iter_key(a, b, c):
    it = iter(a)
    b.sort(key=lambda _: next(it))
    it = iter(a)
    c.sort(key=lambda _: next(it))
    a.sort()
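A small usage sketch of that last variant, with made-up sample data. Like the function itself, it relies on list.sort calling the key function once per element, in list order. Because only a's items serve as keys and the sort is stable, ties in a keep their original relative order instead of being broken by b and c as in the zip-based versions.

```python
def sort_together_iter_key(a, b, c):
    # Same function as above: b and c are sorted using a's items as keys,
    # handed out one by one in the order list.sort() asks for them.
    it = iter(a)
    b.sort(key=lambda _: next(it))
    it = iter(a)
    c.sort(key=lambda _: next(it))
    a.sort()

# Hypothetical sample data.
a = [3, 1, 2]
b = ["three", "one", "two"]
c = [30.0, 10.0, 20.0]

sort_together_iter_key(a, b, c)
print(a)  # [1, 2, 3]
print(b)  # ['one', 'two', 'three']
print(c)  # [10.0, 20.0, 30.0]
```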

Testing memory and time with three lists of 100,000 elements:

15,072,520 bytes  152 ms sort_together_sorted_zip
15,072,320 bytes  166 ms sort_together_sorted_zip_2
14,272,576 bytes  152 ms sort_together_sorted_zip_X
 6,670,708 bytes  126 ms sort_together_decorate_in_a
 6,670,772 bytes  177 ms sort_together_decorate_in_first_X
 5,190,212 bytes  342 ms sort_multi_by_a_guest_X
 1,597,400 bytes  100 ms sort_together_iter_key
 1,597,448 bytes  102 ms sort_together_iter_key_X
       744 bytes 1584 ms sort_together_heapsort
       704 bytes 1663 ms sort_together_heapsort_X
       168 bytes 1326 ms sort_together_heapsort_opti
       188 bytes 1512 ms sort_together_heapsort_opti_X

Note:

  • The second solution is a shortened/improved version of yours; it needs no temporary variables or conversions to lists.
  • The solutions with the _X suffix are versions that take arbitrarily many lists as parameters.
  • The a_guest solution is from @a_guest's answer. Runtime-wise it currently benefits from my data being random, as that doesn't expose that solution's worst-case complexity of O(m * n²), where m is the number of lists and n is the length of each list.

Testing memory and time with ten lists of 100,000 elements:

19,760,808 bytes  388 ms sort_together_sorted_zip_X
12,159,100 bytes  425 ms sort_together_decorate_in_first_X
 5,190,292 bytes 1249 ms sort_multi_by_a_guest_X
 1,597,528 bytes  393 ms sort_together_iter_key_X
       704 bytes 4186 ms sort_together_heapsort_X
       188 bytes 4032 ms sort_together_heapsort_opti_X
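For reference, the memory column is the peak reported by tracemalloc and the time column comes from timeit with a single run. A condensed sketch of that measuring pattern, wrapped in a helper I'm calling `measure` here (a hypothetical name, not part of the benchmark) and applied to tiny made-up lists:

```python
import tracemalloc
from timeit import timeit

def measure(func, *lists):
    """Return (peak_bytes, seconds) for one call of func on copies of lists."""
    # Time the call on one set of copies ...
    copies = [lst[:] for lst in lists]
    seconds = timeit(lambda: func(*copies), number=1)

    # ... and trace memory on a fresh set, so that the tracing
    # overhead doesn't distort the timing.
    copies = [lst[:] for lst in lists]
    tracemalloc.start()
    func(*copies)
    peak_bytes = tracemalloc.get_traced_memory()[1]  # (current, peak)
    tracemalloc.stop()
    return peak_bytes, seconds

def sort_together_sorted_zip_2(a, b, c):
    a[:], b[:], c[:] = zip(*sorted(zip(a, b, c)))

# Hypothetical sample data; the real benchmark uses 100,000 random floats.
a0 = [3, 1, 2]
b0 = [4, 2, 3]
c0 = [5, 3, 4]

peak_bytes, seconds = measure(sort_together_sorted_zip_2, a0, b0, c0)
print(f'{peak_bytes:10,} bytes {int(seconds * 1e3):4} ms')
```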

The whole code:

import tracemalloc as tm
from random import random
from timeit import timeit

def sort_together_sorted_zip(a, b, c):
    a_sorted, b_sorted, c_sorted = map(list, zip(*sorted(zip(a, b, c))))
    a[:] = a_sorted
    b[:] = b_sorted
    c[:] = c_sorted

def sort_together_sorted_zip_2(a, b, c):
    a[:], b[:], c[:] = zip(*sorted(zip(a, b, c)))

def sort_together_sorted_zip_X(*lists):
    sorteds = zip(*sorted(zip(*lists)))
    for lst, lst[:] in zip(lists, sorteds):
        pass

def sort_together_decorate_in_a(a, b, c):
    for i, a[i] in enumerate(zip(a, b, c)):
        pass
    a.sort()
    for i, [a[i], b[i], c[i]] in enumerate(a):
        pass

def sort_together_decorate_in_first_X(*lists):
    first = lists[0]
    for i, first[i] in enumerate(zip(*lists)):
        pass
    first.sort()
    for i, values in enumerate(first):
        for lst, lst[i] in zip(lists, values):
            pass

def sort_together_iter_key(a, b, c):
    it = iter(a)
    b.sort(key=lambda _: next(it))
    it = iter(a)
    c.sort(key=lambda _: next(it))
    a.sort()

def sort_together_iter_key_X(*lists):
    for lst in lists[1:]:
        it = iter(lists[0])
        lst.sort(key=lambda _: next(it))
    lists[0].sort()

def sort_together_heapsort(a, b, c):
    n = len(a)
    def swap(i, j):
        a[i], a[j] = a[j], a[i]
        b[i], b[j] = b[j], b[i]
        c[i], c[j] = c[j], c[i]
    def siftdown(i):
        while (kid := 2*i+1) < n:
            imax = kid if a[kid] > a[i] else i
            kid += 1
            if kid < n and a[kid] > a[imax]:
                imax = kid
            if imax == i:
                return
            swap(i, imax)
            i = imax
    for i in range(n // 2)[::-1]:
        siftdown(i)
    while n := n - 1:
        swap(0, n)
        siftdown(0)

def sort_together_heapsort_X(*lists):
    a = lists[0]
    n = len(a)
    def swap(i, j):
        for lst in lists:
            lst[i], lst[j] = lst[j], lst[i]
    def siftdown(i):
        while (kid := 2*i+1) < n:
            imax = kid if a[kid] > a[i] else i
            kid += 1
            if kid < n and a[kid] > a[imax]:
                imax = kid
            if imax == i:
                return
            swap(i, imax)
            i = imax
    for i in range(n // 2)[::-1]:
        siftdown(i)
    while n := n - 1:
        swap(0, n)
        siftdown(0)

def sort_together_heapsort_opti(a, b, c):
    # Avoid inner functions and range-loop to minimize memory.
    # Makes it faster, too. But duplicates code. Not recommended.
    n = len(a)
    i0 = n // 2 - 1
    while i0 >= 0:
        i = i0
        while (kid := 2*i+1) < n:
            imax = kid if a[kid] > a[i] else i
            kid += 1
            if kid < n and a[kid] > a[imax]:
                imax = kid
            if imax == i:
                break
            a[i], a[imax] = a[imax], a[i]
            b[i], b[imax] = b[imax], b[i]
            c[i], c[imax] = c[imax], c[i]
            i = imax
        i0 -= 1
    while n := n - 1:
        a[0], a[n] = a[n], a[0]
        b[0], b[n] = b[n], b[0]
        c[0], c[n] = c[n], c[0]
        i = 0
        while (kid := 2*i+1) < n:
            imax = kid if a[kid] > a[i] else i
            kid += 1
            if kid < n and a[kid] > a[imax]:
                imax = kid
            if imax == i:
                break
            a[i], a[imax] = a[imax], a[i]
            b[i], b[imax] = b[imax], b[i]
            c[i], c[imax] = c[imax], c[i]
            i = imax

def sort_together_heapsort_opti_X(*lists):
    # Avoid inner functions and range-loop to minimize memory.
    # Makes it faster, too. But duplicates code. Not recommended.
    a = lists[0]
    n = len(a)
    i0 = n // 2 - 1
    while i0 >= 0:
        i = i0
        while (kid := 2*i+1) < n:
            imax = kid if a[kid] > a[i] else i
            kid += 1
            if kid < n and a[kid] > a[imax]:
                imax = kid
            if imax == i:
                break
            for lst in lists:
                lst[i], lst[imax] = lst[imax], lst[i]
            i = imax
        i0 -= 1
    while n := n - 1:
        for lst in lists:
            lst[0], lst[n] = lst[n], lst[0]
        i = 0
        while (kid := 2*i+1) < n:
            imax = kid if a[kid] > a[i] else i
            kid += 1
            if kid < n and a[kid] > a[imax]:
                imax = kid
            if imax == i:
                break
            for lst in lists:
                lst[i], lst[imax] = lst[imax], lst[i]
            i = imax

def sort_multi_by_a_guest_X(a, *lists):
    indices = list(range(len(a)))
    indices.sort(key=lambda i: a[i])
    a.sort()
    for lst in lists:
        for i, j in enumerate(indices):
            while j < i:
                j = indices[j]
            lst[i], lst[j] = lst[j], lst[i]

funcs = [
    sort_together_sorted_zip,
    sort_together_sorted_zip_2,
    sort_together_sorted_zip_X,
    sort_together_decorate_in_a,
    sort_together_decorate_in_first_X,
    sort_multi_by_a_guest_X,
    sort_together_iter_key,
    sort_together_iter_key_X,
    sort_together_heapsort,
    sort_together_heapsort_X,
    sort_together_heapsort_opti,
    sort_together_heapsort_opti_X,
]

n = 100000
a0 = [random() for _ in range(n)]
b0 = [x + 1 for x in a0]
c0 = [x + 2 for x in a0]

for _ in range(3):
    for func in funcs:

        a, b, c = a0[:], b0[:], c0[:]
        time = timeit(lambda: func(a, b, c), number=1)
        assert a == sorted(a0)
        assert b == sorted(b0)
        assert c == sorted(c0)

        a, b, c = a0[:], b0[:], c0[:]
        tm.start()
        func(a, b, c)
        memory = tm.get_traced_memory()[1]
        tm.stop()

        print(f'{memory:10,} bytes {int(time * 1e3):4} ms {func.__name__}')
    print()

Create a dictionary from multiple lists, one list as key, other as value

Use the zip() function to combine a list of keys with corresponding values, then pass the resulting iterator of (key, value) combinations to dict():

data = {"rows": [dict(zip(key_list, row)) for row in val_list]}

This works because zip(iter1, iter2) pairs up each element from iter1 with those of iter2, and the dict() constructor accepts an iterator of 2-value tuples:

Otherwise, the positional argument must be an iterable object. Each item in the iterable must itself be an iterable with exactly two objects. The first object of each item becomes a key in the new dictionary, and the second object the corresponding value.

In my example above I used a list comprehension to generate the whole output list in a single expression:

>>> key_list = ['key1', 'key2', 'key3']
>>> val_list = [['v0_1', 'v0_2', 'v0_3'], ['v1_1', 'v1_2', 'v1_3'], ['v2_1', 'v2_2', 'v2_3']]
>>> {"rows": [dict(zip(key_list, row)) for row in val_list]}
{'rows': [{'key1': 'v0_1', 'key2': 'v0_2', 'key3': 'v0_3'}, {'key1': 'v1_1', 'key2': 'v1_2', 'key3': 'v1_3'}, {'key1': 'v2_1', 'key2': 'v2_2', 'key3': 'v2_3'}]}
>>> from pprint import pp
>>> pp({"rows": [dict(zip(key_list, row)) for row in val_list]})
{'rows': [{'key1': 'v0_1', 'key2': 'v0_2', 'key3': 'v0_3'},
          {'key1': 'v1_1', 'key2': 'v1_2', 'key3': 'v1_3'},
          {'key1': 'v2_1', 'key2': 'v2_2', 'key3': 'v2_3'}]}

dict.fromkeys() is the wrong tool here as it reuses the second argument for each of the keys.
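To make the difference concrete, here is a small comparison on a single row of the sample data from above (the names `wrong` and `right` are just labels for this sketch):

```python
key_list = ['key1', 'key2', 'key3']
row = ['v0_1', 'v0_2', 'v0_3']

# fromkeys() assigns the SAME second argument (here: the whole row) to every key.
wrong = dict.fromkeys(key_list, row)

# zip() pairs each key with its own value.
right = dict(zip(key_list, row))

print(wrong['key1'])  # ['v0_1', 'v0_2', 'v0_3']
print(right)          # {'key1': 'v0_1', 'key2': 'v0_2', 'key3': 'v0_3'}
```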

Iterate through multiple lists and a conditional if-statement

Looks like you just inverted c and n in your loop:

for v, n, c in zip(values, nodes, cells):
    if v == 1:
        print('Some Text', n, 'Some Text', n, 'text', c, 'Some Text')

NB. You don't need to add spaces around the chunks when passing several arguments to print; it separates them with a single space by default.

output:

Some Text 123 Some Text 123 text ABC Some Text
Some Text 456 Some Text 456 text DEF Some Text
Some Text 789 Some Text 789 text GHI Some Text
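If you want full control over the spacing, an f-string is an alternative to passing many arguments to print. A runnable sketch; the contents of values, nodes and cells are assumptions of mine, picked to reproduce the output above, and `lines` is only there to hold the formatted strings:

```python
# Hypothetical input data, chosen to reproduce the output shown above.
values = [1, 1, 1]
nodes = [123, 456, 789]
cells = ['ABC', 'DEF', 'GHI']

lines = []
for v, n, c in zip(values, nodes, cells):
    if v == 1:
        # The f-string spells out every space explicitly.
        lines.append(f'Some Text {n} Some Text {n} text {c} Some Text')

print('\n'.join(lines))
```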

