Combine Lists with Common Elements

Merge lists that share common elements

You can see your list as a notation for a Graph, ie ['a','b','c'] is a graph with 3 nodes connected to each other. The problem you are trying to solve is finding connected components in this graph.

You can use NetworkX for this, which has the advantage that it's pretty much guaranteed to be correct:

l = [['a','b','c'],['b','d','e'],['k'],['o','p'],['e','f'],['p','a'],['d','g']]

import networkx
from networkx.algorithms.components.connected import connected_components

def to_graph(l):
G = networkx.Graph()
for part in l:
# each sublist is a bunch of nodes
G.add_nodes_from(part)
# it also imlies a number of edges:
G.add_edges_from(to_edges(part))
return G

def to_edges(l):
"""
treat `l` as a Graph and returns it's edges
to_edges(['a','b','c','d']) -> [(a,b), (b,c),(c,d)]
"""
it = iter(l)
last = next(it)

for current in it:
yield last, current
last = current

G = to_graph(l)
print connected_components(G)
# prints [['a', 'c', 'b', 'e', 'd', 'g', 'f', 'o', 'p'], ['k']]

To solve this efficiently yourself you have to convert the list into something graph-ish anyways, so you might as well use networkX from the start.

How to merge lists with common elements in a list of lists?

You need to use recursion:

def group(d, _start, _c = [], _seen = [], _used=[]):
r = [i for i in d if any(c in _start for c in i) and i not in _seen and i not in _used]
if not r:
yield set(_c)
for i in d:
if i != _start and i not in _used:
yield from group(d, i, _c=[], _seen=[], _used=_used+[i, *r])
else:
yield from group(d, _start, _c=_c+[i for b in r for i in b], _seen=_seen+r, _used=_used+r)

data = [[1, 7, 3], [1, 7, 5], [2, 0, 4], [2, 0, 6], [3, 7, 1], [3, 7, 5], [4, 0, 2], [4, 0, 6], [5, 7, 1], [5, 7, 3], [6, 0, 2], [6, 0, 4]]
result = list(map(list, {tuple(i) for i in group(data, data[0], _seen=[data[0]]) if i}))

Output:

[[0, 2, 4, 6], [1, 3, 5, 7]]

Merge List of lists where sublists have common elements

The following code should solve your problem:

def merge_subs(lst_of_lsts):
res = []
for row in lst_of_lsts:
for i, resrow in enumerate(res):
if row[0]==resrow[0]:
res[i] += row[1:]
break
else:
res.append(row)
return res

Note that the elsebelongs to the inner for and is executed if the loop is exited without hitting the break.

Merging sets with common elements?

Here are what I tried. Firstly, I did something simple but wrong

result = set()
while input:
r = set(input.pop())
for rr in input.copy():
if rr & r:
input.remove(rr)
r.update(rr)
result.add(frozenset(r))

The problem with this code is that the merging may not be complete, depending on the order of the input.pop(). Suppose input={frozenset([0, 1]), frozenset([2,3]), frozenset([1,2])} and the three elements get popped out and looped over in this assignment order, then results={frozenset([0,1,2]), frozenset([2,3])}.

Then I implemented the answer in this post. It first builds an adjacency list of a new graph, each node corresponds to one frozenset element of the input. An edge exists between two nodes (frozensets) if they share common elements. Then depth first graph traversal is used to find the connected components of this new graph.

result, visited = set(), set()
components = collections.defaultdict(list)
adj_list = collections.defaultdict(list)

def dft(node, key):
visited.add(node)
components[key].append(node)
for neighbor in adj_list[node]:
if neighbor not in visited:
dft(neighbor, key)

for r1, r2 in itertools.combinations_with_replacement(input, 2):
if r1 & r2:
adj_list[r1].append(r2)
adj_list[r2].append(r1)
for node in adj_list:
if node not in visited:
dft(node, node)

for node, neighbors in components.iteritems():
result.add(node.union(*neighbors))

How to merge multiple lists into 1 but only those elements that were in all of the initial lists?

Given some lists xs1, ..., xs5:

xss = [xs1, xs2, xs3, xs4, xs5]
sets = [set(xs) for xs in xss]
merged = set.intersection(*sets)

This has the property that merged may be in any order.



Related Topics



Leave a reply



Submit