Merge lists that share common elements
You can see your list as a notation for a Graph, ie ['a','b','c']
is a graph with 3 nodes connected to each other. The problem you are trying to solve is finding connected components in this graph.
You can use NetworkX for this, which has the advantage that it's pretty much guaranteed to be correct:
l = [['a','b','c'],['b','d','e'],['k'],['o','p'],['e','f'],['p','a'],['d','g']]
import networkx
from networkx.algorithms.components.connected import connected_components
def to_graph(l):
G = networkx.Graph()
for part in l:
# each sublist is a bunch of nodes
G.add_nodes_from(part)
# it also imlies a number of edges:
G.add_edges_from(to_edges(part))
return G
def to_edges(l):
"""
treat `l` as a Graph and returns it's edges
to_edges(['a','b','c','d']) -> [(a,b), (b,c),(c,d)]
"""
it = iter(l)
last = next(it)
for current in it:
yield last, current
last = current
G = to_graph(l)
print connected_components(G)
# prints [['a', 'c', 'b', 'e', 'd', 'g', 'f', 'o', 'p'], ['k']]
To solve this efficiently yourself you have to convert the list into something graph-ish anyways, so you might as well use networkX from the start.
How to merge lists with common elements in a list of lists?
You need to use recursion:
def group(d, _start, _c = [], _seen = [], _used=[]):
r = [i for i in d if any(c in _start for c in i) and i not in _seen and i not in _used]
if not r:
yield set(_c)
for i in d:
if i != _start and i not in _used:
yield from group(d, i, _c=[], _seen=[], _used=_used+[i, *r])
else:
yield from group(d, _start, _c=_c+[i for b in r for i in b], _seen=_seen+r, _used=_used+r)
data = [[1, 7, 3], [1, 7, 5], [2, 0, 4], [2, 0, 6], [3, 7, 1], [3, 7, 5], [4, 0, 2], [4, 0, 6], [5, 7, 1], [5, 7, 3], [6, 0, 2], [6, 0, 4]]
result = list(map(list, {tuple(i) for i in group(data, data[0], _seen=[data[0]]) if i}))
Output:
[[0, 2, 4, 6], [1, 3, 5, 7]]
How do I merge lists with common elements, where these elements themselves are lists/tuples?
Here is a solution that uses O(n) additional space which tracks all sub-lists that are already added using a hash table (thus, all lookups would just be O(1)).
added = set()
merged = []
for item in data:
filtered = list(filter(lambda value: value not in added, map(tuple, item)))
if filtered:
added.update(filtered)
merged.append(list(map(list, filtered)))
print(merged)
Output
[[[0, 1], [2, 3], [4, 5]], [[6, 7], [8, 9], [10, 11]], [[12, 13], [14, 15], [16, 17], [18, 19], [20, 21]]]
Update
The solution above only prevents duplicate list items from occurring again on the next list. Here, we wouldn't only prevent them from occurring again but also merge them on existing ones. As a result, the time complexity of this algorithm is somewhat O(n^2).
merged = []
for item in data:
item_set = set(map(tuple, item))
for result in merged:
if result & item_set:
result.update(item_set)
break
else:
merged.append(item_set)
merged = [sorted(map(list, item)) for item in merged]
print(merged)
Merge List of lists where sublists have common elements
The following code should solve your problem:
def merge_subs(lst_of_lsts):
res = []
for row in lst_of_lsts:
for i, resrow in enumerate(res):
if row[0]==resrow[0]:
res[i] += row[1:]
break
else:
res.append(row)
return res
Note that the else
belongs to the inner for
and is executed if the loop is exited without hitting the break.
merging sets which have even one element in common R
Here is another approach which constructs a matrix showing which elements of the list intersect with each other and uses the igraph
package to deduce the groups:
library(igraph)
## Construct the matrix
m = sapply(lista,function(x) sapply(lista,function(y) length(intersect(x,y))>0))
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] TRUE TRUE FALSE FALSE TRUE FALSE
[2,] TRUE TRUE FALSE FALSE FALSE FALSE
[3,] FALSE FALSE TRUE TRUE FALSE FALSE
[4,] FALSE FALSE TRUE TRUE FALSE FALSE
[5,] TRUE FALSE FALSE FALSE TRUE TRUE
[6,] FALSE FALSE FALSE FALSE TRUE TRUE
## Determine the groups of the graph constructed from m
groups = groups(components(graph_from_adjacency_matrix(m)))
$`1`
[1] 1 2 5 6
$`2`
[1] 3 4
## Get the unique elements of each group
res = lapply(groups,function(x) sort(unique(unlist(lista[x]))))
$`1`
[1] 1 2 4 6 7 8 9 10 11 12 19 32 34 35 36 37 38 39 40
$`2`
[1] 13 14 16 20 21 23 25 26 27 28 29 30 31
How to merge multiple lists into 1 but only those elements that were in all of the initial lists?
Given some lists xs1
, ..., xs5
:
xss = [xs1, xs2, xs3, xs4, xs5]
sets = [set(xs) for xs in xss]
merged = set.intersection(*sets)
This has the property that merged
may be in any order.
Related Topics
Conda Reports Packagesnotfounderror: Python=3.1 for Reticulate Environment
Placing Custom Images in a Plot Window--As Custom Data Markers or to Annotate Those Markers
Why Are Scripting Languages (E.G. Perl, Python, and Ruby) Not Suitable as Shell Languages
Parallel Processing from a Command Queue on Linux (Bash, Python, Ruby... Whatever)
Name This Python/Ruby Language Construct (Using Array Values to Satisfy Function Parameters)
Numpy Selecting Specific Column Index Per Row by Using a List of Indexes
How to Make a Cross-Module Variable
Priority of the Logical Operators Not, And, or in Python
Str' Object Does Not Support Item Assignment
Sharing a Result Queue Among Several Processes
How to Create a Reference to a Variable in Python
Why Is a List Comprehension So Much Faster Than Appending to a List
Show Default Value for Editing on Python Input Possible
R and Python in One Jupyter Notebook
R Expand.Grid() Function in Python
Get a List from Pandas Dataframe Column Headers
How to Get List of Methods in a Python Class
How to Convert the Background Color of Image to Match the Color of Pygame Window