What's a zip join? Have you ever heard of that, or a pairwise join?
Zip joins are only meaningful when talking about ordered sets. Instead of joining based on the value of a column, you are joining based on the row number.
Table1
[λ]   [color]
400   violet
415   indigo
475   blue
510   green
570   yellow
590   orange
650   red
Table2
[flame]   [element]
green     boron
yellow    sodium
white     magnesium
red       calcium
blue      indium
Table1 INNER JOIN Table2 ON [color] = [flame] : only matching rows
[λ]   [color]   [flame]   [element]
475   blue      blue      indium
510   green     green     boron
570   yellow    yellow    sodium
650   red       red       calcium
Table1 FULL OUTER JOIN Table2 ON [color] = [flame] : all rows from both tables, matched where possible
[λ]    [color]   [flame]   [element]
400    violet    NULL      NULL
415    indigo    NULL      NULL
475    blue      blue      indium
510    green     green     boron
570    yellow    yellow    sodium
590    orange    NULL      NULL
650    red       red       calcium
NULL   NULL      white     magnesium
Table1 "zip joined" to Table2 : all rows, regardless of match
[λ]   [color]   [flame]   [element]
400   violet    green     boron
415   indigo    yellow    sodium
475   blue      white     magnesium
510   green     red       calcium
570   yellow    blue      indium
590   orange    NULL      NULL
650   red       NULL      NULL
A zip join combines the data like a zipper: the first row of one table is paired with the first row of the other, the second with the second, and so on. It doesn't actually look at the data. Zip joins can be generated very quickly, but they won't mean anything unless your data already has some meaningful order, or unless you just want to generate arbitrary pairings.
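To make the row-number pairing concrete, here is a minimal Python sketch of the zip join above. The helper name zip_join and its width parameters are made up for illustration; itertools.zip_longest does the positional pairing, and None stands in for SQL NULL on the shorter side.

```python
from itertools import zip_longest

# Rows copied from Table1 and Table2 above.
table1 = [(400, "violet"), (415, "indigo"), (475, "blue"), (510, "green"),
          (570, "yellow"), (590, "orange"), (650, "red")]
table2 = [("green", "boron"), ("yellow", "sodium"), ("white", "magnesium"),
          ("red", "calcium"), ("blue", "indium")]

def zip_join(left, right, left_width, right_width):
    """Hypothetical helper: pair rows by position, padding with None (NULL)."""
    pad_left = (None,) * left_width
    pad_right = (None,) * right_width
    joined = []
    for left_row, right_row in zip_longest(left, right):
        # zip_longest yields None once one side runs out of rows.
        left_row = pad_left if left_row is None else left_row
        right_row = pad_right if right_row is None else right_row
        joined.append(left_row + right_row)
    return joined

rows = zip_join(table1, table2, 2, 2)
for row in rows:
    print(row)
```

No column values are compared anywhere in the helper, which is why the result is only meaningful when both inputs already share an order.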
python3 join lists that have same value in list of lists
This can be seen as a graph problem in which you merge subgraphs and need to find the connected components.
[Figure: the input lists drawn as a graph]
networkx
Using networkx, you can do:
import networkx as nx
from itertools import chain, pairwise

# python < 3.10 use this recipe for pairwise instead
# from itertools import tee
# def pairwise(iterable):
#     a, b = tee(iterable)
#     next(b, None)
#     return zip(a, b)

# l is the list of lists from the question
# consecutive items of each sublist become edges
G = nx.from_edgelist(chain.from_iterable(pairwise(e) for e in l))
G.add_nodes_from(set.union(*map(set, l)))  # adding single items
list(nx.connected_components(G))
output:
[{1, 2, 3, 4}, {5, 6, 7, 8, 9}]
python
Now, you can use pure python to perform the same thing, finding the connected components and merging them.
Example code is nicely described in this post (archive.org link for the long term).
In summary, the first step is to build the neighbor sets for every item; a generator then walks the neighbors of neighbors iteratively, keeping track of the nodes already seen.
from collections import defaultdict

# merge function to merge all sublists having common elements
def merge_common(lists):
    neigh = defaultdict(set)
    visited = set()
    # every item is a neighbor of all items in the same sublist
    for each in lists:
        for item in each:
            neigh[item].update(each)

    def comp(node, neigh=neigh, visited=visited, vis=visited.add):
        # yield every node reachable from the starting node
        nodes = set([node])
        next_node = nodes.pop
        while nodes:
            node = next_node()
            vis(node)
            nodes |= neigh[node] - visited
            yield node

    for node in neigh:
        if node not in visited:
            yield sorted(comp(node))
example (merge_common is a generator, so wrap it in list):
list(merge_common(l))
# [[1, 2, 3, 4], [5, 6, 7, 8, 9]]
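As an alternative to the neighbor walk above, the same merge can be sketched with a disjoint-set (union-find) structure. This is a different technique from the one in the linked post, and both the function name merge_common_union_find and the sample input are hypothetical (the question's actual list is not reproduced here).

```python
def merge_common_union_find(lists):
    """Merge sublists that share any element, using a disjoint-set."""
    parent = {}

    def find(x):
        # Register unseen items as their own root, then walk up to the
        # root, shortening the path along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for sub in lists:
        for item in sub:
            # Linking everything to the first item also registers
            # single-item sublists.
            union(sub[0], item)

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), []).append(item)
    return sorted(sorted(group) for group in groups.values())

# Hypothetical input, not the one from the question.
sample = [[1, 2], [2, 3], [3, 4], [5, 6, 7], [7, 8], [9, 5]]
result = merge_common_union_find(sample)
print(result)  # [[1, 2, 3, 4], [5, 6, 7, 8, 9]]
```

Unlike merge_common, this version returns a list rather than a generator, so no list() wrapper is needed.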
Python 3: pairwise iterating through list
Use zip(*[iter(it)] * 2), as seen in this answer.
it = [1, 2, 3, 4, 5, 6]
for x, y in zip(*[iter(it)] * 2):
    print(x, y)
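The idiom is terse, so here is a small sketch of why it works and how it differs from overlapping pairs. The variable names are illustrative only.

```python
it = [1, 2, 3, 4, 5, 6]

# [iter(it)] * 2 is a list holding the SAME iterator object twice, so zip
# pulls two consecutive items from it for every tuple it builds.
shared = iter(it)
chunks = list(zip(shared, shared))
print(chunks)  # [(1, 2), (3, 4), (5, 6)]

# For overlapping (sliding) pairs, zip the list with itself shifted by one.
overlapping = list(zip(it, it[1:]))
print(overlapping)  # [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]

# With an odd number of items, zip stops early and the last item is dropped.
odd = list(zip(*[iter([1, 2, 3, 4, 5])] * 2))
print(odd)  # [(1, 2), (3, 4)]
```

If the trailing item matters, itertools.zip_longest can be used in place of zip to pad the final pair with a fill value.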
Return Alternating Letters With the Same Length From two Strings
str.join with zip is possible, since zip only iterates pairwise up to the shortest iterable. You can combine it with itertools.chain to flatten an iterable of tuples:
from itertools import chain

def one_each(st, dum):
    return ''.join(chain.from_iterable(zip(st, dum)))

x = one_each("bofa", "BOFAAAA")
print(x)
# bBoOfFaA