Google sheets - cross join / cartesian join from two separate columns
Use this:
=ARRAYFORMULA(
SPLIT(
FLATTEN(
FILTER('input 1'!A2:A, 'input 1'!A2:A <> "")
& ","
& TRANSPOSE(FILTER('input 2'!A2:A, 'input 2'!A2:A <> ""))
),
","
)
)
Enter it in cell A3, then format column B as a date
(you can copy the format from column E).
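To check the formula's logic outside the spreadsheet, the same steps can be sketched in Python: drop blanks, join every pair with a comma, flatten the grid row by row, then split each string again. The function name `cross_join_columns` and the sample values are hypothetical. Like the formula, this sketch would split incorrectly if a value contained the delimiter itself.

```python
def cross_join_columns(col_a, col_b, sep=","):
    """Mimic FILTER -> join with TRANSPOSE -> FLATTEN -> SPLIT."""
    # FILTER(range, range <> ""): keep only non-empty cells.
    a = [v for v in col_a if v != ""]
    b = [v for v in col_b if v != ""]
    # column & "," & TRANSPOSE(column) builds a len(a) x len(b) grid of strings.
    grid = [[f"{x}{sep}{y}" for y in b] for x in a]
    # FLATTEN reads the grid row by row into a single column.
    flat = [cell for row in grid for cell in row]
    # SPLIT turns each joined string back into two cells.
    return [cell.split(sep) for cell in flat]


# Hypothetical inputs standing in for 'input 1'!A2:A and 'input 2'!A2:A.
rows = cross_join_columns(["a", "b", ""], ["2024-01-01", "2024-01-02"])
for row in rows:
    print(row)
```

If the values may contain commas, a delimiter that cannot occur in the data is a safer choice for both the formula and the sketch.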
Excel - Generating the Cartesian product of three sets of numbers in each row
Here is a VBA function which can handle the special case of 3 number sets:
Function CartesianProduct(nums1 As Range, nums2 As Range, nums3 As Range) As Variant
    Dim n As Long 'number of products
    Dim i As Long, j As Long, k As Long, r As Long
    Dim products As Variant

    n = nums1.Cells.Count * nums2.Cells.Count * nums3.Cells.Count
    ReDim products(1 To n, 1 To 3)

    For i = 1 To nums1.Cells.Count
        For j = 1 To nums2.Cells.Count
            For k = 1 To nums3.Cells.Count
                r = r + 1 'current row
                products(r, 1) = nums1.Cells(i)
                products(r, 2) = nums2.Cells(j)
                products(r, 3) = nums3.Cells(k)
            Next k
        Next j
    Next i

    CartesianProduct = products
End Function
This can be called from another VBA function or sub, or used directly as an array formula in the sheet.
In my example I selected the range A3:C8 (its size has to be determined ahead of time), entered the formula
=CartesianProduct(A1,B1:D1,E1:F1)
and then confirmed it as an array formula with Ctrl+Shift+Enter.
Once you get beyond three sets, things get a bit tricky since you can't hardwire in the necessary levels for a looping approach and would instead probably use a recursive approach, something along the lines of this answer: https://stackoverflow.com/a/31622856/4996248
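To illustrate the recursive idea without hardwiring the number of loops, here is a minimal sketch in Python rather than VBA. It is not the code from the linked answer; the function name `cartesian_product` and the sample numbers (standing in for A1, B1:D1 and E1:F1) are hypothetical.

```python
def cartesian_product(sets):
    """Recursively build the Cartesian product of any number of sequences."""
    if not sets:
        # Base case: the product of zero sets is one empty combination.
        return [[]]
    first, rest = sets[0], sets[1:]
    # Solve the smaller problem once, then prefix each result with every
    # item of the first set.
    tails = cartesian_product(rest)
    return [[item] + tail for item in first for tail in tails]


# One value, three values and two values give 1 * 3 * 2 = 6 rows.
rows = cartesian_product([[1], [2, 3, 4], [5, 6]])
for row in rows:
    print(row)
```

The rows come out in the same order as the nested loops above, with the last set varying fastest. In Python itself, `itertools.product` from the standard library yields the same combinations as tuples.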
cartesian product in pandas
In recent versions of pandas (>= 1.2) this is built into merge, so you can do:
from pandas import DataFrame
df1 = DataFrame({'col1':[1,2],'col2':[3,4]})
df2 = DataFrame({'col3':[5,6]})
df1.merge(df2, how='cross')
This is equivalent to the previous pandas < 1.2 answer but is easier to read.
For pandas < 1.2:
If you have a key that is repeated for each row, then you can produce a cartesian product using merge (like you would in SQL).
from pandas import DataFrame, merge
df1 = DataFrame({'key':[1,1], 'col1':[1,2],'col2':[3,4]})
df2 = DataFrame({'key':[1,1], 'col3':[5,6]})
merge(df1, df2, on='key')[['col1', 'col2', 'col3']]
Output:
   col1  col2  col3
0     1     3     5
1     1     3     6
2     2     4     5
3     2     4     6
See here for the documentation: http://pandas.pydata.org/pandas-docs/stable/merging.html
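The "like you would in SQL" remark can be checked with Python's built-in sqlite3 module. The sketch below assumes two hypothetical tables, `t1` and `t2`, holding the same values as the DataFrames above, with a column `k` in the role of `key`. Joining on a key that is identical in every row should return the same rows as an explicit CROSS JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (k INTEGER, col1 INTEGER, col2 INTEGER)")
conn.execute("CREATE TABLE t2 (k INTEGER, col3 INTEGER)")
# Same values as df1 and df2; k is 1 in every row.
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)", [(1, 1, 3), (1, 2, 4)])
conn.executemany("INSERT INTO t2 VALUES (?, ?)", [(1, 5), (1, 6)])

# Equi-join on the constant key.
keyed = conn.execute(
    "SELECT col1, col2, col3 FROM t1 JOIN t2 ON t1.k = t2.k "
    "ORDER BY col1, col3"
).fetchall()

# Explicit cross join, which needs no key at all.
crossed = conn.execute(
    "SELECT col1, col2, col3 FROM t1 CROSS JOIN t2 ORDER BY col1, col3"
).fetchall()

print(keyed)
print(keyed == crossed)
conn.close()
```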
Create the cartesian product (cross join) of two csv files in python
I don't know of any out-of-the-box solution, so I made this:
import csv
from itertools import product

def main():
    # In Python 3 the csv module expects text-mode files opened with newline=''.
    with open('file1.csv', newline='') as f1, open('file2.csv', newline='') as f2:
        reader1 = csv.reader(f1, dialect=csv.excel_tab)
        reader2 = csv.reader(f2, dialect=csv.excel_tab)

        # Step 1: Read and write the headers separately.
        header1, header2 = next(reader1), next(reader2)

        with open('output.csv', 'w', newline='') as out:
            writer = csv.writer(out, dialect=csv.excel_tab)
            writer.writerow(header1 + header2)

            # Step 2: Write the product of the rest of the rows.
            writer.writerows(
                row1 + row2 for row1, row2 in product(reader1, reader2))

main()
With files:
file1.csv
hour Elevation Azimuth
06:29:00 -0.833 67.72
07:00:00 6.28 68.75
file2.csv
ID SURFACES
1 GROUND
2 ROOF
you get the following output.csv:
hour Elevation Azimuth ID SURFACES
06:29:00 -0.833 67.72 1 GROUND
06:29:00 -0.833 67.72 2 ROOF
07:00:00 6.28 68.75 1 GROUND
07:00:00 6.28 68.75 2 ROOF
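One thing to be aware of: itertools.product reads both inputs completely into memory before it yields the first pair. If the first file is large, a variant that keeps only the second file's rows in memory may be preferable. The sketch below is one way to do that. The function name `cross_join_csv` is hypothetical, the in-memory `io.StringIO` objects stand in for the real files, and the sample data is assumed to be tab-separated, as the excel_tab dialect implies.

```python
import csv
import io


def cross_join_csv(left, right, out):
    """Write the cross join of two tab-separated inputs to `out`."""
    reader1 = csv.reader(left, dialect=csv.excel_tab)
    reader2 = csv.reader(right, dialect=csv.excel_tab)
    writer = csv.writer(out, dialect=csv.excel_tab)

    # Headers are combined once, before any data rows.
    header1, header2 = next(reader1), next(reader2)
    writer.writerow(header1 + header2)

    # Only the second input is held in memory; the first is streamed.
    rows2 = list(reader2)
    for row1 in reader1:
        writer.writerows(row1 + row2 for row2 in rows2)


# In-memory stand-ins for file1.csv and file2.csv.
file1 = io.StringIO(
    "hour\tElevation\tAzimuth\n"
    "06:29:00\t-0.833\t67.72\n"
    "07:00:00\t6.28\t68.75\n"
)
file2 = io.StringIO("ID\tSURFACES\n1\tGROUND\n2\tROOF\n")
output = io.StringIO()

cross_join_csv(file1, file2, output)
print(output.getvalue())
```

With real files, pass handles opened with newline='' in place of the StringIO objects.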
Generate all possible combinations for Columns(cross join or Cartesian product)
Update (2018-10)
The original formula crashes on large datasets. I described a way to build a cross join for data of any size here.
Try formula:
=ArrayFormula(transpose(split(rept(concatenate(A2:A&char(9)),counta(B2:B)),char(9)))
&" "&transpose(split(concatenate(rept(B2:B&char(9),counta(A2:A))),char(9))))
The result:
car red
train red
car yellow
train yellow
car blue
train blue
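The order of the result follows from how the two halves of the formula repeat their inputs: the first half repeats the whole of column A once per item in B, while the second half repeats each item of B once per item in A. A small Python sketch of that logic is below. The lists `a` and `b` are assumed column contents inferred from the result above.

```python
# Assumed contents of A2:A and B2:B, inferred from the result shown above.
a = ["car", "train"]
b = ["red", "yellow", "blue"]

# rept(concatenate(A2:A & char(9)), counta(B2:B)):
# the whole A list, repeated once per item in B.
left = a * len(b)

# concatenate(rept(B2:B & char(9), counta(A2:A))):
# each B item repeated once per item in A, in order.
right = [color for color in b for _ in a]

# The two split-and-transposed columns are joined pairwise with a space.
result = [f"{x} {y}" for x, y in zip(left, right)]
for line in result:
    print(line)
```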
You may use it again to add another list:
The formula is in cells C2 and E2. C2 is:
=ArrayFormula(transpose(split(rept(concatenate(A2:A&char(9)),counta(B2:B)),char(9)))&" "&transpose(split(concatenate(rept(B2:B&char(9),counta(A2:A))),char(9))) )
and E2 is:
=ArrayFormula(transpose(split(rept(concatenate(C2:C&char(9)),counta(D2:D)),char(9)))&" "&transpose(split(concatenate(rept(D2:D&char(9),counta(C2:C))),char(9))) )