Combine Two Data Frames with All Possible Combinations


For this particular example you can just use the merge function. By default merge joins on the column names the two data frames share; since these data frames have no columns in common, there is nothing to join on and merge returns the Cartesian product, i.e. every combination of rows. The arguments all.x and all.y (which default to FALSE) are not needed for this.

df  <- data.frame(a = letters[1:10])
df1 <- data.frame(one = 1:10)

dfcomb <- merge(df,df1)
dim(dfcomb) 
[1] 100   2 #gives 100 rows and 2 columns

Merge DataFrames with all combinations of PKs

It looks like you want the Cartesian product of both dataframes. One approach is to add a key column that holds the same value in every row of both dataframes, and then merge them on this key.

This works because, as stated in the documentation, when a key combination appears more than once in both dataframes, pd.merge produces the Cartesian product of the matching rows:

It is worth spending some time understanding the result of the many-to-many join case. In SQL / standard relational algebra, if a key combination appears more than once in both tables, the resulting table will have the Cartesian product of the associated data.

Note that all types of merge yield the same rows in this case; left is chosen here simply so that the results keep the order of appearance in df1:

cp = df1.assign(key=0).merge(df2.assign(key=0), how='left', on = 'key')
cp[['USER_ID_hash','COUPON_ID_hash']]

       USER_ID_hash         COUPON_ID_hash
0        1118263  f4ecdf6a1ccd5c4ae83c7a83a9f68454
1        1118263  34a80aff2d963d28013110789cfdfdb9
2        1118263  f0ab02ced0e91e097abc1b3c89a9d9b0
3        1118263  e97a601bc242db11579f7eada35205fd
4        1118263  771941be288930b649c7bb214efe2b07
5        1118263  2faa0e168770fb01a21f3ddf1b23aeab
6        1118263  c340a7ab46f19c1ba3b57b513f16bead
7        1118263  d84eb4dfc6d0b7bf0e8c50c06d14e4fe
8        1118263  88419bf4acc9f7e5c3529fc3bde29da6
9        1118263  530aa1bf00c00f02255eeea81559b932
10       2669790  f4ecdf6a1ccd5c4ae83c7a83a9f68454
11       2669790  34a80aff2d963d28013110789cfdfdb9
12       2669790  f0ab02ced0e91e097abc1b3c89a9d9b0
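
The constant-key merge above can be reproduced on a small scale. The following is a minimal sketch, assuming two tiny stand-in frames (users and coupons, with made-up coupon values) in place of the df1 and df2 from the question, which are not shown here:

```python
import pandas as pd

# Hypothetical stand-ins for df1 and df2; only the column names come from the answer.
users = pd.DataFrame({"USER_ID_hash": [1118263, 2669790]})
coupons = pd.DataFrame({"COUPON_ID_hash": ["c1", "c2", "c3"]})

# Give every row in both frames the same key, merge on it, then drop the key.
cp = (
    users.assign(key=0)
    .merge(coupons.assign(key=0), how="left", on="key")
    .drop(columns="key")
)

print(cp)
```

Every user row is paired with every coupon row, so the result has len(users) * len(coupons) rows.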

Python: combine two data frames with all combinations of index

You can create a merging index with:

df['temp'] = 1

And then merge the dataframe to itself with:

merged_df = df.merge(df, on='temp', suffixes=('A', 'B')).drop('temp', axis=1)

If you don't want rows that pair an id_number with itself, filter them out afterwards:

merged_df = merged_df[merged_df['id_numberA'] != merged_df['id_numberB']]

If you also don't want both orderings of the same pair (A-B as well as B-A), use this filter instead, which keeps each pair once and drops the self-pairs too:

merged_df = merged_df[merged_df['id_numberA'] < merged_df['id_numberB']]
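
To see what each filter keeps, here is a small sketch, assuming a hypothetical three-row frame with an id_number column and one extra column (value) that is not part of the original question:

```python
import pandas as pd

# Hypothetical input; only the id_number column name comes from the answer.
df = pd.DataFrame({"id_number": [1, 2, 3], "value": ["x", "y", "z"]})

# Self cross join through a temporary constant column.
df["temp"] = 1
merged_df = df.merge(df, on="temp", suffixes=("A", "B")).drop("temp", axis=1)

# Drop rows that pair an id with itself (keeps both A-B and B-A).
no_self = merged_df[merged_df["id_numberA"] != merged_df["id_numberB"]]

# Keep each unordered pair exactly once.
unique_pairs = merged_df[merged_df["id_numberA"] < merged_df["id_numberB"]]

print(len(merged_df), len(no_self), len(unique_pairs))
```

With n rows, the three frames hold n*n, n*(n-1) and n*(n-1)/2 rows respectively.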

How to concatenate combinations of rows from two different dataframes?

Use itertools.product():

import itertools
pd.DataFrame(list(itertools.product(df1.A,df2.B)),columns=['A','B'])
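
A self-contained version of the same idea, assuming two hypothetical single-column frames. Note that this pairs the values of one column from each frame, not whole rows:

```python
import itertools

import pandas as pd

# Hypothetical inputs; the column names A and B follow the answer above.
df1 = pd.DataFrame({"A": [1, 2]})
df2 = pd.DataFrame({"B": ["x", "y", "z"]})

# itertools.product yields every (a, b) pair, with the first argument varying slowest.
pairs = pd.DataFrame(
    list(itertools.product(df1["A"], df2["B"])),
    columns=["A", "B"],
)

print(pairs)
```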

Create combination of two pandas dataframes in two dimensions

Update: in pandas 1.2.0+, a cross join is built in:

df1.merge(df2, how='cross')

In older versions, set a common key in both dataframes and use pd.merge:

df1['key'] = 1
df2['key'] = 1

Merge and drop key column:

df3 = pd.merge(df1,df2,on='key').drop('key',axis=1)
df3

Output:

   df1_id  other_data_1  other_data_2  df2_id  other_data_3  other_data_4
0       1             0             1       1             0             1
1       1             0             1       3             2             2
2       2             1             5       1             0             1
3       2             1             5       3             2             2
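
Both routes can be checked against each other. The sketch below rebuilds df1 and df2 from the values visible in the output above (the original input frames are not shown, so this reconstruction is an assumption) and compares the built-in cross join with the key-based merge:

```python
import pandas as pd

# Inputs reconstructed from the output table above (assumed, not given in the answer).
df1 = pd.DataFrame({"df1_id": [1, 2], "other_data_1": [0, 1], "other_data_2": [1, 5]})
df2 = pd.DataFrame({"df2_id": [1, 3], "other_data_3": [0, 2], "other_data_4": [1, 2]})

# Built-in cross join (requires pandas 1.2.0 or later).
cross = df1.merge(df2, how="cross")

# Key-based equivalent; assign() adds the key without modifying df1 or df2.
keyed = df1.assign(key=1).merge(df2.assign(key=1), on="key").drop(columns="key")

print(cross)
print(cross.values.tolist() == keyed.values.tolist())
```

Using assign() instead of df1['key'] = 1 leaves the original frames untouched, which avoids a leftover key column in later steps.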

Merge two data frames with all combinations

You can first reshape the two data frames (using melt from reshape2), then merge them (using full_join from dplyr) and reshape that back into the original format:

library(dplyr)
library(reshape2)

df1.b <- melt(df1)
df2.b <- melt(df2)

df3 <- full_join(df1.b, df2.b)

df3 <- na.omit(df3)

df3.b <- dcast(df3, nr + Color + Part ~ variable)

    nr    Color  Part mach1 mach2 mach3 mach4
1  000    Brown  Tree    NA    NA    NA     1
2  000     Cyan  Tree    NA    NA     1    NA
3  000      Red   Car    NA     1    NA    NA
4  000      Red  Tree     1    NA    NA    NA
5  001     Blue  Flag     1     1     1     1
6  002    Green  Road     1     1    NA    NA
7  002    Green  Tree    NA    NA    NA     1
8  002      Red  Tree    NA    NA     1    NA
9  003 DeepBlue House     1     1    NA    NA
10 003 DeepBlue  Road    NA    NA     1     1
11 004     Cyan Plane     1    NA    NA    NA
12 004     Grey Plane    NA    NA     1     1
13 004   Orange Plane    NA     1    NA    NA
14 005  Magenta   Car     1    NA    NA    NA
15 005  Magenta House    NA     1     1     1

Combining two dataframes together to create a long dataframe with all combinations

It's called a cross join (or Cartesian product), and it is easy enough to do in base R, provided the two data frames share no column names:

merge(datA,datB,all=TRUE)

All combinations of two dataframes

Assuming you start with dataframes df1 and df2, first melt df2 (wide-to-long), convert your percentages to numeric, then merge with df1:

df2 = pd.melt(df2, id_vars='Week', value_vars=['Product1', 'Product2', 'Product3'])
df2['value'] = pd.to_numeric(df2['value'].str[:-1])

res = df1.merge(df2)\
         .eval('Value = Value * value / 100')\
         .drop(columns='value')

print(res)

    id Country Channel     Week  Value  variable
0    0      ES   Train  2018-48  250.0  Product1
1    0      ES   Train  2018-48  500.0  Product2
2    0      ES   Train  2018-48  250.0  Product3
3    1      ES   Train  2018-49  250.0  Product1
4    1      ES   Train  2018-49  500.0  Product2
5    1      ES   Train  2018-49  250.0  Product3
6    2      ES   Train  2018-50  250.0  Product1
7    2      ES   Train  2018-50  500.0  Product2
8    2      ES   Train  2018-50  250.0  Product3
9    3      ES   Train  2018-51  250.0  Product1
10   3      ES   Train  2018-51  500.0  Product2
11   3      ES   Train  2018-51  250.0  Product3
12   4      ES   Train  2018-52  250.0  Product1
13   4      ES   Train  2018-52  500.0  Product2
14   4      ES   Train  2018-52  250.0  Product3
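
Since the input frames are not shown, here is a minimal end-to-end sketch with hypothetical data chosen to be consistent with the output above (a Value of 1000 per week and shares of 25%, 50% and 25%). It uses plain column arithmetic in place of eval, which is only a stylistic substitution:

```python
import pandas as pd

# Hypothetical inputs, shaped to match the columns in the printed result.
df1 = pd.DataFrame({
    "id": [0, 1],
    "Country": ["ES", "ES"],
    "Channel": ["Train", "Train"],
    "Week": ["2018-48", "2018-49"],
    "Value": [1000, 1000],
})
df2 = pd.DataFrame({
    "Week": ["2018-48", "2018-49"],
    "Product1": ["25%", "25%"],
    "Product2": ["50%", "50%"],
    "Product3": ["25%", "25%"],
})

# Wide to long: one row per (Week, product), then strip the trailing '%'.
df2_long = pd.melt(df2, id_vars="Week", value_vars=["Product1", "Product2", "Product3"])
df2_long["value"] = pd.to_numeric(df2_long["value"].str[:-1])

# The only shared column name is Week, so merge joins on it.
res = df1.merge(df2_long)
res["Value"] = res["Value"] * res["value"] / 100
res = res.drop(columns="value")

print(res)
```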

Merge all possible combinations of multiple data frames

You can do:

# Put all frames in a list
d <- list(frame1, frame2, frame3)
# get all merge-combinations
gr <- expand.grid(seq_along(d), seq_along(d))

# function to merge and get the sum diff:
foo <- function(i, x, gr){
  tmp <- merge(x[[gr[i, 1]]], x[[gr[i, 2]]], by=c("a", "b"))
  sum(abs(tmp$c.x - tmp$c.y))
}

# result matrix
matrix(sapply(seq_len(nrow(gr)), foo, d, gr), length(d), length(d), byrow = TRUE)
      [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0
[3,]    0    0    0

# The scores are arranged as follows (frame indices, column-index_row-index):
matrix(apply(gr, 1, paste, collapse = "_"), 3, 3, byrow = TRUE)
      [,1]  [,2]  [,3] 
[1,] "1_1" "2_1" "3_1"
[2,] "1_2" "2_2" "3_2"
[3,] "1_3" "2_3" "3_3"


# alternative using apply:

# function to merge and get the sum diff:
foo <- function(y, x){
  tmp <- merge(x[[ y[1] ]], x[[ y[2] ]], by=c("a", "b"))
  sum(abs(tmp$c.x - tmp$c.y))
}
# result matrix
matrix(apply(gr, 1, foo, d), length(d), length(d), byrow = TRUE)
