Merge Dataframes of Different Sizes

Pandas: combine data frames of different sizes

Perform a left merge on the 'product_id' column:

In [12]: df.merge(df1, on='product_id', how='left')
Out[12]:
   product_id  count_white  total_count
0       12345            4           10
1       23456            7           30
2       34567            1           90
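
The answer does not show the input frames, so the following is a minimal sketch with made-up inputs (the contents of df and df1, including the extra product 45678, are assumptions chosen to reproduce the output above). It illustrates that a left merge keeps every row of the left frame and ignores rows of the right frame that have no matching key:

```python
import pandas as pd

# Hypothetical inputs: the original df and df1 are not shown in the answer.
df = pd.DataFrame({"product_id": [12345, 23456, 34567],
                   "count_white": [4, 7, 1]})
df1 = pd.DataFrame({"product_id": [12345, 23456, 34567, 45678],
                    "total_count": [10, 30, 90, 50]})

# how="left" keeps all rows of df; the unmatched df1 row (45678) is dropped.
result = df.merge(df1, on="product_id", how="left")
print(result)
```

If a product in df had no match in df1, its total_count would be NaN rather than the row being dropped.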

Merge two dataframes of different sizes after a groupby function

An inner join (the default for merge) is used to combine the two dataframes. Because the original df has duplicate rows for the same merge keys, merging it directly returned duplicated rows. Calling drop_duplicates() on df before the merge solves that problem.

Code

df_cut.merge(df.drop_duplicates(), on=["COD","TEC","SET", "AZIM"])

Output

          COD  TEC  SET  AZIM STATE CITY
0  ALAAD_0001    4    1     0    AL  MAC
1  ALAAD_0001    4    2   120    AL  MAC
2  ALAAD_0001    4    3   240    AL  MAC
3  BAPID_0001    2    1    20    BA  SAL
4  BAPID_0001    2    2   100    BA  SAL
5  BAPID_0001    2    3   250    BA  SAL
6  CEMBC_0003    4    1    90    CE  FOR
7  CEMBC_0003    4    2   160    CE  FOR
8  CEMBC_0003    4    3   280    CE  FOR
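
To see why drop_duplicates() matters here, the sketch below compares the merge with and without it. The sample rows are assumptions (the original df and df_cut are not shown); df_cut stands in for the result of the groupby step:

```python
import pandas as pd

keys = ["COD", "TEC", "SET", "AZIM"]

# Hypothetical original frame in which every row appears twice.
df = pd.DataFrame({
    "COD": ["ALAAD_0001"] * 4,
    "TEC": [4, 4, 4, 4],
    "SET": [1, 1, 2, 2],
    "AZIM": [0, 0, 120, 120],
    "STATE": ["AL"] * 4,
    "CITY": ["MAC"] * 4,
})

# Hypothetical stand-in for the groupby result: one row per key combination.
df_cut = pd.DataFrame({
    "COD": ["ALAAD_0001", "ALAAD_0001"],
    "TEC": [4, 4],
    "SET": [1, 2],
    "AZIM": [0, 120],
})

# Each df_cut row matches two rows in df, so the plain merge has 4 rows.
naive = df_cut.merge(df, on=keys)

# Removing exact duplicate rows first yields one row per key: 2 rows.
deduped = df_cut.merge(df.drop_duplicates(), on=keys)

print(len(naive), len(deduped))
```

Note that drop_duplicates() without arguments only removes rows that are identical in every column. If rows share the merge keys but differ elsewhere (for example in CITY), passing subset=keys would be needed to keep a single row per key.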

How to merge two Pandas DataFrames of different size based on condition

Try adding an indicator column to o_type_df:

o_type_df['TypeID'] = 'O'

Then merge left on those columns:

merged = (
    primary_df.merge(o_type_df,
                     left_on=['RCID', 'TypeID'],
                     right_on=['O_ID', 'TypeID'],
                     how='left')
)

merged:

   RCID TypeID   Data   O_ID O_Data
0   777      D  Hello    NaN    NaN
1   777      O    Hey  777.0    Foo
2   778      O    Hey  778.0    Bar
3   779      D  Hello    NaN    NaN

Or use assign, which adds the column on the fly without modifying o_type_df:

merged = (
    primary_df.merge(o_type_df.assign(TypeID='O'),
                     left_on=['RCID', 'TypeID'],
                     right_on=['O_ID', 'TypeID'],
                     how='left')
)

merged:

   RCID TypeID   Data   O_ID O_Data
0   777      D  Hello    NaN    NaN
1   777      O    Hey  777.0    Foo
2   778      O    Hey  778.0    Bar
3   779      D  Hello    NaN    NaN
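
For reference, here is a self-contained sketch of the assign variant. The two input frames are not shown in the answer, so they are reconstructed here from the merged output and should be read as assumptions:

```python
import pandas as pd

# Inputs reconstructed from the merged output above (assumed, not original).
primary_df = pd.DataFrame({
    "RCID": [777, 777, 778, 779],
    "TypeID": ["D", "O", "O", "D"],
    "Data": ["Hello", "Hey", "Hey", "Hello"],
})
o_type_df = pd.DataFrame({"O_ID": [777, 778], "O_Data": ["Foo", "Bar"]})

# assign() returns a copy carrying the constant TypeID column, so only
# primary_df rows with TypeID == 'O' can match; o_type_df is left unchanged.
merged = primary_df.merge(
    o_type_df.assign(TypeID="O"),
    left_on=["RCID", "TypeID"],
    right_on=["O_ID", "TypeID"],
    how="left",
)
print(merged)
```

Rows with TypeID 'D' keep their place in the result but get NaN in the columns coming from o_type_df, which is why O_ID is displayed as a float.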

Concatenate two dataframes of different sizes (pandas)

In this case, use combine_first after setting 'id' as the index of both dataframes:

df1.set_index('id').combine_first(df2.set_index('id')).reset_index()
Out[766]:
   id  metric1  metric2
0   a    123.0      1.0
1   b     22.0      2.0
2   c    356.0      3.0
3   d    412.0      4.0
4   f     54.0      5.0
5   g    634.0      6.0
6   h     72.0      7.0
7   j    812.0      8.0
8   k    129.0      9.0
9   l    110.0     10.0
10  m    200.0     11.0
11  q    812.0      NaN
12  w    110.0      NaN
13  z    129.0      NaN
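
A smaller, runnable version of the same idea is sketched below. The input frames are assumptions (the answer shows only the result): df1 is taken to hold metric1 for every id and df2 to hold metric2 for a subset of them:

```python
import pandas as pd

# Hypothetical inputs of different sizes; the originals are not shown.
df1 = pd.DataFrame({"id": ["a", "b", "q"], "metric1": [123, 22, 812]})
df2 = pd.DataFrame({"id": ["a", "b"], "metric2": [1, 2]})

# combine_first aligns on the index, takes the union of rows and columns,
# and fills values missing from the caller with values from the argument.
out = (
    df1.set_index("id")
       .combine_first(df2.set_index("id"))
       .reset_index()
)
print(out)
```

The id 'q' exists only in df1, so its metric2 is NaN in the result, matching the last rows of the output above.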

