pandas: merge (join) two data frames on multiple columns
Try this
new_df = pd.merge(A_df, B_df, how='left', left_on=['A_c1','c2'], right_on = ['B_c1','c2'])
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html
left_on : label or list, or array-like. Field names to join on in the left
DataFrame. Can be a vector or list of vectors of the length of the
DataFrame to use a particular vector as the join key instead of
columns.
right_on : label or list, or array-like. Field names to join on
in the right DataFrame, or vector/list of vectors per the left_on docs.
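To make the one-liner above concrete, here is a minimal sketch of a left merge on two key pairs. Only the column names A_c1, B_c1 and c2 come from the answer; the data and the val_a / val_b columns are made up for illustration.

```python
import pandas as pd

# Hypothetical data; only the key column names come from the answer above.
A_df = pd.DataFrame({
    "A_c1": [1, 2, 3],
    "c2": ["x", "y", "z"],
    "val_a": [10, 20, 30],
})
B_df = pd.DataFrame({
    "B_c1": [1, 2, 4],
    "c2": ["x", "q", "z"],
    "val_b": [100, 200, 400],
})

# A row matches only if BOTH key pairs agree: A_c1 == B_c1 and c2 == c2.
new_df = pd.merge(
    A_df, B_df, how="left",
    left_on=["A_c1", "c2"], right_on=["B_c1", "c2"],
)
print(new_df)
```

With how='left', every row of A_df is kept. Here only the (1, 'x') row finds a partner in B_df, so val_b is NaN in the other two rows.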
Pandas, merging two dataframes on multiple columns, and multiplying result
You could merge and then multiply:
merged = df1.merge(df2, on=['Name', 'Event'])
merged['ResultFactor'] = merged.Factor1 * merged.Factor2
result = merged.drop(['Factor1', 'Factor2'], axis=1)
print(result)
Output
Name Event ResultFactor
0 John A 2.4
1 John B 1.5
2 Ken A 3.0
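The original input frames are not shown here, so the following is a self-contained sketch with hypothetical df1 and df2 chosen to be consistent with the output above. It also includes one extra (Ken, B) row in df1 to show that the default inner join drops rows with no partner.

```python
import pandas as pd

# Hypothetical inputs (not from the original question).
df1 = pd.DataFrame({
    "Name": ["John", "John", "Ken", "Ken"],
    "Event": ["A", "B", "A", "B"],
    "Factor1": [1.2, 1.5, 1.5, 2.0],
})
df2 = pd.DataFrame({
    "Name": ["John", "John", "Ken"],
    "Event": ["A", "B", "A"],
    "Factor2": [2.0, 1.0, 2.0],
})

# merge() defaults to how="inner": (Ken, B) has no match in df2 and is dropped.
merged = df1.merge(df2, on=["Name", "Event"])

# Multiply the two factors, then drop the inputs.
result = (
    merged.assign(ResultFactor=merged["Factor1"] * merged["Factor2"])
          .drop(columns=["Factor1", "Factor2"])
)
print(result)
```

If the unmatched rows should be kept, how='left' would keep them with NaN in ResultFactor.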
Join two dataframes based on two columns
You can use pd.merge() with multiple keys (a, b in the left frame and a1, b1 in the right frame) using left_on and right_on, as follows:
import pandas as pd
df1 = pd.DataFrame()
df2 = pd.DataFrame()
df1['a'] = [1, 2, 3]
df1['b'] = [2, 4, 6]
df1['c'] = [3, 5, 9]
df2['a1'] = [1, 2]
df2['b1'] = [4, 4]
df2['c1'] = [7, 5]
df3 = pd.merge(df1, df2, left_on=['a', 'b'], right_on=['a1', 'b1'], how='inner')
print(df3) # df3 has all columns for df1 and df2
# a b c a1 b1 c1
#0 2 4 5 2 4 5
df3 = df3.drop(df2.columns, axis=1) # drop df2's columns (a1, b1, c1); in this example they duplicate a, b, c
df3.columns = ['a2', 'b2', 'c3'] # rename the remaining columns as needed
print(df3)
# a2 b2 c3
#0 2 4 5
For more information about pd.merge(), please see: https://pandas.pydata.org/docs/reference/api/pandas.merge.html
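An alternative sketch, not part of the answer above: renaming the right-hand keys before merging lets you use on= and avoids the duplicated key columns, so only the extra columns need handling afterwards. The frames are the same as in the example above.

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2, 3], "b": [2, 4, 6], "c": [3, 5, 9]})
df2 = pd.DataFrame({"a1": [1, 2], "b1": [4, 4], "c1": [7, 5]})

# Give the right-hand keys the same names as the left-hand keys,
# then merge with `on` so each key appears only once in the result.
df3 = df1.merge(
    df2.rename(columns={"a1": "a", "b1": "b"}),
    on=["a", "b"],
    how="inner",
)
print(df3)
```

The result has columns a, b, c and c1, with a single row for the key pair (2, 4).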
Pandas Merging Multiple Columns at the Same Between Two Dataframes
If you already have the empty columns, you can use:
mapping = df_keyword_vol.set_index('Keyword')['Volume']
df_striking.iloc[:, 1::2] = df_striking.iloc[:, ::2].replace(mapping)
Else, if you only have the KWx columns:
df2 = (pd.concat([df, df.replace(mapping)], axis=1)
         .sort_index(axis=1)
      )
output:
KW1 KW1 KW2 KW2 KW3 KW3 KW4 KW4 KW5 KW5
0 nectarine 1000 apple 600 banana 450 kiwi 1200 raspberry 400
1 apricot 500 orange 800 grapefruit 10 lemon 150 blueberry 850
2 plum 200 pear 1000 cherry 900 peach 700 berries 1000
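For a self-contained picture of the lookup, here is a hedged variant that uses Series.map per column instead of replace. The keyword/volume pairs are taken from the output above, reduced to two KW columns; the KWx_vol column names are an assumption chosen to avoid duplicate column labels.

```python
import pandas as pd

# Keyword -> volume table, using values from the output above.
df_keyword_vol = pd.DataFrame({
    "Keyword": ["nectarine", "apricot", "plum", "apple", "orange", "pear"],
    "Volume": [1000, 500, 200, 600, 800, 1000],
})
df = pd.DataFrame({
    "KW1": ["nectarine", "apricot", "plum"],
    "KW2": ["apple", "orange", "pear"],
})

# Series indexed by keyword, used as a lookup table.
mapping = df_keyword_vol.set_index("Keyword")["Volume"]

out = df.copy()
for col in df.columns:
    # Insert each volume column right after its keyword column.
    # The "_vol" suffix is a hypothetical naming choice.
    out.insert(out.columns.get_loc(col) + 1, f"{col}_vol", df[col].map(mapping))
print(out)
```

Unlike replace, map yields NaN for keywords missing from the lookup table, which makes unmatched keywords easy to spot.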