Use Merge() to Update a Data Frame with Values from a Second Data Frame

How to update column value of a data frame from another data frame matching 2 columns?

Here's a way to do it:

df1 = df1.join(df2.drop(columns='DEP ID').set_index(['Team ID', 'Group']), on=['Team ID', 'Group'])
df1.loc[df1.Result.notna(), 'Score'] = df1.Result
df1 = df1.drop(columns='Result')

Explanation:

Modify df2 so that Team ID and Group form its index and Result is its only column.
Use join to bring the new scores from df2 into a Result column in df1.
Use loc to update Score for rows where Result is not null (i.e., rows for which an updated score is available).
Drop the Result column.

Full test code:

import pandas as pd
import numpy as np
df1 = pd.DataFrame({
'DEP ID':['001','001','002','002'],
'Team ID':['002','004','002','007'],
'Group':['A','A','A','A'],
'Score':[50,70,50,90]})
df2 = pd.DataFrame({
'DEP ID':['001','001','001'],
'Team ID':['002','003','004'],
'Group':['A','A','A'],
'Result':[80,60,70]})

print(df1)
print(df2)

df1 = df1.join(df2.drop(columns='DEP ID').set_index(['Team ID', 'Group']), on=['Team ID', 'Group'])
df1.loc[df1.Result.notna(), 'Score'] = df1.Result
df1 = df1.drop(columns='Result')
print(df1)

Output:

  DEP ID Team ID Group  Score
0    001     002     A     80
1    001     004     A     70
2    002     002     A     80
3    002     007     A     90

UPDATE:

If the Result column in df2 is instead named Score, as the OP asked in a comment, the code can be adjusted slightly as follows:

df1 = df1.join(df2.drop(columns='DEP ID').set_index(['Team ID', 'Group']), on=['Team ID', 'Group'], rsuffix='_NEW')
df1.loc[df1.Score_NEW.notna(), 'Score'] = df1.Score_NEW
df1 = df1.drop(columns='Score_NEW')
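
Since the title mentions merge(), here is a minimal sketch of the same update written with merge instead of join. It reuses the test data from above with df2's value column renamed to Score; the names keys, merged and result are only illustrative, and the final astype(int) assumes every score is a whole number.

```python
import pandas as pd

df1 = pd.DataFrame({
    'DEP ID': ['001', '001', '002', '002'],
    'Team ID': ['002', '004', '002', '007'],
    'Group': ['A', 'A', 'A', 'A'],
    'Score': [50, 70, 50, 90]})
# Same data as df2 above, but with the value column named Score.
df2 = pd.DataFrame({
    'DEP ID': ['001', '001', '001'],
    'Team ID': ['002', '003', '004'],
    'Group': ['A', 'A', 'A'],
    'Score': [80, 60, 70]})

keys = ['Team ID', 'Group']  # illustrative name for the match columns
# A left merge keeps every row of df1; rows without a match get NaN in Score_NEW.
merged = df1.merge(df2.drop(columns='DEP ID'), on=keys, how='left',
                   suffixes=('', '_NEW'))
# Prefer the new score where one exists, otherwise keep the old one.
merged['Score'] = merged['Score_NEW'].fillna(merged['Score']).astype(int)
result = merged.drop(columns='Score_NEW')
print(result)
```

This assumes each (Team ID, Group) pair appears at most once in df2; duplicate pairs would duplicate the matching rows of df1.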

How to merge 2 pandas data frames and update a column with latest value from 2 matched rows?

Something like this should do the job.

Just make sure your updated_at column is stored as datetime, so that sorting is chronological rather than alphabetical:

>>> pd.concat([df1,df2]).sort_values('updated_at').drop_duplicates(subset=df1.columns[:-1],keep='last').sort_values('MRN')
    MRN Encounter_ID First_Name   Last_Name  Birth_Date          updated_at
1  1234         John        Doe  01/02/1999  04/12/2002 2020-12-31 06:00:00
2  2345       Joanne        Lee  04/19/2002  04/19/2002 2020-12-31 08:22:00
3  3456    Annabelle      Jones  01/02/2001  04/21/2002 2020-12-31 05:00:00
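
For a version that can be run end to end, here is a rough sketch on made-up data. The rows below are hypothetical, and it assumes MRN alone identifies a record, so duplicates are dropped on MRN rather than on df1.columns[:-1] as in the one-liner above.

```python
import pandas as pd

# Hypothetical data: two extracts of the same records, keyed on MRN.
df1 = pd.DataFrame({
    'MRN': [1234, 2345, 3456],
    'Last_Name': ['Doe', 'Li', 'Jones'],
    'updated_at': ['2020-12-31 06:00:00', '2020-12-30 08:22:00',
                   '2020-12-31 05:00:00']})
df2 = pd.DataFrame({
    'MRN': [2345, 3456],
    'Last_Name': ['Lee', 'Jonas'],
    'updated_at': ['2020-12-31 08:22:00', '2020-12-29 05:00:00']})

combined = pd.concat([df1, df2], ignore_index=True)
# Convert to datetime so the sort is chronological, not alphabetical.
combined['updated_at'] = pd.to_datetime(combined['updated_at'])

# After sorting by time, keep='last' retains the newest row for each MRN.
latest = (combined.sort_values('updated_at')
          .drop_duplicates(subset='MRN', keep='last')
          .sort_values('MRN')
          .reset_index(drop=True))
print(latest)
```

Here MRN 2345 takes its row from df2 because that row is newer, while MRN 3456 keeps its df1 row for the same reason.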

update data frame based on data from another data frame using pandas python

Try this, using an outer merge, which returns both matching and non-matching records.

In [75]: df_m = df1.merge(df2, on="SKUCode", how='outer')

In [76]: mask = df_m['Status'].isnull()

In [77]: df_m.loc[~mask, 'SKUStatus'] = df_m.loc[~mask, 'Status']

In [78]: df_m[['SKUCode', "ListPrice", "SalePrice", "SKUStatus", "CostPrice"]].fillna(0.0)

Output:

  SKUCode  ListPrice  SalePrice  SKUStatus  CostPrice
0       A     1798.0     1798.0        1.0      500.0
1       B     2997.0     2997.0        0.0      773.0
2       C     1798.0     1798.0        1.0      525.0
3       D      999.0      999.0        0.0      300.0
4       X        0.0        0.0        0.0        0.0
5       Y        0.0        0.0        0.0        0.0
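
The input frames are not shown in the answer, so the following is only a sketch on hypothetical data that borrows the column names. It adds indicator=True, which is not in the original code, to show which side each row came from.

```python
import pandas as pd

# Hypothetical inputs; only the column names come from the answer above.
df1 = pd.DataFrame({
    'SKUCode': ['A', 'B'],
    'ListPrice': [1798.0, 2997.0],
    'SKUStatus': [0.0, 1.0]})
df2 = pd.DataFrame({
    'SKUCode': ['A', 'X'],
    'Status': [1.0, 0.0]})

# indicator=True adds a _merge column: 'both', 'left_only' or 'right_only'.
df_m = df1.merge(df2, on='SKUCode', how='outer', indicator=True)

# Overwrite SKUStatus only where df2 supplied a Status.
has_status = df_m['Status'].notna()
df_m.loc[has_status, 'SKUStatus'] = df_m.loc[has_status, 'Status']

# Rows that exist only in df2 have no prices, so fill those gaps with 0.0.
result = df_m[['SKUCode', 'ListPrice', 'SKUStatus']].fillna(0.0)
print(df_m[['SKUCode', '_merge']])
print(result)
```

In this sketch A is updated from df2, B keeps its original status because df2 has no row for it, and X appears only because the merge is an outer one.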

R: add value from another data frame by finding same values in two data frames

Use merge() from the base package:

merge(df1, df2, by = 'Code', all.x = TRUE, all.y = FALSE)
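
For comparison with the pandas answers on this page, a rough equivalent of that R call is a left merge. The frames and the x and value columns below are hypothetical; only the Code key comes from the answer.

```python
import pandas as pd

# Hypothetical frames sharing a Code column.
df1 = pd.DataFrame({'Code': ['a', 'b', 'c'], 'x': [1, 2, 3]})
df2 = pd.DataFrame({'Code': ['a', 'c', 'd'], 'value': [10.0, 30.0, 40.0]})

# how='left' plays the role of all.x = TRUE, all.y = FALSE:
# every row of df1 is kept and rows found only in df2 are dropped.
out = df1.merge(df2, on='Code', how='left')
print(out)
```

One difference to keep in mind: R's merge() sorts the result by the key by default, whereas a pandas left merge keeps the row order of the left frame.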

How to merge two different-size DataFrames in Pandas to update one dataframe based on matching partial values in one column with another dataframe

You can use .update() after setting time as the index on both data_1 and data_2, as follows:

data_1a = data_1.set_index('time')
data_1a.update(data_2.set_index('time'))
data_out = data_1a.reset_index()

.update() modifies the calling DataFrame in place using non-NA values from another DataFrame, aligning on the index. Thus, when you set time as the index on both data_1 and data_2, .update() matches rows by their time values and overwrites the values of data_1 with the corresponding values of data_2.

Data Setup:

a = {
    'time':[1,2,3,4,5,6],
    'column_1':[2,2,2,2,2,2],
    'column_2':[3,3,3,3,3,3]   
}
b = {
    'time':[3,4,5],
    'column_1':[0,0,0],
    'column_2':[0,0,0]    
}
data_1 = pd.DataFrame(a)
data_2 = pd.DataFrame(b)

Result:

print(data_out)

   time  column_1  column_2
0     1       2.0       3.0
1     2       2.0       3.0
2     3       0.0       0.0
3     4       0.0       0.0
4     5       0.0       0.0
5     6       2.0       3.0
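
Two details of .update() are easy to miss: NaN values in the second frame are skipped, and index values that exist only in the second frame are not added. A small sketch on hypothetical data (not the data above) illustrates both.

```python
import numpy as np
import pandas as pd

data_1 = pd.DataFrame({'time': [1, 2, 3], 'column_1': [2.0, 2.0, 2.0]})
# Hypothetical second frame: time 2 carries no new value (NaN)
# and time 9 does not exist in data_1.
data_2 = pd.DataFrame({'time': [2, 3, 9], 'column_1': [np.nan, 0.0, 5.0]})

data_1a = data_1.set_index('time')
data_1a.update(data_2.set_index('time'))  # in place, returns None
data_out = data_1a.reset_index()
print(data_out)
```

Only time 3 changes here: the NaN for time 2 leaves the old value in place, and time 9 is ignored because it has no matching row in data_1.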

Python Pandas - Vlookup - Update Existing Column in First Data Frame From Second Data Frame

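Use pd.merge on the ['key', 'info'] columns of df1 and df2, joining on key with how='left' so that only the keys from the left data frame are kept. Because info exists in both inputs, the merged result holds info_x (from df1) and info_y (from df2); assign info_y back to df1['info']. Rows of df1 with no matching key in df2 end up with NaN.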

df1['info'] = pd.merge(df1[['key','info']], df2[['key','info']], on='key', how='left')['info_y']
print(df1)

Output from df1

  dataA  dataB  key   info dataC
0   ABC    123  a1b  infoA   aaa
1   DEF    456  b57    NaN   bbb
2   GHI    789  a22  infoC   ccc
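
If unmatched rows should keep their existing info instead of becoming NaN, one possible variation is a lookup with Series.map followed by fillna. This is a sketch on hypothetical data, it is not part of the answer above, and it assumes the keys in df2 are unique.

```python
import pandas as pd

# Hypothetical inputs shaped like the frames in the answer above.
df1 = pd.DataFrame({
    'key': ['a1b', 'b57', 'a22'],
    'info': ['old', 'keep', 'old']})
df2 = pd.DataFrame({
    'key': ['a1b', 'a22'],
    'info': ['infoA', 'infoC']})

# Build a key -> info lookup; this assumes keys in df2 are unique.
lookup = df2.set_index('key')['info']

# map() yields NaN for keys missing from the lookup;
# fillna() then falls back to the value already in df1.
df1['info'] = df1['key'].map(lookup).fillna(df1['info'])
print(df1)
```

With this data, a1b and a22 pick up the values from df2 while b57 keeps its original info.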