Replace Specific Values Based on Another Dataframe

Replace column values based on another dataframe python pandas - better way?

Use the boolean mask from isin to filter df and assign the desired row values from the right-hand-side df:

In [27]:

df.loc[df.Name.isin(df1.Name), ['Nonprofit', 'Education']] = df1[['Nonprofit', 'Education']]
df
Out[27]:
  Name  Nonprofit  Business  Education
0    X          1         1          0
1    Y          1         1          1
2    Z          1         0          1
3    Y          1         1          1

[4 rows x 4 columns]
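
For reference, a minimal, self-contained sketch with assumed data that reproduces the output above. Note that this assignment aligns on the index, so df1's index labels must line up with the masked rows of df:

import pandas as pd

# assumed data, chosen only to match the output shown above
df = pd.DataFrame({'Name': ['X', 'Y', 'Z', 'Y'],
                   'Nonprofit': [0, 1, 0, 1],
                   'Business': [1, 1, 0, 1],
                   'Education': [1, 1, 1, 1]})
# df1's index (0, 2) matches the rows of df it should overwrite
df1 = pd.DataFrame({'Name': ['X', 'Z'],
                    'Nonprofit': [1, 1],
                    'Education': [0, 1]}, index=[0, 2])

df.loc[df.Name.isin(df1.Name), ['Nonprofit', 'Education']] = df1[['Nonprofit', 'Education']]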

replace column values in one dataframe by values of another dataframe

If you set the index of the other df to the 'Group' column, then you can replace values using map on your original df's 'Group' column:

In [36]:
df['Group'] = df['Group'].map(df1.set_index('Group')['Hotel'])
df

Out[36]:
         Date  Group  Family  Bonus
0  2011-06-09  Jamel  Laavin    456
1  2011-07-09  Frank  Grendy    679
2  2011-09-10   Luxy  Fantol    431
3  2011-11-02  Frank  Gondow    569
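
A self-contained sketch with assumed data matching the output above; set_index('Group')['Hotel'] builds the Group-to-Hotel lookup Series that map consumes (the codes 'A'/'B'/'C' are invented here):

import pandas as pd

# assumed data: df holds group codes, df1 maps each code to a hotel name
df = pd.DataFrame({'Date': ['2011-06-09', '2011-07-09', '2011-09-10', '2011-11-02'],
                   'Group': ['A', 'B', 'C', 'B'],
                   'Family': ['Laavin', 'Grendy', 'Fantol', 'Gondow'],
                   'Bonus': [456, 679, 431, 569]})
df1 = pd.DataFrame({'Group': ['A', 'B', 'C'],
                    'Hotel': ['Jamel', 'Frank', 'Luxy']})

# codes without a match in df1 would become NaN after map
df['Group'] = df['Group'].map(df1.set_index('Group')['Hotel'])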

Replace column value of Dataframe based on a condition on another Dataframe

You can also try with map:

df_student['student_Id'] = (
    df_student['student_Id'].map(df_updated_id.set_index('old_id')['new_id'])
    .fillna(df_student['student_Id'])
)
print(df_student)

# Output
     Name  gender  math score student_Id
0    John    male          50       1234
1     Jay    male         100       6788
2  sachin    male          70        xyz
3  Geetha  female          80       abcd
4  Amutha  female          75       83ko
5  ganesh    male          40       v432
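
To see why the fillna is needed, here is a self-contained sketch on hypothetical data ('ab22' is an invented old ID): mapped IDs get their new value, while IDs absent from df_updated_id come back as NaN from map and fall through fillna to their original value:

import pandas as pd

df_student = pd.DataFrame({'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
                           'gender': ['male', 'male', 'male', 'female', 'female', 'male'],
                           'math score': [50, 100, 70, 80, 75, 40],
                           'student_Id': ['1234', 'ab22', 'xyz', 'abcd', '83ko', 'v432']})
# hypothetical update table: only Jay's ID changes
df_updated_id = pd.DataFrame({'old_id': ['ab22'], 'new_id': ['6788']})

df_student['student_Id'] = (
    df_student['student_Id'].map(df_updated_id.set_index('old_id')['new_id'])
    .fillna(df_student['student_Id'])
)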

Update

The asker followed up: "I believe the updated_id isn't unique, so I need to further pre-process the data."

In this case, you could drop duplicates first, treating the last value (keep='last') as the most recent one for a given old_id:

sr = (df_updated_id.drop_duplicates('old_id', keep='last')
                   .set_index('old_id')['new_id'])

df_student['student_Id'] = (df_student['student_Id'].map(sr)
                            .fillna(df_student['student_Id']))

Note: this is exactly what @BENY's answer does. Since he builds a dict, only the last occurrence of each old_id is kept. However, if you want to keep the first value that appears, his code doesn't work; with drop_duplicates, you can adjust the keep parameter.
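
A quick sketch on hypothetical duplicated IDs showing how keep controls which mapping survives:

import pandas as pd

# hypothetical: 'ab22' appears twice, so only one mapping can win
df_updated_id = pd.DataFrame({'old_id': ['ab22', 'ab22'],
                              'new_id': ['6788', '9999']})

last = df_updated_id.drop_duplicates('old_id', keep='last').set_index('old_id')['new_id']
first = df_updated_id.drop_duplicates('old_id', keep='first').set_index('old_id')['new_id']

print(last['ab22'])   # 9999 -- same result as a dict, where later rows overwrite earlier ones
print(first['ab22'])  # 6788 -- not reachable with the plain dict approach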

Replace values in one dataframe with values from another dataframe

You can use update after replacing 0 with np.nan and setting a common index between the two dataframes.

Be wary of two things:

  1. Use overwrite=False to only fill the null values
  2. update modifies the dataframe in place

import numpy as np

common_index = ['Region', 'Product']
df_indexed = df.replace(0, np.nan).set_index(common_index)
df2_indexed = df2.set_index(common_index)

df_indexed.update(df2_indexed, overwrite=False)

print(df_indexed.reset_index())

    Region Product       Country  Quantity   Price
0   Africa     ABC  South Africa     500.0  1200.0
1   Africa     DEF  South Africa     200.0   400.0
2   Africa     XYZ  South Africa     110.0   300.0
3   Africa     DEF       Nigeria     150.0   450.0
4   Africa     XYZ       Nigeria     200.0   750.0
5     Asia     XYZ         Japan     100.0   500.0
6     Asia     ABC         Japan     200.0   500.0
7     Asia     DEF         Japan     120.0   300.0
8     Asia     XYZ         India     250.0   600.0
9     Asia     ABC         India     100.0   400.0
10    Asia     DEF         India      40.0   220.0
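
To see the mechanics on a tiny, hypothetical example: zeros become NaN, and update(overwrite=False) fills only those holes from df2, leaving existing values untouched:

import numpy as np
import pandas as pd

# hypothetical frames: df has 0 placeholders, df2 holds the real values
df = pd.DataFrame({'Region': ['Africa', 'Asia'], 'Product': ['ABC', 'XYZ'],
                   'Quantity': [0, 100], 'Price': [1200, 0]})
df2 = pd.DataFrame({'Region': ['Africa', 'Asia'], 'Product': ['ABC', 'XYZ'],
                    'Quantity': [500, 999], 'Price': [999, 500]})

common_index = ['Region', 'Product']
df_indexed = df.replace(0, np.nan).set_index(common_index)
df2_indexed = df2.set_index(common_index)

df_indexed.update(df2_indexed, overwrite=False)  # fills NaN only; 100 and 1200 survive
print(df_indexed.reset_index())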

replacing values in a pandas dataframe with values from another dataframe based on common columns

First, separate the rows that have NaN values out into a new dataframe called df3, and drop those rows from df1.

Then do a left join based on the new dataframe.

df4 = pd.merge(df3,df2,how='left',on=['types','o_period'])

After that is done, append the rows from df4 back into df1.
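
A runnable sketch of that whole sequence on hypothetical frames (the column names 'types', 'o_period', 's_months' and 'incidents' are taken from the question):

import numpy as np
import pandas as pd

# df1 has gaps; df2 is the reference table keyed by 'types' and 'o_period'
df1 = pd.DataFrame({'types': ['A', 'B'], 'o_period': [1, 2],
                    's_months': [12.0, np.nan], 'incidents': [3.0, np.nan]})
df2 = pd.DataFrame({'types': ['B'], 'o_period': [2],
                    's_months': [24.0], 'incidents': [5.0]})

mask = df1['s_months'].isnull()
df3 = df1.loc[mask, ['types', 'o_period']]   # rows to repair, keys only
df4 = pd.merge(df3, df2, how='left', on=['types', 'o_period'])
df1 = pd.concat([df1.loc[~mask], df4], ignore_index=True)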

Another way is to combine the two columns you want to look up on into a single key column:

df1["types_o"] = df1["types_o"].astype(str) + df1["o_period"].astype(str)

df2["types_o"] = df2["types_o"].astype(str) + df2["o_period"].astype(str)

Then you can do a lookup on the missing values:

import numpy as np

# reset string 'nan' keys (produced by astype(str) on NaN) to real NaN
df1['types_o'] = df1['types_o'].replace('nan', np.nan)

df1.loc[df1['s_months'].isnull(), 's_months'] = df1['types_o'].map(df2.set_index('types_o')['s_months'])

df1.loc[df1['incidents'].isnull(), 'incidents'] = df1['types_o'].map(df2.set_index('types_o')['incidents'])

You didn't paste any code or an easily reproducible example of your data, so this is the best I can do.

Replace values in one column based on part of text in another dataframe in R

This seems to be a case for the fuzzyjoin package's regex_left_join. After the regex_left_join, coalesce the columns together so that it returns the first non-NA element in each row:

library(fuzzyjoin)
library(dplyr)

regex_left_join(df1, df2, by = 'Supplier') %>%
    transmute(Supplier = coalesce(New_Supplier, Supplier.x), Value)

Output:

   Supplier Value
1       AAA   100
2       Red   200
3       Red   300
4       DDD   400
5      Blue   200
6      Blue   100
7     Green   200
8       HHH    40
9       III   150
10      JJJ    70

Replace specific values based on another dataframe

You could use the join functionality of the data.table package for this:

library(data.table)
setDT(DF1)
setDT(DF2)

DF1[DF2, on = .(date, id), `:=` (city = i.city, sales = i.sales)]

which gives:

> DF1
          date id sales cost city
 1: 06/19/2016  1  9999  101  LON
 2: 06/20/2016  1   150  102  MTL
 3: 06/21/2016  1   151  104  MTL
 4: 06/22/2016  1   152  107  MTL
 5: 06/23/2016  1   155   99  MTL
 6: 06/19/2016  2    84   55   NY
 7: 06/20/2016  2    83   55   NY
 8: 06/21/2016  2    80   56   NY
 9: 06/22/2016  2   777   57   QC
10: 06/23/2016  2   555   58   QC

When you have many columns in both datasets, it is easier to use mget instead of typing all the column names. For the data used in the question, it would look like:

DF1[DF2, on = .(date, id), names(DF2)[3:4] := mget(paste0("i.", names(DF2)[3:4]))]

When you want to construct a vector of the column names to be added beforehand, you could do it as follows:

cols <- names(DF2)[3:4]
DF1[DF2, on = .(date, id), (cols) := mget(paste0("i.", cols))]

Pandas - Replacing Values by Looking Up in an Another Dataframe

Initialise a replacement dictionary and use df.replace to map those IDs to Names:

# build a Model ID -> Name lookup
m = df2.set_index('Model ID')['Name'].to_dict()
# grab just the "Linked Model" columns
v = df.filter(like='Linked Model')
df[v.columns] = v.replace(m)

df

    ID Name Linked Model 1 Linked Model 2 Linked Model 3
0  100    A              A            A,B            NaN
1  101    B            A,B              C              Q
2  102    C            NaN            NaN            NaN
3  103    D              D            NaN            NaN
4  104    E              D              A            A,B
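
A self-contained sketch with hypothetical single-ID cells; for comma-joined cells like the 'A,B' entries above, the same dict can be applied with string keys and regex=True so each ID inside the string is substituted:

import pandas as pd

# hypothetical data: each linked-model cell holds a single Model ID
df2 = pd.DataFrame({'Model ID': [100, 101, 102], 'Name': ['A', 'B', 'C']})
df = pd.DataFrame({'ID': [100, 101], 'Name': ['A', 'B'],
                   'Linked Model 1': [100, 102], 'Linked Model 2': [101, None]})

m = df2.set_index('Model ID')['Name'].to_dict()
v = df.filter(like='Linked Model')
df[v.columns] = v.replace(m)

# for string cells like '100,101', substitute inside the string instead:
# v.replace({str(k): name for k, name in m.items()}, regex=True)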

