replace column values in one dataframe by values of another dataframe
If you set the index of the other df to its 'Group' column, you can use map
on your original df's 'Group' column to do the replacement:
In [36]:
df['Group'] = df['Group'].map(df1.set_index('Group')['Hotel'])
df
Out[36]:
Date Group Family Bonus
0 2011-06-09 Jamel Laavin 456
1 2011-07-09 Frank Grendy 679
2 2011-09-10 Luxy Fantol 431
3 2011-11-02 Frank Gondow 569
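The answer above can be sketched as a self-contained example. The data here is made up for illustration; only the `map`/`set_index` pattern comes from the answer:

```python
import pandas as pd

# Hypothetical data: df holds group codes, df1 maps each code to a hotel name.
df = pd.DataFrame({
    "Date": ["2011-06-09", "2011-07-09"],
    "Group": ["A", "B"],
    "Bonus": [456, 679],
})
df1 = pd.DataFrame({"Group": ["A", "B"], "Hotel": ["Jamel", "Frank"]})

# set_index('Group') turns df1 into a Series keyed by group code,
# so map() replaces each code in df['Group'] with the matching hotel.
df["Group"] = df["Group"].map(df1.set_index("Group")["Hotel"])
print(df["Group"].tolist())  # ['Jamel', 'Frank']
```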
Replace values in one dataframe with values from another dataframe
You can use update
after replacing 0 with np.nan and setting a common index on the two dataframes.
Be wary of two things:
- Use overwrite=False to fill only the null values.
- update modifies the dataframe in place.
common_index = ['Region','Product']
df_indexed = df.replace(0,np.nan).set_index(common_index)
df2_indexed = df2.set_index(common_index)
df_indexed.update(df2_indexed,overwrite=False)
print(df_indexed.reset_index())
Region Product Country Quantity Price
0 Africa ABC South Africa 500.0 1200.0
1 Africa DEF South Africa 200.0 400.0
2 Africa XYZ South Africa 110.0 300.0
3 Africa DEF Nigeria 150.0 450.0
4 Africa XYZ Nigeria 200.0 750.0
5 Asia XYZ Japan 100.0 500.0
6 Asia ABC Japan 200.0 500.0
7 Asia DEF Japan 120.0 300.0
8 Asia XYZ India 250.0 600.0
9 Asia ABC India 100.0 400.0
10 Asia DEF India 40.0 220.0
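A minimal runnable sketch of the same steps, using made-up Region/Product data (the column values are assumptions, not the asker's data):

```python
import numpy as np
import pandas as pd

# df has a 0 standing in for a missing Quantity; df2 carries the correction.
df = pd.DataFrame({
    "Region": ["Africa", "Asia"],
    "Product": ["ABC", "XYZ"],
    "Quantity": [0, 100],
})
df2 = pd.DataFrame({
    "Region": ["Africa"],
    "Product": ["ABC"],
    "Quantity": [500],
})

common_index = ["Region", "Product"]
df_indexed = df.replace(0, np.nan).set_index(common_index)
df2_indexed = df2.set_index(common_index)

# overwrite=False: only NaN cells in df_indexed get filled; update is in place.
df_indexed.update(df2_indexed, overwrite=False)
result = df_indexed.reset_index()
print(result["Quantity"].tolist())  # [500.0, 100.0]
```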
Replace column values based on another dataframe python pandas - better way?
Use the boolean mask from isin
to filter the df and assign the desired row values from the right-hand df:
In [27]:
df.loc[df.Name.isin(df1.Name), ['Nonprofit', 'Education']] = df1[['Nonprofit', 'Education']]
df
Out[27]:
Name Nonprofit Business Education
0 X 1 1 0
1 Y 1 1 1
2 Z 1 0 1
3 Y 1 1 1
[4 rows x 4 columns]
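A runnable sketch of the isin-mask assignment with invented data. Note that this pattern relies on the two frames sharing row indices for the matching rows, which is an implicit assumption of the answer:

```python
import pandas as pd

# Hypothetical frames: df1 carries corrected Nonprofit/Education flags
# for a subset of names, indexed to line up with the matching rows of df.
df = pd.DataFrame({
    "Name": ["X", "Y", "Z"],
    "Nonprofit": [0, 0, 1],
    "Education": [0, 0, 0],
})
df1 = pd.DataFrame({
    "Name": ["X", "Z"],
    "Nonprofit": [1, 1],
    "Education": [0, 1],
}, index=[0, 2])  # indices must match the corresponding rows in df

# Select the rows of df whose Name appears in df1, then assign;
# pandas aligns the right-hand values on index and column labels.
mask = df["Name"].isin(df1["Name"])
df.loc[mask, ["Nonprofit", "Education"]] = df1[["Nonprofit", "Education"]]
print(df["Nonprofit"].tolist())  # [1, 0, 1]
```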
replacing values in a pandas dataframe with values from another dataframe based common columns
First, separate the rows that have NaN values into a new dataframe called df3, and drop those rows from df1.
Then do a left join against the new dataframe:
df4 = pd.merge(df3,df2,how='left',on=['types','o_period'])
Once that is done, append the rows from df4 back onto df1.
Another way is to combine the two lookup columns into a single key column:
df1["types_o"] = df1["types"].astype(str) + df1["o_period"].astype(str)
df2["types_o"] = df2["types"].astype(str) + df2["o_period"].astype(str)
Then you can look up the missing values on that key:
df1.types_o.replace('Nan', np.NaN, inplace=True)
df1.loc[df1['s_months'].isnull(), 's_months'] = df1['types_o'].map(df2.set_index('types_o')['s_months'])
df1.loc[df1['incidents'].isnull(), 'incidents'] = df1['types_o'].map(df2.set_index('types_o')['incidents'])
You didn't paste any code or an easily reproducible example of your data, so this is the best I can do.
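Since the question had no reproducible data, here is a sketch of the combined-key idea with made-up frames; the column names (`types`, `o_period`, `s_months`) follow the answer, everything else is assumed:

```python
import numpy as np
import pandas as pd

# df1 is missing an s_months value; df2 holds the reference values.
df1 = pd.DataFrame({
    "types": ["A", "B"],
    "o_period": [1, 2],
    "s_months": [np.nan, 12.0],
})
df2 = pd.DataFrame({
    "types": ["A", "B"],
    "o_period": [1, 2],
    "s_months": [6.0, 99.0],
})

# Build the combined lookup key on both frames.
df1["types_o"] = df1["types"].astype(str) + df1["o_period"].astype(str)
df2["types_o"] = df2["types"].astype(str) + df2["o_period"].astype(str)

# Map df2's values onto df1 only where s_months is missing;
# existing values (row 'B') are left untouched.
lookup = df2.set_index("types_o")["s_months"]
missing = df1["s_months"].isnull()
df1.loc[missing, "s_months"] = df1.loc[missing, "types_o"].map(lookup)
print(df1["s_months"].tolist())  # [6.0, 12.0]
```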
Replace column value of Dataframe based on a condition on another Dataframe
You can also try with map
:
df_student['student_Id'] = (
df_student['student_Id'].map(df_updated_id.set_index('old_id')['new_id'])
.fillna(df_student['student_Id'])
)
print(df_student)
# Output
Name gender math score student_Id
0 John male 50 1234
1 Jay male 100 6788
2 sachin male 70 xyz
3 Geetha female 80 abcd
4 Amutha female 75 83ko
5 ganesh male 40 v432
Update
The updated_id values aren't unique, so the data needs further pre-processing.
In that case, you can drop duplicates first, treating the last value (keep='last'
) as the most recent one for a given old_id
:
sr = df_updated_id.drop_duplicates('old_id', keep='last') \
                  .set_index('old_id')['new_id']
df_student['student_Id'] = df_student['student_Id'].map(sr) \
                                                   .fillna(df_student['student_Id'])
Note: this is exactly what @BENY's answer does. Because he builds a dict, only the last occurrence of each old_id
is kept. If you instead want to keep the first occurrence, his code can't do it, but with drop_duplicates
you can adjust the keep
parameter.
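The deduplicate-then-map step can be shown end to end with a small invented update table, where old_id 'a1' appears twice and keep='last' picks the later mapping:

```python
import pandas as pd

# Hypothetical data: 'a1' maps first to 'x1', later to 'x9'.
df_updated_id = pd.DataFrame({
    "old_id": ["a1", "a1", "b2"],
    "new_id": ["x1", "x9", "y2"],
})
df_student = pd.DataFrame({"student_Id": ["a1", "b2", "zz"]})

# keep='last' retains the most recent mapping per old_id.
sr = df_updated_id.drop_duplicates("old_id", keep="last") \
                  .set_index("old_id")["new_id"]

# Unmatched ids ('zz') become NaN after map, so fillna restores them.
df_student["student_Id"] = df_student["student_Id"].map(sr) \
                                                   .fillna(df_student["student_Id"])
print(df_student["student_Id"].tolist())  # ['x9', 'y2', 'zz']
```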
Replace a column value of one dataframe with a column value of another dataframe if the absolute difference between them is the lowest
Use array broadcasting to compute the differences and set the values with .loc
and idxmin
:
other = df2["Var2"].to_numpy()
differences = pd.DataFrame(df1['Var1'].to_numpy()[:, None] - other).abs()
df1.loc[differences.idxmin(), "Var1"] = other  # single .loc call avoids chained-indexing pitfalls
>>> df1
Var1
0 105.129000
1 52.788500
2 10.992200
3 22.844300
4 73.588000
5 97.582803
6 91.947400
7 41.648500
8 68.440200
9 84.956329
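The broadcasting step can be checked on small made-up numbers. Each column of the difference matrix corresponds to one df2 value, and idxmin() picks the df1 row nearest to it:

```python
import numpy as np
import pandas as pd

# Hypothetical data: 52.0 is nearest to 50.0, 98.0 is nearest to 100.0.
df1 = pd.DataFrame({"Var1": [10.0, 50.0, 100.0]})
df2 = pd.DataFrame({"Var2": [52.0, 98.0]})

other = df2["Var2"].to_numpy()
# Broadcasting builds a (len(df1), len(df2)) matrix of absolute differences.
differences = pd.DataFrame(df1["Var1"].to_numpy()[:, None] - other).abs()
# idxmin() returns, for each df2 value, the index of the closest df1 row.
df1.loc[differences.idxmin(), "Var1"] = other
print(df1["Var1"].tolist())  # [10.0, 52.0, 98.0]
```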
Replace matching values from one dataframe with index value from another dataframe
Try:
df1['fruit'] = df1.fruit.map(dict(df2[['fruit','id']].values))
How to conditionally replace Pandas dataframe column values from another dataframe
Create a dict by zipping the df2 columns, then use map to transfer the values over to df1. Code below:
df1['col2']=df1['col1'].map(dict(zip(df2['col1'],df2['col2'])))
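This one-liner, and the dict-based fruit/id answer above it, both build a plain dict from a two-column lookup table. A sketch with invented values (keys absent from df2 come back as NaN):

```python
import pandas as pd

# Hypothetical frames: df2 is a two-column lookup table; 'kiwi' has no entry.
df1 = pd.DataFrame({"col1": ["apple", "pear", "kiwi"]})
df2 = pd.DataFrame({"col1": ["apple", "pear"], "col2": [1, 2]})

# zip the key/value columns into a dict, then map it over df1.
mapping = dict(zip(df2["col1"], df2["col2"]))
df1["col2"] = df1["col1"].map(mapping)
print(df1["col2"].tolist())  # [1.0, 2.0, nan]
```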