Replacing column values in a pandas DataFrame
If I understand right, you want something like this:
w['female'] = w['female'].map({'female': 1, 'male': 0})
(Here I convert the values to numbers instead of strings containing numbers. You can convert them to "1" and "0" if you really want, but I'm not sure why you'd want that.)
Your code doesn't work because using ['female'] on a column (the second 'female' in your w['female']['female']) doesn't mean "select rows where the value is 'female'". It means "select rows where the index is 'female'", of which there may not be any in your DataFrame.
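To make the difference concrete, here is a minimal sketch on made-up data (the four sample values, including 'other', are assumptions for illustration). It shows that the label lookup fails on a default integer index, and that map() leaves NaN for any value missing from the dictionary:

```python
import pandas as pd

# Hypothetical data; only the column name 'female' comes from the answer above.
w = pd.DataFrame({'female': ['female', 'male', 'female', 'other']})

# w['female']['female'] looks up the index label 'female' in the column.
# A default integer index has no such label, so the lookup raises KeyError.
try:
    w['female']['female']
    label_found = True
except KeyError:
    label_found = False

# map() translates each value through the dict; values that are not
# keys of the dict (here 'other') become NaN.
mapped = w['female'].map({'female': 1, 'male': 0})

print(label_found)
print(mapped)
```

Because of the NaN, the mapped column ends up as floats here; if every value is covered by the dictionary it stays integer.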
pandas: replace column value with keys and values in a dictionary of list values
The best approach is to change the logic and minimize the number of pandas steps.
You can craft a dictionary that will directly contain your ideal output:
dic2 = {v:k for k,l in dic.items() for v in l}
# {'can': 'Should', 'could': 'Should', 'shall': 'Could', 'will': 'Would'}
# or if not yet formatted:
# dic2 = {v.lower():k.capitalize() for k,l in dic.items() for v in l}
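import re
regex = '|'.join(map(re.escape, dic2))
# raw f-string: in a non-raw string '\b' is a backspace, not a word boundary
df['text'] = df['text'].str.replace(rf'\b({regex})\b',
                                    # lower() assumes dic2 keys are lowercase; needed with case=False
                                    lambda m: dic2.get(m.group().lower()),
                                    case=False, # only if case doesn't matter
                                    regex=True)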
output (as text2 column for clarity):
                           text                         text2
0        can you open the door?     Should you open the door?
1  shall you write the address?  Could you write the address?
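For reference, the whole approach can be run end to end with the sketch below. The contents of dic and df are assumptions reconstructed from the comment and output shown above, not the asker's actual data:

```python
import re
import pandas as pd

# Hypothetical inputs, shaped to match the dictionary comment and output above.
dic = {'Should': ['can', 'could'], 'Could': ['shall'], 'Would': ['will']}
df = pd.DataFrame({'text': ['can you open the door?',
                            'shall you write the address?']})

# Invert the dict of lists: each word in a list points back to its key.
dic2 = {v: k for k, l in dic.items() for v in l}

# Raw strings keep \b as a regex word boundary.
pattern = r'\b(' + '|'.join(map(re.escape, dic2)) + r')\b'

# The callable receives a match object; lower() makes the lookup
# work when case=False lets 'Can' or 'CAN' match as well.
df['text2'] = df['text'].str.replace(
    pattern,
    lambda m: dic2[m.group().lower()],
    case=False,
    regex=True,
)

print(df)
```

A single regex pass over the column replaces all words at once, which is why the inverted dictionary is built first.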
Replacing few values in a pandas dataframe column with another value
The easiest way is to use the replace method on the column. The arguments are a list of the things you want to replace (here ['ABC', 'AB']) and what you want to replace them with (the string 'A' in this case):
>>> df['BrandName'].replace(['ABC', 'AB'], 'A')
0 A
1 B
2 A
3 D
4 A
This creates a new Series of values so you need to assign this new column to the correct column name:
df['BrandName'] = df['BrandName'].replace(['ABC', 'AB'], 'A')
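A self-contained sketch of the same call follows. The input values are assumptions chosen to reproduce the output above, plus one extra value ('ABCD') to show that replace matches whole cell values, not substrings:

```python
import pandas as pd

# Hypothetical input; 'ABCD' is added only to show that partial matches are left alone.
df = pd.DataFrame({'BrandName': ['ABC', 'B', 'AB', 'D', 'A', 'ABCD']})

# replace() returns a new Series, so assign it back to the column.
df['BrandName'] = df['BrandName'].replace(['ABC', 'AB'], 'A')

print(df['BrandName'].tolist())
```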
Pandas DataFrame: replace all values in a column, based on condition
You need to select that column:
In [41]:
df.loc[df['First Season'] > 1990, 'First Season'] = 1
df
Out[41]:
                 Team  First Season  Total Games
0      Dallas Cowboys          1960          894
1       Chicago Bears          1920         1357
2   Green Bay Packers          1921         1339
3      Miami Dolphins          1966          792
4    Baltimore Ravens             1          326
5  San Franciso 49ers          1950         1003
So the syntax here is:
df.loc[<mask>, <optional column(s)>]
(here the mask generates the labels to index)
You can check the docs and also "10 minutes to pandas", which shows the semantics.
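As a runnable sketch of that syntax, on a three-row frame (the 1996 starting value for the Ravens is an assumption; the other figures are taken from the output above):

```python
import pandas as pd

# Small hypothetical frame; 1996 is an assumed original value.
df = pd.DataFrame({
    'Team': ['Dallas Cowboys', 'Chicago Bears', 'Baltimore Ravens'],
    'First Season': [1960, 1920, 1996],
    'Total Games': [894, 1357, 326],
})

# Boolean Series: True for the rows to change.
mask = df['First Season'] > 1990

# Row selector first, column selector second: only 'First Season'
# is overwritten, and only in the rows where mask is True.
df.loc[mask, 'First Season'] = 1

print(df)
```

Naming the column in the second position is what keeps the other columns of the matching rows untouched.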
EDIT
If you want to generate a boolean indicator, you can use the boolean condition to generate a boolean Series and cast the dtype to int. This converts True and False to 1 and 0 respectively:
In [43]:
df['First Season'] = (df['First Season'] > 1990).astype(int)
df
Out[43]:
                 Team  First Season  Total Games
0      Dallas Cowboys             0          894
1       Chicago Bears             0         1357
2   Green Bay Packers             0         1339
3      Miami Dolphins             0          792
4    Baltimore Ravens             1          326
5  San Franciso 49ers             0         1003
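Note that overwriting 'First Season' this way discards the original years. If they are still needed, one option is to write the indicator to a separate column. A minimal sketch, where the column name 'Recent' and the 1996 value are assumptions for illustration:

```python
import pandas as pd

# Hypothetical two-row frame.
df = pd.DataFrame({
    'Team': ['Dallas Cowboys', 'Baltimore Ravens'],
    'First Season': [1960, 1996],
})

# The comparison yields a boolean Series ...
is_recent = df['First Season'] > 1990

# ... and astype(int) turns True/False into 1/0.
# 'Recent' is a made-up column name so the years are preserved.
df['Recent'] = is_recent.astype(int)

print(df)
```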
Replace Pandas DataFrame column values based on containing dictionary keys
You could use DataFrame.replace with the regex parameter set to True and pass the mapping dictionary.
df.replace(dictionary, regex=True)
# col2
# 0 5
# 1 abc
# 2 8
This usage of df.replace is less well known. You can read more about it in the pandas documentation for DataFrame.replace.
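The dictionary itself is not shown above, so the following is only a sketch with made-up data and patterns. The keys are treated as regular expressions; here each pattern is anchored to cover the whole cell, so a cell that contains the keyword is replaced entirely:

```python
import pandas as pd

# Hypothetical data and mapping, not the ones from the original question.
df = pd.DataFrame({'col2': ['green apple pie', 'abc', 'old car park']})

# Keys are regex patterns; '^.*apple.*$' matches any cell containing 'apple'.
dictionary = {r'^.*apple.*$': 'fruit', r'^.*car.*$': 'vehicle'}

out = df.replace(dictionary, regex=True)

print(out)
```

Cells that match none of the patterns (here 'abc') are left unchanged.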
Replace specific column values in pandas dataframe
Use df.replace:
import pandas as pd
df = pd.DataFrame({'Tissues':['a1','x2','y3','b','c1','v2','w3'], 'M':[1,2,'a',4,'b','a',7]})
# map the old values of column 'M' to their replacements
replace_values = {'a':2, 'b':3}
df['M'] = df['M'].replace(replace_values)
Output:
>>> df
Tissues M
0 a1 1
1 x2 2
2 y3 2
3 b 4
4 c1 3
5 v2 2
6 w3 7
Replace column value of Dataframe based on a condition on another Dataframe
You can also try with map:
df_student['student_Id'] = (
    df_student['student_Id'].map(df_updated_id.set_index('old_id')['new_id'])
                            .fillna(df_student['student_Id'])
)
print(df_student)
# Output
     Name  gender  math score student_Id
0    John    male          50       1234
1     Jay    male         100       6788
2  sachin    male          70        xyz
3  Geetha  female          80       abcd
4  Amutha  female          75       83ko
5  ganesh    male          40       v432
Update
I believe the updated_id isn't unique, so I need to further pre-process the data.
In this case, you could drop duplicates first, assuming the last value (keep='last') is the most recent one for a given old_id:
sr = df_updated_id.drop_duplicates('old_id', keep='last') \
                  .set_index('old_id')['new_id']
df_student['student_Id'] = df_student['student_Id'].map(sr) \
                                                   .fillna(df_student['student_Id'])
Note: this is exactly what @BENY's answer does. Because it creates a dict, only the last occurrence of an old_id is kept. However, if you want to keep the first value that appears, that code doesn't work. With drop_duplicates, you can adjust the keep parameter.
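To see the effect of the keep parameter, here is a small sketch with made-up frames in which old_id '1234' appears twice. The helper name remap and all the values are assumptions for illustration:

```python
import pandas as pd

# Hypothetical data: '1234' has two candidate new ids.
df_student = pd.DataFrame({'Name': ['John', 'Jay', 'sachin'],
                           'student_Id': ['1234', '6788', 'xyz']})
df_updated_id = pd.DataFrame({'old_id': ['1234', '1234', 'xyz'],
                              'new_id': ['a1', 'a2', 'z9']})

def remap(keep):
    """Map old ids to new ids, resolving duplicate old ids with `keep`."""
    sr = (df_updated_id.drop_duplicates('old_id', keep=keep)
                       .set_index('old_id')['new_id'])
    # Ids without a mapping come back as NaN; fillna restores the original id.
    return df_student['student_Id'].map(sr).fillna(df_student['student_Id'])

first = remap('first')
last = remap('last')

print(first.tolist())
print(last.tolist())
```

Only the duplicated id differs between the two results; '6788' has no mapping and is kept as is in both.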
Replace values in column based on same or closer values from another columns pandas
First we find the value in df1['Score1'] that is closest to each value in df2['Score1'], and put it into df2['match']:
df2['match'] = df2['Score1'].apply(lambda s : min(df1['Score1'].values, key = lambda x: abs(x-s)))
df2 now looks like this:
Score1 life match
0 3.033986 0 2.29100
1 9.103820 0 9.10382
2 9.103820 0 9.10382
3 7.350981 0 9.10382
4 1.443400 0 2.29100
5 9.103820 0 9.10382
6 -1.134486 0 -1.34432
Now we just merge on match, drop unneeded columns and rename others:
(df2[['match', 'Score1']].merge(df1, how = 'left', left_on = 'match', right_on = 'Score1', suffixes = ['','_2'])
                         .rename(columns = {'Avg_life':'life'})
                         .drop(columns = ['match', 'Score1_2'])
)
output
Score1 life
0 3.033986 432.0
1 9.103820 758.0
2 9.103820 758.0
3 7.350981 758.0
4 1.443400 432.0
5 9.103820 758.0
6 -1.134486 68000.0
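As an alternative to the apply/min step (a different technique, not the one used in the answer above), pandas offers merge_asof with direction='nearest', which does the nearest-key lookup in one call on sorted inputs. The sketch below assumes df1 has the columns Score1 and Avg_life, with values reconstructed from the outputs above:

```python
import pandas as pd

# Hypothetical frames; df1's contents are inferred from the 'match'
# and 'life' values shown above.
df1 = pd.DataFrame({'Score1': [-1.34432, 2.29100, 9.10382],
                    'Avg_life': [68000, 432, 758]})
df2 = pd.DataFrame({'Score1': [3.033986, 9.103820, 7.350981, -1.134486],
                    'life': [0, 0, 0, 0]})

# merge_asof needs both sides sorted by their keys; keep df2's
# original row order in an 'index' column so it can be restored.
left = df2[['Score1']].reset_index().sort_values('Score1')
right = (df1.sort_values('Score1')
            .rename(columns={'Score1': 'match', 'Avg_life': 'life'}))

result = (
    pd.merge_asof(left, right, left_on='Score1', right_on='match',
                  direction='nearest')
    .set_index('index')
    .sort_index()
    .drop(columns='match')
)

print(result)
```

This avoids scanning all of df1 once per row of df2, which may matter for larger frames.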