Replacing Column Values in a Pandas Dataframe

Replacing column values in a pandas DataFrame

If I understand right, you want something like this:

w['female'] = w['female'].map({'female': 1, 'male': 0})

(Here I convert the values to numbers instead of strings containing numbers. You can convert them to "1" and "0", if you really want, but I'm not sure why you'd want that.)

The reason your code doesn't work is that using ['female'] on a column (the second 'female' in your w['female']['female']) doesn't mean "select rows where the value is 'female'". It means "select rows where the index is 'female'", of which there may not be any in your DataFrame.
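
As a self-contained sketch (the w DataFrame and its values below are assumptions based on the question):

import pandas as pd

# hypothetical DataFrame with a 'female' column holding string labels
w = pd.DataFrame({'female': ['female', 'male', 'male', 'female']})

# map each label to a numeric code; values not in the dict would become NaN
w['female'] = w['female'].map({'female': 1, 'male': 0})
print(w['female'].tolist())  # [1, 0, 0, 1]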

pandas: replace column value with keys and values in a dictionary of list values

The best approach is to change the logic and try to minimize the pandas steps.

You can craft a dictionary that will directly contain your ideal output:

dic2 = {v:k for k,l in dic.items() for v in l}
# {'can': 'Should', 'could': 'Should', 'shall': 'Could', 'will': 'Would'}

# or if not yet formatted:
# dic2 = {v.lower():k.capitalize() for k,l in dic.items() for v in l}

import re
regex = '|'.join(map(re.escape, dic2))

df['text'] = df['text'].str.replace(rf'\b({regex})\b',
                                    lambda m: dic2.get(m.group()),
                                    case=False,  # only if case doesn't matter
                                    regex=True)

output (as text2 column for clarity):

                           text                          text2
0        can you open the door?     Should you open the door?
1  shall you write the address?  Could you write the address?
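
Putting the pieces together, here is a hedged, runnable sketch; the dic and df below are assumptions reconstructed from the question and the output above:

import re
import pandas as pd

# assumed inputs
dic = {'Should': ['can', 'could'], 'Could': ['shall'], 'Would': ['will']}
df = pd.DataFrame({'text': ['can you open the door?',
                            'shall you write the address?']})

# invert the dict: each word maps to the label it belongs to
dic2 = {v: k for k, l in dic.items() for v in l}

# alternation pattern over all words, with regex metacharacters escaped
regex = '|'.join(map(re.escape, dic2))

# replace each whole-word match with its label; .lower() keeps the lookup
# safe when matching case-insensitively
df['text2'] = df['text'].str.replace(rf'\b({regex})\b',
                                     lambda m: dic2.get(m.group().lower(), m.group()),
                                     case=False,
                                     regex=True)
print(df['text2'].tolist())
# ['Should you open the door?', 'Could you write the address?']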

Replacing few values in a pandas dataframe column with another value

The easiest way is to use the replace method on the column. The arguments are a list of the things you want to replace (here ['ABC', 'AB']) and what you want to replace them with (the string 'A' in this case):

>>> df['BrandName'].replace(['ABC', 'AB'], 'A')
0    A
1    B
2    A
3    D
4    A

This creates a new Series of values so you need to assign this new column to the correct column name:

df['BrandName'] = df['BrandName'].replace(['ABC', 'AB'], 'A')
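
For context, a minimal reproduction; the BrandName values below are assumptions inferred from the output shown:

import pandas as pd

# assumed sample data consistent with the output above
df = pd.DataFrame({'BrandName': ['ABC', 'B', 'AB', 'D', 'A']})

# replace both 'ABC' and 'AB' with 'A' and assign the result back
df['BrandName'] = df['BrandName'].replace(['ABC', 'AB'], 'A')
print(df['BrandName'].tolist())  # ['A', 'B', 'A', 'D', 'A']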

Pandas DataFrame: replace all values in a column, based on condition

You need to select that column:

In [41]:
df.loc[df['First Season'] > 1990, 'First Season'] = 1
df

Out[41]:
                 Team  First Season  Total Games
0      Dallas Cowboys          1960          894
1       Chicago Bears          1920         1357
2   Green Bay Packers          1921         1339
3      Miami Dolphins          1966          792
4    Baltimore Ravens             1          326
5  San Franciso 49ers          1950         1003

So the syntax here is:

df.loc[<mask>, <optional column(s)>]

where <mask> is a boolean condition that generates the row labels to index.

You can check the docs and also the 10 Minutes to pandas guide, which shows the semantics.
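
To make the mask explicit, here is a minimal sketch; the two-row DataFrame and its values are assumptions, not the asker's data:

import pandas as pd

# hypothetical two-row version of the table above
df = pd.DataFrame({'Team': ['Dallas Cowboys', 'Baltimore Ravens'],
                   'First Season': [1960, 1996]})

mask = df['First Season'] > 1990    # boolean Series: [False, True]
df.loc[mask, 'First Season'] = 1    # only rows where the mask is True are updated
print(df['First Season'].tolist())  # [1960, 1]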

EDIT

If you want to generate a boolean indicator, you can use the boolean condition to generate a boolean Series and cast the dtype to int; this converts True and False to 1 and 0 respectively:

In [43]:
df['First Season'] = (df['First Season'] > 1990).astype(int)
df

Out[43]:
                 Team  First Season  Total Games
0      Dallas Cowboys             0          894
1       Chicago Bears             0         1357
2   Green Bay Packers             0         1339
3      Miami Dolphins             0          792
4    Baltimore Ravens             1          326
5  San Franciso 49ers             0         1003

Replace Pandas DataFrame column values based on containing dictionary keys

You could use DataFrame.replace with regex parameter set to True and pass the mapping dictionary.

df.replace(dictionary, regex=True)

#   col2
# 0    5
# 1  abc
# 2    8

This usage of df.replace is less well known. You can read more about it in the pandas documentation for DataFrame.replace.
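
As a self-contained sketch (the dictionary and the col2 values below are assumptions reconstructed from the output above):

import pandas as pd

# assumed inputs; with regex=True each dictionary key is treated as a
# pattern and replaced wherever it occurs inside the string values
df = pd.DataFrame({'col2': ['word1', 'abc', 'word2']})
dictionary = {'word1': '5', 'word2': '8'}

print(df.replace(dictionary, regex=True))
#   col2
# 0    5
# 1  abc
# 2    8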

Replace specific column values in pandas dataframe

Use df.replace:

import pandas as pd

df = pd.DataFrame({'Tissues': ['a1', 'x2', 'y3', 'b', 'c1', 'v2', 'w3'],
                   'M': [1, 2, 'a', 4, 'b', 'a', 7]})

replace_values = {'a': 2, 'b': 3}

df['M'] = df['M'].replace(replace_values)

Output:

>>> df
  Tissues  M
0      a1  1
1      x2  2
2      y3  2
3       b  4
4      c1  3
5      v2  2
6      w3  7

Replace column value of Dataframe based on a condition on another Dataframe

You can also try with map:

df_student['student_Id'] = (
    df_student['student_Id'].map(df_updated_id.set_index('old_id')['new_id'])
                            .fillna(df_student['student_Id'])
)
print(df_student)

# Output
     Name  gender  math score student_Id
0    John    male          50       1234
1     Jay    male         100       6788
2  sachin    male          70        xyz
3  Geetha  female          80       abcd
4  Amutha  female          75       83ko
5  ganesh    male          40       v432
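
For reference, a minimal sketch of the same pattern with made-up data (the names, ids, and the mapping table below are assumptions, not the asker's actual tables):

import pandas as pd

# assumed example tables
df_student = pd.DataFrame({'Name': ['John', 'Jay', 'sachin'],
                           'student_Id': ['A1', 'B2', 'C3']})
df_updated_id = pd.DataFrame({'old_id': ['A1', 'C3'],
                              'new_id': ['1234', 'xyz']})

# Series indexed by old_id; map old ids to new ids and keep the original
# id wherever no mapping exists
mapping = df_updated_id.set_index('old_id')['new_id']
df_student['student_Id'] = (df_student['student_Id'].map(mapping)
                            .fillna(df_student['student_Id']))
print(df_student['student_Id'].tolist())  # ['1234', 'B2', 'xyz']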

Update

"I believe the updated_id isn't unique, so I need to further pre-process the data."

In this case, you could drop duplicates first, keeping the last value (keep='last') as the most recent one for a given old_id:

sr = (df_updated_id.drop_duplicates('old_id', keep='last')
                   .set_index('old_id')['new_id'])

df_student['student_Id'] = (df_student['student_Id'].map(sr)
                            .fillna(df_student['student_Id']))

Note: this is essentially what @BENY's answer does. Because he builds a dict, only the last occurrence of each old_id is kept. However, if you want to keep the first value that appears, his code doesn't work; with drop_duplicates, you can adjust the keep parameter.
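
To illustrate the difference the keep parameter makes, here is a small sketch with a made-up, duplicated mapping table:

import pandas as pd

# assumed mapping table where old_id 'A1' appears twice
df_updated_id = pd.DataFrame({'old_id': ['A1', 'A1', 'B2'],
                              'new_id': ['first', 'latest', 'x']})

# keep='last' retains the most recent row per old_id, keep='first' the earliest
last = df_updated_id.drop_duplicates('old_id', keep='last').set_index('old_id')['new_id']
first = df_updated_id.drop_duplicates('old_id', keep='first').set_index('old_id')['new_id']

print(last['A1'])   # latest
print(first['A1'])  # first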

Replace values in column based on same or closer values from another columns pandas

First we find the value of df1['Score1'] that is the closest to each value in df2['Score1'], and put it into df2['match']:

df2['match'] = df2['Score1'].apply(lambda s: min(df1['Score1'].values, key=lambda x: abs(x - s)))

df2 now looks like this:


     Score1  life    match
0  3.033986     0  2.29100
1  9.103820     0  9.10382
2  9.103820     0  9.10382
3  7.350981     0  9.10382
4  1.443400     0  2.29100
5  9.103820     0  9.10382
6 -1.134486     0 -1.34432

Now we just merge on match, drop the unneeded columns, and rename the others:

(df2[['match', 'Score1']]
    .merge(df1, how='left', left_on='match', right_on='Score1', suffixes=['', '_2'])
    .rename(columns={'Avg_life': 'life'})
    .drop(columns=['match', 'Score1_2'])
)

output:

     Score1     life
0  3.033986    432.0
1  9.103820    758.0
2  9.103820    758.0
3  7.350981    758.0
4  1.443400    432.0
5  9.103820    758.0
6 -1.134486  68000.0
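
If you want to run the whole thing end-to-end, here is a hedged, self-contained sketch; df1 and df2 below are assumptions loosely reconstructed from the tables shown above:

import pandas as pd

# assumed data: df1 holds the reference scores and lifetimes,
# df2 holds the scores whose 'life' we want to fill in
df1 = pd.DataFrame({'Score1': [2.29100, 9.10382, -1.34432],
                    'Avg_life': [432.0, 758.0, 68000.0]})
df2 = pd.DataFrame({'Score1': [3.033986, 9.103820, 7.350981, -1.134486],
                    'life': [0, 0, 0, 0]})

# for each Score1 in df2, find the closest Score1 in df1
df2['match'] = df2['Score1'].apply(
    lambda s: min(df1['Score1'].values, key=lambda x: abs(x - s)))

# merge on the matched score, then tidy up the columns
out = (df2[['match', 'Score1']]
       .merge(df1, how='left', left_on='match', right_on='Score1', suffixes=['', '_2'])
       .rename(columns={'Avg_life': 'life'})
       .drop(columns=['match', 'Score1_2']))
print(out['life'].tolist())  # [432.0, 758.0, 758.0, 68000.0]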

