How to Replace Nan Values Where the Other Columns Meet a Certain Criteria

How to replace NaN values where the other columns meet a certain criteria?

I am trying to replace the unknown age of male, 1st class passengers
with the average age of male, 1st class passengers.

You can split the problem into 2 steps. First calculate the average age of male, 1st class passengers:

mask = (df['Pclass'] == 1) & (df['Sex'] == 'male')
avg_filler = df.loc[mask, 'Age'].mean()

Then update values satisfying your criteria:

df.loc[df['Age'].isnull() & mask, 'Age'] = avg_filler

How to replace nan values of a column based on certain values of other column

You can make a dictionary here:

rep_nan = {
    'bachelor': 'tacher',
    'blabla': 'blabla',
    'high school': 'actor'
}

Then we can replace the nan values with:

df.loc[df['col2'].isnull(), 'col2'] = df[df['col2'].isnull()]['col1'].replace(rep_nan)

For example:

>>> df
          col1   col2
0     bachelor   None
1     bachelor  clown
2       blabla   None
3  high school   None
>>> df.loc[df['col2'].isnull(), 'col2'] = df[df['col2'].isnull()]['col1'].replace(rep_nan)
>>> df
          col1    col2
0     bachelor  tacher
1     bachelor   clown
2       blabla  blabla
3  high school   actor

Replace NaN Values with the Means of other Cols based on Condition

You could implement the function like this:

def replace_missing_with_conditional_mean(df, condition_cols, cols):
    s = df.groupby(condition_cols)[cols].transform('mean')
    return df.fillna(s.to_dict('series'))


res = replace_missing_with_conditional_mean(df, ['Col1', 'Col2'], ['Col3'])
print(res)

Output

  Col1 Col2  Col3
0    A    c   1.0
1    A    c   3.0
2    B    c   5.0
3    A    d   6.0
4    A    c   2.0

Python Pandas replace NaN in one column with value from corresponding row of second column

Assuming your DataFrame is in df:

df.Temp_Rating.fillna(df.Farheit, inplace=True)
del df['Farheit']
df.columns = 'File heat Observations'.split()

First replace any NaN values with the corresponding value of df.Farheit. Delete the 'Farheit' column. Then rename the columns. Here's the resulting DataFrame:

resulting DataFrame

Filling nan if row in another column meets condition

Assumung the spaces are np.nan , if not, you can replace then by df=df.replace('',np.nan) you can use numpy.where() for faster results:

df.country=np.where(df.state.isin(states),df.country.fillna('USA'),df.country)
print(df)

  state country
0    AL     USA
1    WI     USA
2    FL     USA
3    NJ     USA
4    BM     NaN

How can I replace NaN values based on multiple conditions?

You could create a replacement_value: index_mask mapping using a dictionary and then iterate over it, like so:

>>> masks = {1: (df['B'] >= 10) & (df['B'] < 20) & df['C'].isnull(), 2: (df['B'] >= 20) & (df['B'] < 30) & df['C'].isnull(), 3: (df['B'] >= 30) & df['C'].isnull()}
>>> masks
{1: 0    False
1    False
2     True
3    False
dtype: bool, 2: 0    False
1     True
2    False
3    False
dtype: bool, 3: 0    False
1    False
2    False
3    False
dtype: bool}
>>> for replacement_value, mask in masks.items():
...     df.loc[mask, 'C'] = replacement_value
... 
>>> df
    A   B    C
0  10  12  1.0
1  12  24  2.0
2  30  16  1.0
3  21  31  4.0

Note that I made the between conditions exclusive on the upper bound, i.e. to replace with 1 the value for df['B'] needs to be in the range [10, 20)]; to replace with 2 [20, 30), etc., because otherwise you have overlapping bounds.

pandas replace np.nan based on multiple conditions

Try rewriting your np.where statement:

df['is_less'] = np.where( (df['A'].isnull()) | (df['B'].isnull() ),np.nan, # check if A or B are np.nan
                         np.where(df['B'].ge(df['A']),'no','yes'))        # check if B >= A

prints:

      A     B is_less
0   NaN  10.0     nan
1  10.0   NaN     nan
2   1.0   5.0      no
3   5.0   1.0     yes

Greater than or equal

pandas.ge

Pandas: How to replace values to np.nan based on Condition for multiple columns

First we create a dictionary from your two lists using zip

replace_dict = dict(zip(list1,list2))

then we loop over it to handle your assignments,

for k,v in replace_dict.items():
    df.loc[df[k] == 0, v] = np.nan

print(df)
   I  A  B  C    D  E    F
0  1  9  4  0    T  F  NaN
1  2  0  5  1  NaN  X    J
2  3  1  8  0    G  G  NaN

another method is to use np.where with your lists.

df[list2] = np.where(df[list1].eq(0), np.nan,df[list2])

print(df)

   I  A  B  C    D  E    F
0  1  9  4  0    T  F  NaN
1  2  0  5  1  NaN  X    J
2  3  1  8  0    G  G  NaN

Python: how to replace NaN with conditions in a dataframe?

you can use groupby to do this:

fill_value = df.groupby("node_i")["value_j"].mean().fillna(1.0)
df["w"] = fill_value.reindex(df["node_i"]).values
df["w"][df["value_j"].notnull()] = df["value_j"][df["value_j"].notnull()]

How to replace a dataframe column values with NaN based on a condition?

Using where and between

df.Age=df.Age.where(df.Age.between(5,100))
df
   ID   Age
0  10   NaN
1  20   NaN
2  30  25.0

How to Replace Nan Values Where the Other Columns Meet a Certain Criteria