Fillna in Multiple Columns in Place in Python Pandas

You can use apply over the columns and check whether each one is numeric by inspecting dtype.kind:

res = df.apply(lambda x: x.fillna(0) if x.dtype.kind in 'biufc' else x.fillna('.'))

print(res)
A B City Name
0 1.0 0.25 Seattle Jack
1 2.1 0.00 SF Sue
2 0.0 0.00 LA .
3 4.7 4.00 OC Bob
4 5.6 12.20 . Alice
5 6.8 14.40 . John
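
A self-contained sketch of the dtype.kind check above. The sample frame is an assumption made for illustration, since the original input is not shown. The string 'biufc' covers the boolean, signed integer, unsigned integer, float, and complex kinds.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the original question's frame is not shown.
df = pd.DataFrame({
    "A": [1.0, np.nan, 4.7],
    "City": ["Seattle", None, "LA"],
})

# Numeric kinds (bool, int, uint, float, complex) get 0; everything else gets '.'.
res = df.apply(lambda col: col.fillna(0) if col.dtype.kind in "biufc" else col.fillna("."))
print(res)
```

apply returns a new frame here, so df itself still contains its missing values.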

Using fillna method on multiple columns of a Pandas DataFrame failed

These answers assume that the OP wants an in-place edit of an existing DataFrame. I usually just overwrite the existing DataFrame with a new one instead.


Use pandas.DataFrame.fillna with a dict

Pandas fillna allows us to pass a dictionary that specifies which columns will be filled in and with what.

So this will work

a.fillna({'a': 0, 'b': 0})

a b c
0 1.0 5.0 5
1 2.0 0.0 1
2 0.0 6.0 5
3 0.0 0.0 2

An in-place edit is possible with:

a.fillna({'a': 0, 'b': 0}, inplace=True)

NOTE: I would have simply reassigned the result instead: a = a.fillna({'a': 0, 'b': 0})

It doesn't save any typing, but we could get cute using dict.fromkeys:

a.fillna(dict.fromkeys(['a', 'b'], 0), inplace=True)
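
A runnable sketch of the dict-based fill. The input frame is an assumption, reconstructed from the output shown above on the guess that each filled 0.0 was originally NaN.

```python
import numpy as np
import pandas as pd

# Assumed input, reconstructed from the output shown above.
a = pd.DataFrame({
    "a": [1.0, 2.0, np.nan, np.nan],
    "b": [5.0, np.nan, 6.0, np.nan],
    "c": [5, 1, 5, 2],
})

fill_values = dict.fromkeys(["a", "b"], 0)  # same as {'a': 0, 'b': 0}
filled = a.fillna(fill_values)              # returns a new frame; `a` is untouched
print(filled)
```

Columns that are not keys of the dict (here c) are left as they are.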

loc

We can use the same format as the OP but place it in the correct columns using loc

a.loc[:, ['a', 'b']] = a[['a', 'b']].fillna(0)

a

a b c
0 1.0 5.0 5
1 2.0 0.0 1
2 0.0 6.0 5
3 0.0 0.0 2

pandas.DataFrame.update

update is designed for in-place edits: it overwrites values in the calling DataFrame with the non-null values of another DataFrame.

a.update(a[['a', 'b']].fillna(0))

a

a b c
0 1.0 5.0 5
1 2.0 0.0 1
2 0.0 6.0 5
3 0.0 0.0 2
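
A small sketch of how update behaves, using an assumed frame in which column c also has a missing value, to show that only the columns passed in are touched.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; column c has a NaN that should survive the update.
a = pd.DataFrame({
    "a": [1.0, 2.0, np.nan, np.nan],
    "b": [5.0, np.nan, 6.0, np.nan],
    "c": [5.0, np.nan, 5.0, 2.0],
})

# update() modifies `a` in place and returns None.
result = a.update(a[["a", "b"]].fillna(0))
print(a)
```

Because update returns None, writing a = a.update(...) would throw the frame away.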

Iterate column by column

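I really don't like this approach because it is unnecessarily verbose. Assign each column back rather than calling fillna(0, inplace=True) on a[col]: that chained in-place call acts on an intermediate object, triggers a warning in pandas 2.x, and leaves a unchanged under Copy-on-Write.

for col in ['a', 'b']:
    a[col] = a[col].fillna(0)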

a

a b c
0 1.0 5.0 5
1 2.0 0.0 1
2 0.0 6.0 5
3 0.0 0.0 2

fillna with a dataframe

Use the result of a[['a', 'b']].fillna(0) as the input for another fillna. In my opinion, this is silly. Just use the first option.

a.fillna(a[['a', 'b']].fillna(0), inplace=True)

a

a b c
0 1.0 5.0 5
1 2.0 0.0 1
2 0.0 6.0 5
3 0.0 0.0 2

Pandas fillna multiple columns with values from corresponding columns without repeating for each

  • You can pass **kwargs to assign().
  • Build the kwargs dict with a comprehension that maps each "_x" column to itself, filled from the matching "_y" column.

import pandas as pd
import numpy as np

x = pd.DataFrame({'col1_x': [15, np.nan, 136, 93, 743, np.nan, np.nan, 91],
                  'col2_x': [np.nan, np.nan, 51, 22, 38, np.nan, 72, np.nan],
                  'col1_y': [10, 20, 30, 40, 50, 60, 70, 80],
                  'col2_y': [93, 24, 52, 246, 142, 53, 94, 2]})

x.assign(**{c: x[c].fillna(x[c.replace("_x", "_y")]) for c in x.columns if "_x" in c})

   col1_x  col2_x  col1_y  col2_y
0      15      93      10      93
1      20      24      20      24
2     136      51      30      52
3      93      22      40     246
4     743      38      50     142
5      60      53      60      53
6      70      72      70      94
7      91       2      80       2
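
A variation on the comprehension above, sketched under the assumption that the pairing is always a trailing "_x"/"_y" suffix. The pairs dict is a hypothetical helper name. Matching with endswith avoids picking up a column that merely contains "_x" somewhere in the middle of its name.

```python
import numpy as np
import pandas as pd

# Shortened version of the sample frame above.
x = pd.DataFrame({
    "col1_x": [15, np.nan, 136],
    "col2_x": [np.nan, np.nan, 51],
    "col1_y": [10, 20, 30],
    "col2_y": [93, 24, 52],
})

# Map each "*_x" column to its "*_y" partner; only the trailing suffix is swapped.
pairs = {c: c[: -len("_x")] + "_y" for c in x.columns if c.endswith("_x")}

# assign() returns a new frame; `x` keeps its NaN values.
filled = x.assign(**{cx: x[cx].fillna(x[cy]) for cx, cy in pairs.items()})
print(filled)
```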

Fillna by relating multiple columns using a function

You can use np.select with str.contains conditions:

conditions = {
    30: df.id.str.contains('[^7][ABC]$'),
    50: df.id.str.contains('7[ABC]$'),
    20: df.id.str.contains('[EFG]$'),
    10: df.id.str.contains('[OMN]$'),
}
# np.select expects sequences, so convert the dict views to lists
df.price = np.select(list(conditions.values()), list(conditions.keys()))

# object id price
# 0 laptop 24A 30
# 1 laptop 37C 50
# 2 laptop 21O 10
# 3 laptop 17C 50
# 4 laptop 55A 30
# 5 laptop 34N 10
# 6 laptop 05E 20
# 7 laptop 29B 30
# 8 laptop 22M 10
# 9 laptop 62F 20
# 10 laptop 23G 20
# 11 laptop 61O 10
# 12 laptop 27A 50

You could also use loc masking if you want to use fillna:

for price, condition in conditions.items():
    df.loc[condition, 'price'] = df.loc[condition, 'price'].fillna(price)
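
Note that np.select with its default of 0 rewrites the whole price column. If the goal is to fill only the missing prices in one vectorized step, one option is to compute the rule-based prices with default=np.nan and pass them to fillna. This is a sketch on an assumed four-row frame; rule_price is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one price is already known and should be kept.
df = pd.DataFrame({
    "object": ["laptop"] * 4,
    "id": ["24A", "37C", "21O", "05E"],
    "price": [np.nan, 45.0, np.nan, np.nan],
})

conditions = {
    30: df["id"].str.contains("[^7][ABC]$"),
    50: df["id"].str.contains("7[ABC]$"),
    20: df["id"].str.contains("[EFG]$"),
    10: df["id"].str.contains("[OMN]$"),
}

# Price implied by the rules for every row; NaN where no rule matches.
rule_price = np.select(list(conditions.values()), list(conditions.keys()), default=np.nan)

# Only rows whose price is missing take the rule-based value.
df["price"] = df["price"].fillna(pd.Series(rule_price, index=df.index))
print(df)
```

Row 1 matches the 50 rule but keeps its existing price of 45.0.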


Update 1

If you want to further restrict by df.object, you can add the df.object condition with &:

conditions = {
    30: df.object.eq('laptop') & df.id.str.contains('[^7][ABC]$'),
    50: df.object.eq('laptop') & df.id.str.contains('7[ABC]$'),
    20: df.object.eq('laptop') & df.id.str.contains('[EFG]$'),
    10: df.object.eq('laptop') & df.id.str.contains('[OMN]$'),
    1000: df.object.eq('phone') & df.id.str.contains('[OMN]$'),
}


Update 2

If you really want to use a function, you can apply along rows (axis=1), but row-apply is much slower and not advised when you have vectorized options like np.select:

def price(row):
    result = np.nan
    if row.object == 'laptop':
        if row.id[-2:] in ['7A', '7B', '7C']:
            result = 50
        elif row.id[-1] in list('ABC'):
            result = 30
        elif row.id[-1] in list('EFG'):
            result = 20
        elif row.id[-1] in list('OMN'):
            result = 10
    elif row.object == 'phone':
        if row.id[-2:] in ['7A', '7B', '7C']:
            result = 5000
        ...
    return result

df.price = df.apply(price, axis=1)

How do I fill NA values in multiple columns in pandas?

You can use update():

In [145]: df
Out[145]:
a b c d e
0 NaN NaN NaN 3 8
1 NaN NaN NaN 8 7
2 NaN NaN NaN 2 8
3 NaN NaN NaN 7 4
4 NaN NaN NaN 4 9
5 NaN NaN NaN 1 9
6 NaN NaN NaN 7 7
7 NaN NaN NaN 6 5
8 NaN NaN NaN 0 0
9 NaN NaN NaN 9 5

In [146]: df.update(df[['a','b','c']].fillna(0))

In [147]: df
Out[147]:
a b c d e
0 0.0 0.0 0.0 3 8
1 0.0 0.0 0.0 8 7
2 0.0 0.0 0.0 2 8
3 0.0 0.0 0.0 7 4
4 0.0 0.0 0.0 4 9
5 0.0 0.0 0.0 1 9
6 0.0 0.0 0.0 7 7
7 0.0 0.0 0.0 6 5
8 0.0 0.0 0.0 0 0
9 0.0 0.0 0.0 9 5

Fill Na in multiple columns with values from another column within the pandas data frame

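1) How do I fill NA values in columns B and C using values from column A of the given data frame?

Because DataFrame.fillna with a Series aligns the Series on column labels rather than on the row index, filling from another column row by row is not directly supported. One possible solution is a double transpose: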

df[['B','C']] = df[['B','C']].T.fillna(df['A']).T
print (df)
A B C D E
0 0.1 2.0 55.0 0 NaN
1 0.2 4.0 0.2 1 99.0
2 0.3 0.3 22.0 5 88.0
3 0.4 0.4 0.4 4 77.0

Or:

m = df[['B','C']].isna()
df[['B','C']] = df[['B','C']].mask(m, m.astype(int).mul(df['A'], axis=0))
print (df)
A B C D E
0 0.1 2.0 55.0 0 NaN
1 0.2 4.0 0.2 1 99.0
2 0.3 0.3 22.0 5 88.0
3 0.4 0.4 0.4 4 77.0
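
A runnable sketch comparing the double transpose with a column-wise alternative that is not part of the answer above: applying Series.fillna per column, which aligns on the row index. The input frame is an assumption, reconstructed from the output shown above.

```python
import numpy as np
import pandas as pd

# Assumed input, reconstructed from the output shown above.
df = pd.DataFrame({
    "A": [0.1, 0.2, 0.3, 0.4],
    "B": [2.0, 4.0, np.nan, np.nan],
    "C": [55.0, np.nan, 22.0, np.nan],
})

# Double transpose, as in the answer.
via_transpose = df[["B", "C"]].T.fillna(df["A"]).T

# Alternative: Series.fillna(Series) aligns on the row index, so apply it per column.
via_apply = df[["B", "C"]].apply(lambda col: col.fillna(df["A"]))

print(via_apply)
```

Both produce the same values here. The apply version avoids transposing, which can upcast dtypes when the selected columns have mixed types.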

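2) Also, why is inplace not working when using fillna on a subset of the data frame?

I think the reason is chained assignment: selecting the subset creates a copy, so the in-place fill never reaches the original data frame. You need to assign the result back.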
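
A minimal sketch of that behaviour on a hypothetical two-row frame. Depending on the pandas version, the chained call may emit a warning, which is silenced here only to keep the demo output clean.

```python
import warnings

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [0.1, 0.2], "B": [np.nan, 4.0], "C": [55.0, np.nan]})

with warnings.catch_warnings():
    # pandas may warn about the chained in-place call; ignored for this demo.
    warnings.simplefilter("ignore")
    # df[['B', 'C']] creates a new object, so only that temporary copy is filled.
    df[["B", "C"]].fillna(0, inplace=True)

nans_after_inplace = int(df[["B", "C"]].isna().sum().sum())

# Assigning the result back writes the filled values into df.
df[["B", "C"]] = df[["B", "C"]].fillna(0)
nans_after_assign = int(df[["B", "C"]].isna().sum().sum())

print(nans_after_inplace, nans_after_assign)
```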

3) How do I forward fill along the rows (is it implemented)?

Forward filling works nicely, as long as you assign the result back. Use ffill directly, since fillna(method='ffill') is deprecated in recent pandas versions:

df1 = df.ffill(axis=1)
print (df1)
A B C D E
0 0.1 2.0 55.0 0.0 0.0
1 0.2 4.0 4.0 1.0 99.0
2 0.3 0.3 22.0 5.0 88.0
3 0.4 0.4 0.4 4.0 77.0

df2 = df.ffill(axis=0)
print (df2)
A B C D E
0 0.1 2.0 55.0 0 NaN
1 0.2 4.0 55.0 1 99.0
2 0.3 4.0 22.0 5 88.0
3 0.4 4.0 22.0 4 77.0
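
A small sketch of forward filling in both directions with DataFrame.ffill, on a hypothetical frame chosen so that the two axes give visibly different results.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1.0, 2.0],
    "B": [np.nan, 5.0],
    "C": [np.nan, np.nan],
})

along_rows = df.ffill(axis=1)  # carry values left to right within each row
down_cols = df.ffill(axis=0)   # carry values top to bottom within each column

print(along_rows)
print(down_cols)
```

With axis=0, a leading NaN has nothing above it to copy from, so column C and the first value of B stay missing.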

Pandas: If condition on multiple columns having null values and fillna with 0

You can assign the filled values back with update. Rows in which both columns are null are excluded, so they stay NaN:

c = ['Age','Salary']
df.update(df.loc[~df[c].isna().all(axis=1), c].fillna(0))

df
Out[341]:
Age Salary
0 0.0 217.0
1 NaN NaN
2 22.0 262.0
3 0.0 352.0
4 50.0 570.0
5 99.0 0.0
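
The same steps as a self-contained sketch. The input frame is an assumption, reconstructed from the output shown above, and all_missing is a hypothetical name for the intermediate mask.

```python
import numpy as np
import pandas as pd

# Assumed input, reconstructed from the output shown above.
df = pd.DataFrame({
    "Age": [np.nan, np.nan, 22.0, np.nan, 50.0, 99.0],
    "Salary": [217.0, np.nan, 262.0, 352.0, 570.0, np.nan],
})

c = ["Age", "Salary"]
all_missing = df[c].isna().all(axis=1)  # True only where every column in c is NaN

# Fill with 0 only in rows that have at least one real value, then write back in place.
df.update(df.loc[~all_missing, c].fillna(0))
print(df)
```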

Why does pandas fillna() inplace not work for multiple columns?

Chain the operations and assign the returned copy back, rather than modifying in place:

data[var_categor] = data.replace('?', np.nan)[var_categor].fillna('Missing')
>>> data[var_categor].isna().sum()
sex 0
cabin 0
embarked 0
dtype: int64
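
A runnable sketch of the chained version on hypothetical data, where '?' marks a missing value as in the snippet above. The column names and values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data; '?' and None both stand for missing values.
data = pd.DataFrame({
    "sex": ["male", "?", "female"],
    "cabin": ["C85", "?", None],
    "age": [22.0, 38.0, 26.0],
})
var_categor = ["sex", "cabin"]

# Turn '?' into NaN, fill the categorical columns, and assign the result back.
data[var_categor] = data.replace("?", np.nan)[var_categor].fillna("Missing")
print(data)
```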

Pandas : Fillna for all columns, except two

You can select which columns to use fillna on. Assuming you have 20 columns and you want to fill all of them except 'col1' and 'col2' you can create a list with the ones you want to fill:

f = [c for c in df.columns if c not in ['col1','col2']]
df[f] = df[f].fillna(df[f].mean())

print(df)

col1 col2 col3 col4 ... col17 col18 col19 col20
0 1.0 1.0 1.000000 1.0 ... 1.000000 1 1.000000 1
1 NaN NaN 2.666667 2.0 ... 2.000000 2 2.000000 2
2 NaN 3.0 3.000000 1.5 ... 2.333333 3 2.333333 3
3 4.0 4.0 4.000000 1.5 ... 4.000000 4 4.000000 4

(The 2.666667 filled into col3 is that column's mean.)


# Initial DF:

{'col1': {0: 1.0, 1: nan, 2: nan, 3: 4.0},
'col2': {0: 1.0, 1: nan, 2: 3.0, 3: 4.0},
'col3': {0: 1.0, 1: nan, 2: 3.0, 3: 4.0},
'col4': {0: 1.0, 1: 2.0, 2: nan, 3: nan},
'col5': {0: 1.0, 1: nan, 2: 3.0, 3: 4.0},
'col6': {0: 1, 1: 2, 2: 3, 3: 4},
'col7': {0: nan, 1: 2.0, 2: 3.0, 3: 4.0},
'col8': {0: 1.0, 1: 2.0, 2: nan, 3: 4.0},
'col9': {0: 1, 1: 2, 2: 3, 3: 4},
'col10': {0: 1.0, 1: 2.0, 2: nan, 3: 4.0},
'col11': {0: 1.0, 1: nan, 2: 3.0, 3: 4.0},
'col12': {0: 1, 1: 2, 2: 3, 3: 4},
'col13': {0: 1.0, 1: 2.0, 2: nan, 3: 4.0},
'col14': {0: 1.0, 1: nan, 2: 3.0, 3: 4.0},
'col15': {0: 1, 1: 2, 2: 3, 3: 4},
'col16': {0: 1.0, 1: nan, 2: 3.0, 3: nan},
'col17': {0: 1.0, 1: 2.0, 2: nan, 3: 4.0},
'col18': {0: 1, 1: 2, 2: 3, 3: 4},
'col19': {0: 1.0, 1: 2.0, 2: nan, 3: 4.0},
'col20': {0: 1, 1: 2, 2: 3, 3: 4}}
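
A compact sketch of the same idea on the first four columns of the initial frame. It swaps the list comprehension for Index.difference, which is one possible way to build the column list and not the method used above.

```python
import numpy as np
import pandas as pd

# First four columns of the initial frame shown above.
df = pd.DataFrame({
    "col1": [1.0, np.nan, np.nan, 4.0],
    "col2": [1.0, np.nan, 3.0, 4.0],
    "col3": [1.0, np.nan, 3.0, 4.0],
    "col4": [1.0, 2.0, np.nan, np.nan],
})

# Every column except the two to skip (difference returns the labels sorted).
f = df.columns.difference(["col1", "col2"])

# fillna with a Series of means fills each column with its own mean.
df[f] = df[f].fillna(df[f].mean())
print(df)
```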

