How to Replace NaN Values by Zeroes in a Column of a Pandas DataFrame

How to replace NaN values by Zeroes in a column of a Pandas Dataframe?

I believe DataFrame.fillna() will do this for you.

See the docs for DataFrame.fillna and for Series.fillna.

Example:

In [7]: df
Out[7]:
          0         1
0       NaN       NaN
1 -0.494375  0.570994
2       NaN       NaN
3  1.876360 -0.229738
4       NaN       NaN

In [8]: df.fillna(0)
Out[8]:
          0         1
0  0.000000  0.000000
1 -0.494375  0.570994
2  0.000000  0.000000
3  1.876360 -0.229738
4  0.000000  0.000000

To fill the NaNs in only one column, select just that column. In this case I'm using inplace=True to actually change the contents of df. Note that inplace=True returns None, so nothing is echoed, and that recent pandas versions can warn about this chained form or leave df unchanged (see the EDIT below).

In [12]: df[1].fillna(0, inplace=True)

In [13]: df
Out[13]:
          0         1
0       NaN  0.000000
1 -0.494375  0.570994
2       NaN  0.000000
3  1.876360 -0.229738
4       NaN  0.000000

EDIT:

To avoid a SettingWithCopyWarning, use the built-in column-specific functionality, passing a dict that maps each column label to its fill value:

df.fillna({1:0}, inplace=True)
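
As a minimal, self-contained sketch of the column-specific fill, the snippet below builds a small frame with the same integer column labels as the example (the data values are made up for illustration) and shows two forms that should give the same result: the dict form and plain assignment of the filled column.

```python
import numpy as np
import pandas as pd

# Hypothetical data shaped like the example above (integer column labels 0 and 1).
df = pd.DataFrame({0: [np.nan, -0.494375, np.nan],
                   1: [np.nan, 0.570994, np.nan]})

# Dict form: keys are column labels, values are the fill value for that column.
filled = df.fillna({1: 0})

# Assignment form: fill the selected column and write it back to df.
df[1] = df[1].fillna(0)
```

Both forms leave column 0 untouched. The assignment form avoids calling an inplace method on a selected column.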

Function to replace all NaN values with zero:

Use a boolean mask.

Suppose the following dataframe:

>>> df
     A  B    C
0  0.0  1  2.0
1  NaN  4  5.0  # <- NaN should be replaced by 0.1
2  6.0  7  NaN  # <- NaN should be replaced by 0

m1 = df.isna().any()  # Is there a NaN in each column? (not mandatory)
m2 = df.eq(0).any()   # Is there a 0 in each column?

# Replace by 0
df.update(df.loc[:, m1 & ~m2].fillna(0))

# Replace by 0.1
df.update(df.loc[:, m1 & m2].fillna(0.1))

Only the second mask (m2) is strictly needed: fillna leaves columns without NaN unchanged, so m1 is optional.

Output result:

>>> df
     A  B    C
0  0.0  1  2.0
1  0.1  4  5.0
2  6.0  7  0.0
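
The same rule can also be written without df.update, by turning the "column already contains a 0" mask into a per-column fill value and passing that dict to fillna. This is a sketch of an alternative, not the approach used above; the names has_zero, fill_values and result are illustrative.

```python
import numpy as np
import pandas as pd

# Same data as the example above.
df = pd.DataFrame({"A": [0.0, np.nan, 6.0],
                   "B": [1, 4, 7],
                   "C": [2.0, 5.0, np.nan]})

# Per column: does it already contain a 0?
has_zero = df.eq(0).any()

# Fill with 0.1 where a 0 already exists in the column, otherwise with 0.
fill_values = {col: (0.1 if flag else 0) for col, flag in has_zero.items()}

result = df.fillna(fill_values)
```

Because fillna only touches missing cells, column B (which has no NaN) is returned with its values unchanged.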

I want to replace NaN values with 0 but am not able to with the code below

In your code you passed to_replace="NaN".

Note that what you actually passed here is a string containing just these 3 letters, not a missing value.

In Pandas you can pass np.nan as the value to be assigned
to a cell in a DataFrame. The same pertains to a Numpy array.

Finding NaN cells is different: under the ordinary comparison rules
one np.nan is NOT equal to another np.nan, so an equality test never
matches them. Use methods built for missing values, such as isna()
or fillna(), instead.

One possible solution is to run:

df2 = df2.where(~df2.isna(), 0)

Another, simpler solution, as richardec suggested, is to use fillna,
but the argument should be 0 (zero), not "o" (a char):

df2 = df2.fillna(0)
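
To make the difference concrete, here is a small sketch on made-up data (the column name x and the values are assumptions, not taken from the question). It shows that replacing the string "NaN" leaves the missing cell alone, while both where and fillna fill it.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one real missing value.
df2 = pd.DataFrame({"x": [1.0, np.nan, 3.0]})

# NaN never compares equal to itself, so equality tests cannot find it.
nan_equals_itself = (np.nan == np.nan)

# Replacing the *string* "NaN" does nothing: the cell holds a float NaN, not text.
unchanged = df2.replace(to_replace="NaN", value=0)

# Both of these fill the missing cell with 0.
via_where = df2.where(~df2.isna(), 0)
via_fillna = df2.fillna(0)
```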

Replacing nan values in a Pandas data frame with lists

You have to handle the three cases (empty string, NaN, NaN in list) separately.

For the NaN in list you need to loop over each occurrence and replace the elements one by one.

NB. applymap is slow, so if you know in advance which columns contain lists, you can apply it to that subset only.

For the empty strings, replace them with NaN, then use fillna.

sub = 'X'
(df.applymap(lambda x: [sub if (pd.isna(e) or e=='') else e
                        for e in x]
             if isinstance(x, list) else x)
   .replace('', float('nan'))
   .fillna(sub)
)

Output:

  col1  col2       col3    col4
0    X  Jhon  [X, 1, 2]  [k, j]
1  1.0     X  [1, 1, 5]       3
2  2.0     X          X       X
3  3.0  Samy  [1, 1, X]  [b, X]

Used input:

import pandas as pd
from numpy import nan

df = pd.DataFrame({'col1': {0: nan, 1: 1.0, 2: 2.0, 3: 3.0},
                   'col2': {0: 'Jhon', 1: nan, 2: '', 3: 'Samy'},
                   'col3': {0: [nan, 1, 2], 1: [1, 1, 5], 2: nan, 3: [1, 1, nan]},
                   'col4': {0: ['k', 'j'], 1: '3', 2: nan, 3: ['b', '']}})
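
If the list-bearing columns are known in advance, the element-wise step can be limited to them. The sketch below covers only that step (replacing NaN and empty strings inside lists) on a reduced, made-up version of the input, using Series.map per column rather than applymap; fill_list, list_cols and out are illustrative names.

```python
import numpy as np
import pandas as pd

sub = "X"

# Reduced, hypothetical version of the input: only col3 and col4 hold lists.
df = pd.DataFrame({"col1": [np.nan, 1.0],
                   "col3": [[np.nan, 1, 2], np.nan],
                   "col4": [["k", "j"], ["b", ""]]})

def fill_list(x):
    """Replace NaN or '' inside a list; return non-list values unchanged."""
    if isinstance(x, list):
        return [sub if (pd.isna(e) or e == "") else e for e in x]
    return x

list_cols = ["col3", "col4"]  # assumed to be known in advance

out = df.copy()
for col in list_cols:
    out[col] = out[col].map(fill_list)
```

Cells that are NaN rather than lists are left as they are at this stage and would still need the replace/fillna step shown in the answer.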

Pandas replace NaN values with zeros after pivot operation

I think the problem is that the NaN values are strings, so they cannot be replaced. First try converting the values to numeric:

df['Rain (mm)'] = pd.to_numeric(df['Rain (mm)'], errors='coerce')

df = df.pivot_table(index=['Month', 'Day'], columns='Year',
                    values='Rain (mm)', aggfunc='first').fillna(0)
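
A small end-to-end sketch of the same two steps, on invented rainfall rows (the numbers are placeholders, and the rain column is deliberately stored as strings to mimic the problem):

```python
import pandas as pd

# Hypothetical input: the rain column holds strings, including the text "NaN".
df = pd.DataFrame({
    "Year": [2020, 2020, 2021, 2021],
    "Month": [1, 1, 1, 1],
    "Day": [1, 2, 1, 2],
    "Rain (mm)": ["1.5", "NaN", "0.3", "0.7"],
})

# Convert text to numbers; anything unparseable becomes a real NaN.
df["Rain (mm)"] = pd.to_numeric(df["Rain (mm)"], errors="coerce")

# Pivot to one column per year, then fill the remaining gaps with 0.
wide = df.pivot_table(index=["Month", "Day"], columns="Year",
                      values="Rain (mm)", aggfunc="first").fillna(0)
```

After the conversion the column is float64, so the final fillna(0) operates on genuine missing values rather than on text.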

Replace null values in pandas data frame column with 2D np.zeros() array

This is caused by the object data type. There is a way to do it with fillna (here z is the 2D np.zeros() array):

df.val.fillna(dict(zip(df.index[df['val'].isnull()],[z]*df['val'].isnull().sum())),inplace=True)
df
val
0 [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...
1 2.0
2 3.0
3 [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...
4 5.0
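
For comparison, here is a more explicit sketch that does not rely on fillna at all: it builds an object array cell by cell and assigns it back. This is a swapped-in technique rather than the approach above, and the shape of z and the sample values are assumptions.

```python
import numpy as np
import pandas as pd

z = np.zeros((2, 3))  # hypothetical shape for the replacement array
df = pd.DataFrame({"val": [np.nan, 2.0, 3.0, np.nan, 5.0]})

# Object array so that each cell can hold either a scalar or a whole 2D array.
filled = np.empty(len(df), dtype=object)
for i, v in enumerate(df["val"]):
    # Give each missing cell its own copy of z so the cells do not share memory.
    filled[i] = z.copy() if pd.isna(v) else v

df["val"] = filled
```

The column ends up with object dtype, which is what allows arrays and floats to sit side by side.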

How to conditionally replace NaN values in a dataframe?

fillna can take a series to replace NaN values with. Non-NaN values are left untouched.

Replace the month numbers with the values from your dictionary with map, then pass the result to fillna:

df["WL1"] = df.WL1.fillna(df.Month.map(dictionary["WL1"]))
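
A runnable sketch of that pattern, with a made-up dictionary and frame (only the names df, dictionary, Month and WL1 come from the answer; the values are placeholders):

```python
import numpy as np
import pandas as pd

# Hypothetical lookup: month number -> replacement value for column WL1.
dictionary = {"WL1": {1: 10.0, 2: 20.0}}

df = pd.DataFrame({"Month": [1, 2, 1],
                   "WL1": [np.nan, 5.0, np.nan]})

# Series of candidate replacements, aligned with df on the index.
replacement = df["Month"].map(dictionary["WL1"])

# Only the NaN cells take their value from `replacement`.
df["WL1"] = df["WL1"].fillna(replacement)
```

The existing value 5.0 in the second row is kept, even though the dictionary has an entry for month 2.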

Replace 0 with NaN for selected columns only if all values are 0 in Pandas

Use mask:

df[cols] = df[cols].mask(df[cols].eq(0).all(axis=1))

mask automatically sets the row to NaN if the condition (df[cols].eq(0).all(axis=1)) is True.
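
A self-contained sketch of this on a small invented frame (the column names follow the answer, the values are placeholders):

```python
import pandas as pd

df = pd.DataFrame({"id": [0, 1, 2],
                   "value1": [22, 0, 0],
                   "value2": [1, 0, 0],
                   "value3": [7, 0, 3]})
cols = ["value1", "value2", "value3"]

# True for rows where every selected column equals 0.
all_zero = df[cols].eq(0).all(axis=1)

# Blank out those rows in the selected columns only; id is untouched.
df[cols] = df[cols].mask(all_zero)
```

The row with id 2 keeps its zeros because value3 is non-zero, so the all-zero condition is False for it.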

Original answer:

I'd prefer mask:

>>> df.set_index('id').mask(df[cols].eq(0).all(axis=1))
    value1  value2  value3
id
0     22.0     1.0     7.0
1      NaN     NaN     NaN
2      NaN     NaN     NaN
3      4.0     1.0    25.0
4      5.0     0.0    24.0
5      0.0     0.0     3.0
>>>

With resetting index:

>>> df.set_index('id').mask(df[cols].eq(0).all(axis=1)).reset_index()
   id  value1  value2  value3
0   0    22.0     1.0     7.0
1   1     NaN     NaN     NaN
2   2     NaN     NaN     NaN
3   3     4.0     1.0    25.0
4   4     5.0     0.0    24.0
5   5     0.0     0.0     3.0
>>>

