Dropping Infinite Values from Dataframes in Pandas

Dropping infinite values from dataframes in pandas?

First replace() infs with NaN:

df.replace([np.inf, -np.inf], np.nan, inplace=True)

and then drop NaNs via dropna():

df.dropna(subset=["col1", "col2"], how="all", inplace=True)

For example:

>>> df = pd.DataFrame({"col1": [1, np.inf, -np.inf], "col2": [2, 3, np.nan]})
>>> df
   col1  col2
0   1.0   2.0
1   inf   3.0
2  -inf   NaN

>>> df.replace([np.inf, -np.inf], np.nan, inplace=True)
>>> df
   col1  col2
0   1.0   2.0
1   NaN   3.0
2   NaN   NaN

>>> df.dropna(subset=["col1", "col2"], how="all", inplace=True)
>>> df
   col1  col2
0   1.0   2.0
1   NaN   3.0

The same method also works for Series.
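
As a minimal sketch of the Series case (the series `s` and its values are made up for illustration), the same replace-then-drop chain looks like this. Note that Series.dropna has no subset argument, so it is simply omitted:

```python
import numpy as np
import pandas as pd

# Hypothetical example data: finite values mixed with both infinities.
s = pd.Series([1.0, np.inf, -np.inf, 4.0])

# Replace +/-inf with NaN, then drop the NaNs in one chain.
cleaned = s.replace([np.inf, -np.inf], np.nan).dropna()

print(cleaned.tolist())
```

The original index labels are preserved, so call reset_index(drop=True) afterwards if a contiguous index is needed.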

Python pandas: how to remove nan and -inf values

Use pd.DataFrame.isin to flag the unwanted values, then pd.DataFrame.any with axis=1 to find the rows that contain at least one of them. Finally, use the negated boolean mask to slice the dataframe.

df[~df.isin([np.nan, np.inf, -np.inf]).any(axis=1)]

             time    X    Y  X_t0     X_tp0   X_t1     X_tp1   X_t2     X_tp2
4        0.037389    3   10     3  0.333333    2.0  0.500000    1.0  1.000000
5        0.037393    4   10     4  0.250000    3.0  0.333333    2.0  0.500000
1030308  9.962213  256  268   256  0.000000  256.0  0.003906  255.0  0.003922
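
To see the intermediate mask, here is a small self-contained sketch of the same idea. The frame and its column names are invented for illustration and are not the asker's data:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: only row 0 is free of NaN and +/-inf.
df = pd.DataFrame({
    "X": [1.0, np.inf, 3.0, 4.0],
    "Y": [10.0, 20.0, np.nan, -np.inf],
})

# True wherever a cell is NaN, +inf or -inf.
bad = df.isin([np.nan, np.inf, -np.inf])

# Keep the rows in which no cell was flagged.
clean = df[~bad.any(axis=1)]
print(clean)
```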

Remove nan, +inf, -inf values columns from a dataframe

First replace() inf and -inf with nan:

df = pd.DataFrame({'a':[1,2,3], 'b':[4,np.nan,6], 'c':[7,8,np.inf]})
df = df.replace([np.inf, -np.inf], np.nan)

#    a    b    c
# 0  1  4.0  7.0
# 1  2  NaN  8.0
# 2  3  6.0  NaN

Then use the axis param of dropna() to switch between row- and column-based behavior:

df.dropna() # default axis=0 is row-based

#    a    b    c
# 0  1  4.0  7.0

df.dropna(axis=1) # axis=1 or axis='columns' is column-based

#    a
# 0  1
# 1  2
# 2  3

Replace all inf, -inf values with NaN in a pandas dataframe

TL;DR

df.replace is fastest for replacing ±inf
but you can avoid replacing altogether by just setting mode.use_inf_as_na

Replacing `inf` and `-inf`

df = df.replace([np.inf, -np.inf], np.nan)

Note that inplace=True is possible but not recommended, and it may be deprecated in a future release.

Slower df.applymap options (in pandas 2.1 and later, applymap is deprecated in favor of the equivalent DataFrame.map):

df = df.applymap(lambda x: np.nan if x in [np.inf, -np.inf] else x)
df = df.applymap(lambda x: np.nan if np.isinf(x) else x)
df = df.applymap(lambda x: x if np.isfinite(x) else np.nan)
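
A vectorized alternative, which is not part of the original answer, is to mask non-finite entries with DataFrame.where and np.isfinite. This is only a sketch and assumes every column is numeric, since np.isfinite fails on object or string columns. The frame below is made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical all-numeric frame.
df = pd.DataFrame({"a": [1.0, np.inf, 3.0], "b": [-np.inf, 5.0, np.nan]})

# where() keeps cells whose mask is True and puts NaN everywhere else.
masked = df.where(np.isfinite(df))

# Same result as the replace-based approach.
replaced = df.replace([np.inf, -np.inf], np.nan)
print(masked.equals(replaced))
```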

Setting `mode.use_inf_as_na`

Note that we don't actually have to modify df at all. Setting mode.use_inf_as_na simply changes the way inf and -inf are interpreted:

True means None, nan, -inf and inf are all treated as null

False means None and nan are null, but inf and -inf are not (default)

This option is deprecated as of pandas 2.1, so in new code prefer the explicit replace shown above.

Either enable globally

pd.set_option('mode.use_inf_as_na', True)

Or locally via context manager

with pd.option_context('mode.use_inf_as_na', True):
    ...
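
Where the option is unavailable or a global setting is undesirable, a similar null check can be sketched without changing any pandas options. The helper name `isna_with_inf` and the sample frame are hypothetical, chosen only for this illustration:

```python
import numpy as np
import pandas as pd

def isna_with_inf(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: flag None/NaN and +/-inf as null, leaving df unchanged."""
    return df.isna() | df.isin([np.inf, -np.inf])

df = pd.DataFrame({
    "a": [1.0, np.inf, None, 4.0],
    "b": [5.0, 2.0, 3.0, -np.inf],
})

null_mask = isna_with_inf(df)
print(null_mask)

# Rows with no null-like cell; df itself still holds the original inf values.
kept = df[~null_mask.any(axis=1)]
print(kept)
```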

How do you detect and delete infinite values from a time series in a pandas dataframe?

You can filter out infinite values by comparing against numpy.inf. Note that this comparison removes only +inf, so -inf needs a separate check:

import numpy as np
perc_df[perc_df.variable != np.inf].variable.mean()
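
A slightly fuller sketch that both detects and removes infinities of either sign uses np.isinf. The frame below is a hypothetical stand-in for perc_df with a made-up daily index:

```python
import numpy as np
import pandas as pd

# Hypothetical time-indexed frame with a "variable" column.
idx = pd.date_range("2020-01-01", periods=5, freq="D")
perc_df = pd.DataFrame({"variable": [0.1, np.inf, 0.3, -np.inf, 0.2]}, index=idx)

# Detect: boolean mask of +/-inf entries and the timestamps where they occur.
inf_mask = np.isinf(perc_df["variable"])
inf_times = perc_df[inf_mask].index
print(inf_times)

# Delete: keep only the non-infinite rows, then aggregate.
finite_df = perc_df[~inf_mask]
mean_value = finite_df["variable"].mean()
print(mean_value)
```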

Python Pandas: For Loop to drop rows from dataframes where values are the same in before/after cases

Make a list containing the dataframes and iterate:

df_list = [df0, df1]  # your list of DataFrames

for df in df_list:
    new_df = df[df['before'] != df['after']]

You can then append new_df to a new list, or do whatever else you need with it.
If all your dfs are in a dictionary, you can iterate over its items instead:

df_dict = {key0: df0, key1: df1}  # ... and so on
for key, df in df_dict.items():
    new_df = df[df['before'] != df['after']]

or, less Pythonic, iterate over the keys and index into the dictionary:

for key in df_dict.keys():
    df = df_dict[key]
    new_df = df[df['before'] != df['after']]

You can even convert your dictionary values to a list and use the first method:

df_list = list(df_dict.values())
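
Since the loops above overwrite new_df on every pass, here is a minimal sketch that keeps every filtered frame by collecting them in a dict comprehension. The two input frames and their keys are invented for illustration:

```python
import pandas as pd

# Hypothetical input frames with 'before' and 'after' columns.
df0 = pd.DataFrame({"before": [1, 2, 3], "after": [1, 5, 3]})
df1 = pd.DataFrame({"before": [7, 8], "after": [7, 8]})
df_dict = {"key0": df0, "key1": df1}

# Keep only the rows whose value changed, one filtered frame per key.
changed = {key: df[df["before"] != df["after"]] for key, df in df_dict.items()}

for key, new_df in changed.items():
    print(key, len(new_df))
```

The originals in df_dict are left untouched because boolean indexing returns a new frame.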

Replacing -inf values to np.nan in a feature pandas.series

The problem may be that you are not assigning back to the original series.

Note that pd.Series.replace is not an in-place operation by default. The below code is a minimal example.

df = pd.DataFrame({'feature': [1, 2, -np.inf, 3, 4]})

df['feature'] = df['feature'].replace(-np.inf, np.nan)

print(df)

#    feature
# 0      1.0
# 1      2.0
# 2      NaN
# 3      3.0
# 4      4.0
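
To make the pitfall concrete, this small sketch (with a made-up series) shows that calling replace without assigning the result leaves the data unchanged:

```python
import numpy as np
import pandas as pd

# Hypothetical feature values containing -inf.
s = pd.Series([1.0, -np.inf, 3.0])

# The result is discarded here, so s still contains -inf.
s.replace(-np.inf, np.nan)
still_has_inf = bool(np.isinf(s).any())

# Assigning the result back is what actually updates s.
s = s.replace(-np.inf, np.nan)
print(still_has_inf, bool(np.isinf(s).any()))
```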

Bug: impossible to delete infinite values from DataFrame

Your question is similar to "Dropping infinite values from dataframes in pandas?". Did you try:

df.replace([np.inf, -np.inf], np.nan).dropna(subset=["col1", "col2"], how="all")

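Note that np.nan is not considered finite either, so a check with np.isfinite still fails after this replacement. If you need that check to pass, replace the infinite values with a finite number instead. For example: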

import pandas as pd
import numpy as np

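# Build the frame with float columns so that np.isfinite works on it
df = pd.DataFrame([[1.0, np.inf, -np.inf]], columns=list('ABC'))
print(df)

print(np.all(np.isfinite(df)))

# Replacing inf with NaN: the values are still not finite
df_nan = df.replace([np.inf, -np.inf], np.nan).dropna(subset=df.columns, how="all")
print(df_nan)

print(np.all(np.isfinite(df_nan)))

# Replacing inf with a finite number: every value is now finite
df_0 = df.replace([np.inf, -np.inf], 0).dropna(subset=df.columns, how="all")
print(df_0)

print(np.all(np.isfinite(df_0)))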

Result:

     A    B    C
0  1.0  inf -inf
False
     A   B   C
0  1.0 NaN NaN
False
     A    B    C
0  1.0  0.0  0.0
True
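
If the goal is instead to drop every row that contains any non-finite value (NaN included), a row mask built from np.isfinite is one possible sketch. It assumes all columns are numeric, and the sample frame is made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical all-numeric frame: only row 1 is fully finite.
df = pd.DataFrame({
    "A": [1.0, 2.0, 3.0],
    "B": [np.inf, 5.0, np.nan],
    "C": [-np.inf, 6.0, 7.0],
})

# True for rows in which every cell is finite (neither NaN nor +/-inf).
finite_rows = np.isfinite(df).all(axis=1)
df_finite = df[finite_rows]
print(df_finite)

all_finite = bool(np.isfinite(df_finite.to_numpy()).all())
print(all_finite)
```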
