Dropping infinite values from dataframes in pandas?
First, replace() infs with NaN:
df.replace([np.inf, -np.inf], np.nan, inplace=True)
and then drop NaNs via dropna():
df.dropna(subset=["col1", "col2"], how="all", inplace=True)
For example:
>>> df = pd.DataFrame({"col1": [1, np.inf, -np.inf], "col2": [2, 3, np.nan]})
>>> df
   col1  col2
0   1.0   2.0
1   inf   3.0
2  -inf   NaN
>>> df.replace([np.inf, -np.inf], np.nan, inplace=True)
>>> df
   col1  col2
0   1.0   2.0
1   NaN   3.0
2   NaN   NaN
>>> df.dropna(subset=["col1", "col2"], how="all", inplace=True)
>>> df
   col1  col2
0   1.0   2.0
1   NaN   3.0
The same method also works for Series.
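As a rough illustration of the Series case, here is a minimal sketch. The Series s and its values are made up for this example, and chaining is used instead of inplace=True:

```python
import numpy as np
import pandas as pd

# Illustrative Series (assumed data) containing both infinities and a NaN.
s = pd.Series([1.0, np.inf, -np.inf, np.nan, 5.0])

# Map +/-inf to NaN, then drop every NaN in one chained expression.
clean = s.replace([np.inf, -np.inf], np.nan).dropna()

print(clean.tolist())  # [1.0, 5.0]
```

The original index labels (0 and 4 here) are kept; call reset_index(drop=True) if a fresh index is wanted.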
Python pandas: how to remove nan and -inf values
Use pd.DataFrame.isin and check for rows that have any match with pd.DataFrame.any. Finally, use the boolean array to slice the dataframe.
df[~df.isin([np.nan, np.inf, -np.inf]).any(axis=1)]
             time    X    Y  X_t0     X_tp0   X_t1     X_tp1   X_t2     X_tp2
4        0.037389    3   10     3  0.333333    2.0  0.500000    1.0  1.000000
5        0.037393    4   10     4  0.250000    3.0  0.333333    2.0  0.500000
1030308  9.962213  256  268   256  0.000000  256.0  0.003906  255.0  0.003922
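To make the intermediate steps visible, the same idea can be sketched on a tiny frame. The column names X and Y and all values below are assumed for illustration and are not the asker's data:

```python
import numpy as np
import pandas as pd

# Small assumed frame: only row 0 is free of NaN and +/-inf.
df = pd.DataFrame({
    "X": [1.0, np.inf, 3.0, 4.0],
    "Y": [0.5, 2.0, np.nan, -np.inf],
})

# Element-wise membership test; pandas' isin also matches NaN against NaN.
bad = df.isin([np.nan, np.inf, -np.inf])

# A row is bad if any of its cells is bad.
bad_rows = bad.any(axis=1)

# Keep only the rows that are not flagged.
kept = df[~bad_rows]
print(kept.index.tolist())  # [0]
```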
Remove nan, +inf, -inf values columns from a dataframe
First, replace() inf and -inf with nan:
df = pd.DataFrame({'a':[1,2,3], 'b':[4,np.nan,6], 'c':[7,8,np.inf]})
df = df.replace([np.inf, -np.inf], np.nan)
#    a    b    c
# 0  1  4.0  7.0
# 1  2  NaN  8.0
# 2  3  6.0  NaN
Then use the axis param of dropna() to switch between row- and column-based behavior:
df.dropna()  # default axis=0 is row-based
#    a    b    c
# 0  1  4.0  7.0
df.dropna(axis=1)  # axis=1 or axis='columns' is column-based
#    a
# 0  1
# 1  2
# 2  3
Replace all inf, -inf values with NaN in a pandas dataframe
TL;DR: df.replace is fastest for replacing ±inf, but you can avoid replacing altogether by just setting mode.use_inf_as_na.
Replacing inf and -inf
df = df.replace([np.inf, -np.inf], np.nan)
Note that inplace=True is possible but not recommended and may be deprecated in the future.
Slower df.applymap options (in pandas 2.1 and later, applymap is deprecated in favor of df.map):
df = df.applymap(lambda x: np.nan if x in [np.inf, -np.inf] else x)
df = df.applymap(lambda x: np.nan if np.isinf(x) else x)
df = df.applymap(lambda x: x if np.isfinite(x) else np.nan)
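For comparison, a vectorized alternative to the element-wise lambdas is df.mask with np.isinf. This is a different technique from the ones listed above, shown only as a sketch. It assumes every column is numeric, and the frame below is made up for illustration:

```python
import numpy as np
import pandas as pd

# Assumed all-numeric frame containing both infinities.
df = pd.DataFrame({"a": [1.0, np.inf, 3.0], "b": [-np.inf, 5.0, 6.0]})

# Approach from the answer: replace the two infinite values with NaN.
via_replace = df.replace([np.inf, -np.inf], np.nan)

# Vectorized alternative: mask() sets NaN wherever the condition is True.
via_mask = df.mask(np.isinf(df))

# Both approaches produce the same frame (equals treats aligned NaNs as equal).
print(via_replace.equals(via_mask))  # True
```

np.isinf raises a TypeError on object or string columns, so this variant only suits numeric data.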
Setting mode.use_inf_as_na
Note that we don't actually have to modify df at all. Setting mode.use_inf_as_na simply changes the way inf and -inf are interpreted:
True means treat None, nan, -inf, inf as null
False means None and nan are null, but inf and -inf are not null (default)
Note that this option is deprecated as of pandas 2.1, so newer code should prefer replacing inf explicitly.
Either enable globally:
pd.set_option('mode.use_inf_as_na', True)
Or locally via context manager:
with pd.option_context('mode.use_inf_as_na', True):
    ...
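Where relying on the option is not desirable, a similar null check can be sketched without it. The helper name isna_or_inf and the sample frame are hypothetical, not part of pandas or of the original answer:

```python
import numpy as np
import pandas as pd

def isna_or_inf(df):
    """Hypothetical helper: True where a cell is None/NaN or +/-inf."""
    # isna() flags None and NaN; isin() flags the two infinities.
    return df.isna() | df.isin([np.inf, -np.inf])

# Assumed sample frame with a numeric and a string column.
df = pd.DataFrame({"a": [1.0, np.inf, np.nan], "b": ["x", None, "y"]})

null_mask = isna_or_inf(df)
print(null_mask["a"].tolist())  # [False, True, True]
print(null_mask["b"].tolist())  # [False, True, False]

# Example use: keep only rows without any null-like value.
complete = df[~null_mask.any(axis=1)]
```

Unlike the option, this does not change how dropna() or other pandas methods behave; it only builds a mask that you apply yourself.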
How do you detect and delete infinite values from a time series in a pandas dataframe?
You can filter out the infinite values by comparing against numpy.inf. The code is as follows:
import numpy as np
perc_df[perc_df.variable != np.inf].variable.mean()
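Note that comparing against np.inf only excludes positive infinity. A sketch that also excludes -inf (and NaN) can use np.isfinite. The contents of perc_df below are assumed for illustration, since the real data is not shown:

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for perc_df (assumed values).
perc_df = pd.DataFrame({"variable": [1.0, np.inf, 3.0, -np.inf, np.nan]})

# np.isfinite is False for inf, -inf and NaN, so only 1.0 and 3.0 remain.
finite = perc_df.loc[np.isfinite(perc_df["variable"]), "variable"]
finite_mean = finite.mean()

print(finite_mean)  # 2.0
```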
Python Pandas: For Loop to drop rows from dataframes where values are the same in before/after cases
Make a list containing the dataframes and iterate:
df_list = [df0, df1]  # list of your dfs
for df in df_list:
    new_df = df[df['before'] != df['after']]
Then you can append new_df to a new list, or do whatever else you want with it.
If all your dfs are in a dictionary, you can iterate over its items in the same way:
df_dict = {key0: df0, key1: df1}  # ... and so on
for key, df in df_dict.items():
    new_df = df[df['before'] != df['after']]
or, less pythonically, iterate over the keys and index into the dictionary:
for key in df_dict.keys():
    df = df_dict[key]
    new_df = df[df['before'] != df['after']]
You can even convert your dictionary values to a list and use the first method:
df_list = list(df_dict.values())
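Putting the dictionary variant together, a runnable sketch might look like this. The keys, the frames and the result name changed are all made up for illustration:

```python
import pandas as pd

# Assumed example frames with 'before' and 'after' columns.
df0 = pd.DataFrame({"before": [1, 2, 3], "after": [1, 5, 3]})
df1 = pd.DataFrame({"before": [7, 8], "after": [9, 8]})
df_dict = {"first": df0, "second": df1}

# Keep, per frame, only the rows whose value actually changed.
changed = {
    key: df[df["before"] != df["after"]]
    for key, df in df_dict.items()
}

print(changed["first"].index.tolist())   # [1]
print(changed["second"].index.tolist())  # [0]
```

Collecting the filtered frames in a new dictionary avoids overwriting new_df on every pass of the loop.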
Replacing -inf values to np.nan in a feature pandas.series
The problem may be that you are not assigning back to the original series. Note that pd.Series.replace is not an in-place operation by default. The code below is a minimal example.
df = pd.DataFrame({'feature': [1, 2, -np.inf, 3, 4]})
df['feature'] = df['feature'].replace(-np.inf, np.nan)
print(df)
#    feature
# 0      1.0
# 1      2.0
# 2      NaN
# 3      3.0
# 4      4.0
Bug: impossible to delete infinite values from DataFrame
Your question is similar to "Dropping infinite values from dataframes in pandas?". Did you try:
df.replace([np.inf, -np.inf], np.nan).dropna(subset=["col1", "col2"], how="all")
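np.nan is not considered finite, so a check with np.isfinite still fails after the infs are replaced with np.nan. You may use any finite number in place of np.nan as the replacement value. See this code for example: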
import pandas as pd
import numpy as np
df = pd.DataFrame(columns=list('ABC'))
df.loc[0] = [1, np.inf, -np.inf]
print(df)
print(np.all(np.isfinite(df)))
# Replacing inf with NaN: the row is kept, but NaN is still not finite
df_nan = df.replace([np.inf, -np.inf], np.nan).dropna(subset=df.columns, how="all")
print(df_nan)
print(np.all(np.isfinite(df_nan)))
# Replacing inf with a finite number (0) makes every value finite
df_0 = df.replace([np.inf, -np.inf], 0).dropna(subset=df.columns, how="all")
print(df_0)
print(np.all(np.isfinite(df_0)))
Result:
     A    B    C
0  1.0  inf -inf
False
     A   B   C
0  1.0 NaN NaN
False
     A    B    C
0  1.0  0.0  0.0
True
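If the goal is to drop the offending rows rather than substitute a value, np.isfinite can also serve directly as a row filter. This is a sketch under the assumption that all columns are numeric, with values made up for illustration:

```python
import numpy as np
import pandas as pd

# Assumed numeric frame: only row 1 is entirely finite.
df = pd.DataFrame({
    "A": [1.0, 2.0, 3.0],
    "B": [np.inf, 5.0, np.nan],
    "C": [-np.inf, 6.0, 7.0],
})

# True for rows in which every value is finite (no NaN, inf or -inf).
finite_rows = np.isfinite(df).all(axis=1)
df_finite = df[finite_rows]

print(df_finite.index.tolist())  # [1]
```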