How to replace negative numbers in Pandas Data Frame by zero
If all your columns are numeric, you can use boolean indexing:
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1]})
In [3]: df
Out[3]:
a b
0 0 -3
1 -1 2
2 2 1
In [4]: df[df < 0] = 0
In [5]: df
Out[5]:
a b
0 0 0
1 0 2
2 2 1
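As a minimal, self-contained sketch of the same idea outside an IPython session: the boolean-indexing assignment modifies the frame in place, while `DataFrame.clip(lower=0)` (a different technique, not used in the answer above) returns a new frame with the same result for all-numeric data. The variable names `in_place` and `clipped` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1]})

# Boolean indexing: overwrite every negative cell in place.
in_place = df.copy()
in_place[in_place < 0] = 0

# Alternative: clip returns a new frame and leaves df untouched.
clipped = df.clip(lower=0)

print(in_place)
print(clipped)
```

Which one fits better probably depends on whether the original frame still needs to be kept around.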
For the more general case, this answer shows the private method _get_numeric_data:
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1],
   ...:                    'c': ['foo', 'goo', 'bar']})
In [3]: df
Out[3]:
a b c
0 0 -3 foo
1 -1 2 goo
2 2 1 bar
In [4]: num = df._get_numeric_data()
In [5]: num[num < 0] = 0
In [6]: df
Out[6]:
a b c
0 0 0 foo
1 0 2 goo
2 2 1 bar
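Since `_get_numeric_data` is private and could change between releases, a possible alternative is to pick the numeric columns with the public `DataFrame.select_dtypes` and assign the cleaned values back. This is only a sketch under that assumption, not the method from the answer above; `num_cols` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1],
                   'c': ['foo', 'goo', 'bar']})

# Labels of the numeric columns only; the string column 'c' is skipped.
num_cols = df.select_dtypes(include='number').columns

# Replace negatives in those columns and write the result back.
df[num_cols] = df[num_cols].clip(lower=0)

print(df)
```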
With timedelta type, boolean indexing seems to work on separate columns, but not on the whole DataFrame. So you can do:
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'a': pd.to_timedelta([0, -1, 2], 'd'),
...: 'b': pd.to_timedelta([-3, 2, 1], 'd')})
In [3]: df
Out[3]:
a b
0 0 days -3 days
1 -1 days 2 days
2 2 days 1 days
In [4]: for k, v in df.items():  # iteritems() was removed in pandas 2.0
   ...:     v[v < 0] = 0
   ...:
In [5]: df
Out[5]:
a b
0 0 days 0 days
1 0 days 2 days
2 2 days 1 days
Update: comparison with a pd.Timedelta works on the whole DataFrame:
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'a': pd.to_timedelta([0, -1, 2], 'd'),
...: 'b': pd.to_timedelta([-3, 2, 1], 'd')})
In [3]: df[df < pd.Timedelta(0)] = 0
In [4]: df
Out[4]:
a b
0 0 days 0 days
1 0 days 2 days
2 2 days 1 days
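A non-mutating variant of the whole-DataFrame comparison can be sketched with `DataFrame.mask`, comparing against `pd.Timedelta(0)` and using the same zero timedelta as the replacement. This is an illustration rather than the answer's own code; the names `zero` and `out` are made up here, and the unit is spelled `'D'`.

```python
import pandas as pd

df = pd.DataFrame({'a': pd.to_timedelta([0, -1, 2], unit='D'),
                   'b': pd.to_timedelta([-3, 2, 1], unit='D')})

zero = pd.Timedelta(0)

# Where the condition holds (negative duration), substitute a zero timedelta.
out = df.mask(df < zero, zero)

print(out)
```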
Replace negative values in single DataFrame column
I think you can use mask:
df_1.B=df_1.B.mask(df_1.B.lt(0),0)
df_1
Out[1437]:
A B C
2017-01-01 00:01:00 -1 4 7
2017-01-01 00:02:00 2 0 8
2017-01-02 00:01:00 3 6 -9
If we combine it with fillna (assuming different columns should be filled with different values):
df_1.mask(df_1.lt(0)).fillna({'A':9999,'B':0,'C':-9999})
Out[1440]:
A B C
2017-01-01 00:01:00 9999.0 4.0 7.0
2017-01-01 00:02:00 2.0 0.0 8.0
2017-01-02 00:01:00 3.0 6.0 -9999.0
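The two steps above can be reproduced end to end with a small frame. The original input is not shown in the answer, so the negative value in column B (`-5`) is an assumption chosen to match the printed results; the timestamps and the other values are taken from the output.

```python
import pandas as pd

idx = pd.to_datetime(['2017-01-01 00:01:00',
                      '2017-01-01 00:02:00',
                      '2017-01-02 00:01:00'])
# Hypothetical input: B's negative value is assumed, not from the source.
df_1 = pd.DataFrame({'A': [-1, 2, 3], 'B': [4, -5, 6], 'C': [7, 8, -9]},
                    index=idx)

# Single column: negatives in B become 0.
df_1['B'] = df_1['B'].mask(df_1['B'].lt(0), 0)

# Whole frame: negatives become NaN, then each column gets its own fill value.
filled = df_1.mask(df_1.lt(0)).fillna({'A': 9999, 'B': 0, 'C': -9999})

print(df_1)
print(filled)
```

The masked columns come back as floats because NaN is introduced before the fill.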
Replacing all negative values in certain columns by another value in Pandas
You can just use indexing by applying a condition statement.
cols = ['T1','T2','T3','T4']
df[df[cols] < 0] = -5
Output
In [35]: df
Out[35]:
T1 T2 T3 T4
0 20 -5 4 3
1 85 -5 34 21
2 -5 22 31 75
3 -5 5 7 -5
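In your example you're just replacing the value of the loop variable. You need to replace the cell's value itself, using the at method.
# Loop over the four columns starting at T1 (iterating a DataFrame yields column names)
for i in df.iloc[:,df.columns.get_loc("T1"):df.columns.get_loc("T1")+4]<0:
    for index, j in enumerate(df[i]):
        if j<0:
            # Write back into the DataFrame cell, not the loop variable
            df.at[index, i] = -5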
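For comparison, a vectorized sketch of the same replacement restricted to the chosen columns, written with `DataFrame.mask` and an assignment back to those columns. The input values and the extra column `other` are hypothetical, picked so that the T1 to T4 result matches the output shown above.

```python
import pandas as pd

# Hypothetical input; only the result for T1..T4 is shown in the answer.
df = pd.DataFrame({'T1': [20, 85, -1, -9],
                   'T2': [-3, -7, 22, 5],
                   'T3': [4, 34, 31, 7],
                   'T4': [3, 21, 75, -2],
                   'other': [-1, -2, -3, -4]})

cols = ['T1', 'T2', 'T3', 'T4']

# Replace negatives with -5 in the selected columns only.
df[cols] = df[cols].mask(df[cols] < 0, -5)

print(df)
```

The column `other` keeps its negative values because it is not listed in `cols`.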
Pandas DataFrame replace negative values with latest preceding positive value
You can use DataFrame.mask to convert all values < 0 to NaN, then use ffill and fillna:
df = df.mask(df.lt(0)).ffill().fillna(0).convert_dtypes()
a b c
0 1 0 0
1 1 0 0
2 0 0 0
3 3 0 4
4 3 0 5
5 2 0 3
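To make the chain easier to follow, here is a runnable sketch with a made-up input frame. The question's data is not reproduced above, so the values below are assumptions chosen to give the same output.

```python
import pandas as pd

# Hypothetical input that yields the output shown above.
df = pd.DataFrame({'a': [1, -2, 0, 3, -4, 2],
                   'b': [-1, -1, -1, -1, -1, -1],
                   'c': [-1, -2, -3, 4, 5, 3]})

out = (
    df.mask(df.lt(0))   # negatives -> NaN
      .ffill()          # carry the last non-negative value forward
      .fillna(0)        # leading NaN (nothing to carry) -> 0
      .convert_dtypes() # floats back to nullable integers
)

print(out)
```

Note that the condition is `< 0`, so an existing 0 is kept and carried forward like any other non-negative value.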
Replace negative values in pandas Series
Use pd.to_numeric + Series.lt to create a boolean mask, then use this mask to substitute 0 values in the series:
mask = pd.to_numeric(s, errors='coerce').lt(0)
s.loc[mask] = 0
Result:
val1 a
val2 b
other_val1 1
other_val2 0
other_val3 3
other_val4 0
dtype: object
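A self-contained sketch of this approach on a mixed-type Series. The index labels and non-negative values follow the result above; the two negative inputs (`-2` and `-4`) are assumptions, since the original Series is not shown.

```python
import pandas as pd

# Hypothetical input: strings and integers in one object-dtype Series.
s = pd.Series(['a', 'b', 1, -2, 3, -4],
              index=['val1', 'val2', 'other_val1', 'other_val2',
                     'other_val3', 'other_val4'])

# Non-numeric entries become NaN, and NaN < 0 is False, so they are left alone.
mask = pd.to_numeric(s, errors='coerce').lt(0)

s.loc[mask] = 0

print(s)
```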
Python Pandas: How to replace negative numbers by prior non-negative numbers?
Similar to your last question:
df['Value'] = df['Value'].where(df['Value'].ge(0)).ffill()
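A short runnable sketch of this one-liner, using made-up values for the `Value` column: `where` keeps entries that satisfy the condition and turns the rest into NaN, and `ffill` then copies the last kept value forward.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({'Value': [5, -1, -2, 3, -7]})

# Keep non-negative values, blank out the rest, then forward-fill the gaps.
df['Value'] = df['Value'].where(df['Value'].ge(0)).ffill()

print(df)
```

If the column starts with a negative number there is no prior value to carry forward, so that entry stays NaN unless it is filled separately.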