How to replace negative numbers in Pandas Data Frame by zero
If all your columns are numeric, you can use boolean indexing:
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1]})
In [3]: df
Out[3]:
a b
0 0 -3
1 -1 2
2 2 1
In [4]: df[df < 0] = 0
In [5]: df
Out[5]:
a b
0 0 0
1 0 2
2 2 1
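If you would rather not modify df in place, a minimal sketch of the same result uses DataFrame.clip, which returns a new object. The name clipped is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1]})

# clip(lower=0) returns a new DataFrame in which every value below 0 is raised to 0;
# the original df is left unchanged
clipped = df.clip(lower=0)
print(clipped)
```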
For the more general case, where some columns are not numeric, this answer shows the private method _get_numeric_data:
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1],
   ...:                    'c': ['foo', 'goo', 'bar']})
In [3]: df
Out[3]:
a b c
0 0 -3 foo
1 -1 2 goo
2 2 1 bar
In [4]: num = df._get_numeric_data()
In [5]: num[num < 0] = 0
In [6]: df
Out[6]:
a b c
0 0 0 foo
1 0 2 goo
2 2 1 bar
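Because _get_numeric_data is private and could change between pandas versions, one possible alternative is the public select_dtypes method. This is a sketch under that assumption, not the approach from the linked answer; num_cols is an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1],
                   'c': ['foo', 'goo', 'bar']})

# Find the numeric column labels through the public API
num_cols = df.select_dtypes(include='number').columns

# Clip only those columns and write the result back; column 'c' is untouched
df[num_cols] = df[num_cols].clip(lower=0)
print(df)
```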
With timedelta type, boolean indexing seems to work on separate columns, but not on the whole DataFrame. So you can do:
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'a': pd.to_timedelta([0, -1, 2], 'd'),
...: 'b': pd.to_timedelta([-3, 2, 1], 'd')})
In [3]: df
Out[3]:
a b
0 0 days -3 days
1 -1 days 2 days
2 2 days 1 days
In [4]: for k in df.columns:  # iteritems() was removed in pandas 2.0
   ...:     df.loc[df[k] < pd.Timedelta(0), k] = pd.Timedelta(0)
   ...:
In [5]: df
Out[5]:
a b
0 0 days 0 days
1 0 days 2 days
2 2 days 1 days
Update: comparison with a pd.Timedelta works on the whole DataFrame:
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'a': pd.to_timedelta([0, -1, 2], 'd'),
...: 'b': pd.to_timedelta([-3, 2, 1], 'd')})
In [3]: df[df < pd.Timedelta(0)] = pd.Timedelta(0)
In [4]: df
Out[4]:
a b
0 0 days 0 days
1 0 days 2 days
2 2 days 1 days
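A non-mutating variant of the same idea can be sketched with DataFrame.mask, which replaces values where the condition is true. The names zero and result are illustrative, and the replacement is a pd.Timedelta so the columns keep their timedelta dtype:

```python
import pandas as pd

df = pd.DataFrame({'a': pd.to_timedelta([0, -1, 2], unit='D'),
                   'b': pd.to_timedelta([-3, 2, 1], unit='D')})

zero = pd.Timedelta(0)

# Wherever the duration is negative, substitute a zero-length Timedelta
result = df.mask(df < zero, zero)
print(result)
```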
Replace dataframe column negative values with nan, in method chain
If assign counts as a method on df, you can recalculate the column b and assign it to df to replace the old column:
df = pd.DataFrame({'a': [1, 2] , 'b': [-3, 4], 'c': [5, -6]})
df.assign(b = df.b.where(df.b.ge(0)))
# a b c
#0 1 NaN 5
#1 2 4.0 -6
For better chaining behavior, you can use a lambda function with assign:
df.assign(b = lambda x: x.b.where(x.b.ge(0)))
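To see why the lambda form chains better, here is a small sketch in which an earlier step filters the rows. The extra row and the filter on a are made up for illustration; the point is that x refers to the intermediate frame rather than to the original df:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [-3, 4, -5], 'c': [5, -6, 7]})

result = (
    df
    .loc[lambda x: x.a > 1]                      # earlier step: keep rows with a > 1
    .assign(b=lambda x: x.b.where(x.b.ge(0)))    # x is the filtered frame, not df
    .reset_index(drop=True)
)
print(result)
```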
What is the fastest way to replace negative values with 0 and values greater than 1 with 1 in an array using Python?
You want to use np.clip:
>>> import numpy as np
>>> list_values = [-0.01, 0, 0.5, 0.9, 1.0, 1.01]
>>> arr = np.array(list_values)
>>> np.clip(arr, 0.0, 1.0)
array([0. , 0. , 0.5, 0.9, 1. , 1. ])
This is likely the fastest approach if you can ignore the cost of converting the list to an array, and the advantage should grow for larger lists/arrays.
Involving pandas in this operation isn't the way to go unless you eventually want a pandas data structure.
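If the data is already a NumPy array and the original values are not needed afterwards, np.clip can also write into the same array through its out argument, which avoids allocating a second array. This is a sketch; whether it is measurably faster depends on the array size:

```python
import numpy as np

arr = np.array([-0.01, 0, 0.5, 0.9, 1.0, 1.01])

# Clip in place: values below 0.0 become 0.0 and values above 1.0 become 1.0
np.clip(arr, 0.0, 1.0, out=arr)
print(arr)
```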
Replacing positive, negative, and zero values by 1, -1, and 0 respectively
There's a sign function in numpy:
df["trade_sign"] = np.sign(df["diff"])
If you want integers:
df["trade_sign"] = np.sign(df["diff"]).astype(int)
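One caveat: if the diff column contains NaN (for example when it was produced by Series.diff, whose first row is NaN), np.sign keeps the NaN and astype(int) will raise. A minimal sketch, assuming a hypothetical price column and that treating the missing difference as 0 is acceptable:

```python
import numpy as np
import pandas as pd

# 'price' is a made-up column used only to produce a 'diff' column with a NaN
df = pd.DataFrame({'price': [10.0, 12.0, 11.0, 11.0]})
df['diff'] = df['price'].diff()      # first row is NaN

# np.sign maps NaN to NaN, so fill it before converting to integers
df['trade_sign'] = np.sign(df['diff']).fillna(0).astype(int)
print(df)
```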
How to replace zeros in Pandas Data Frame by negative 1
I believe your data type is 8-bit unsigned integer (uint8). That data type cannot represent negative numbers, so a -1 wraps around to the largest such number.
df = pd.DataFrame([[0, 1], [1, 0]], dtype=np.uint8)
df.replace(0, -1)
0 1
0 255 1
1 1 255
Here 255 is the largest such number:
np.iinfo(np.uint8).max
255
Instead, convert to a signed integer type first:
df.astype(int).replace(0, -1)
0 1
0 -1 1
1 1 -1
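To confirm this diagnosis on your own data, you can check for unsigned columns before replacing. The sketch below uses illustrative names (unsigned, wrapped, fixed) and an explicit int64 target so the result does not depend on the platform's default integer size:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[0, 1], [1, 0]], dtype=np.uint8)

# dtype.kind is 'u' for unsigned integers and 'i' for signed ones
unsigned = [col for col in df.columns if df[col].dtype.kind == 'u']

# Casting -1 to uint8 wraps around to the maximum value of the type
wrapped = np.array([-1]).astype(np.uint8)[0]

# Converting to a signed type first makes -1 representable
fixed = df.astype('int64').replace(0, -1)
print(unsigned, wrapped)
print(fixed)
```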