Replacing All Negative Values in Certain Columns by Another Value in Pandas

You can use boolean indexing: build a mask from a condition on the selected columns and assign through it.

cols = ['T1','T2','T3','T4']
df[df[cols] < 0] = -5

Output

In [35]: df
Out[35]:
   T1  T2  T3  T4
0  20  -5   4   3
1  85  -5  34  21
2  -5  22  31  75
3  -5   5   7  -5
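If the frame also holds non-numeric columns, a variant that touches only the selected columns may be safer, since some pandas versions refuse boolean assignment on a mixed-type frame. A minimal sketch, assuming a made-up frame with a date column next to T1 to T4 (the sample values are illustrative, not the asker's data):

```python
import pandas as pd

# Hypothetical sample data; only the column names T1..T4 come from the answer.
df = pd.DataFrame({
    "date": ["1-1-2010 00:10", "1-1-2010 00:20", "1-1-2010 00:30", "1-1-2010 00:40"],
    "T1": [20, 85, -6, -4],
    "T2": [-7, -3, 22, 5],
    "T3": [4, 34, 31, 7],
    "T4": [3, 21, 75, -2],
})

cols = ["T1", "T2", "T3", "T4"]

# mask() replaces values where the condition is True. Assigning the result
# back to df[cols] leaves every other column (here: date) untouched.
df[cols] = df[cols].mask(df[cols] < 0, -5)
print(df)
```

Restricting both the condition and the assignment to cols keeps the string column out of the comparison entirely.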

In your example you are only rebinding the loop variable, which leaves the DataFrame unchanged. To change a single cell, write to it with the at accessor:

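# Loop over T1 and the three columns that follow it
start = df.columns.get_loc("T1")
for col in df.columns[start:start + 4]:
    # items() yields (row label, value), so this also works with a non-default index
    for idx, value in df[col].items():
        if value < 0:
            df.at[idx, col] = -5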
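The difference between rebinding a variable and writing to a cell can be seen on a tiny frame. A minimal sketch, assuming hypothetical two-row data:

```python
import pandas as pd

# Hypothetical data, only for illustration.
df = pd.DataFrame({"T1": [20, -3], "T2": [-1, 5]})

# Rebinding the loop variable changes only the local name, not the frame.
for value in df["T1"]:
    if value < 0:
        value = -5
unchanged = df["T1"].tolist()

# Writing through .at targets one cell by row label and column label.
for idx, value in df["T1"].items():
    if value < 0:
        df.at[idx, "T1"] = -5
changed = df["T1"].tolist()

print(unchanged, changed)
```

For anything beyond a handful of rows, a vectorized approach is usually preferable to a Python loop over cells.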

Replacing all negative values in all columns by zero in Python

Use pandas.DataFrame.clip with a lower bound of 0; every value below the bound is raised to it:

df.iloc[:, 1:] = df.iloc[:, 1:].clip(0)
print(df)

Output:

             date  T1  T2  T3  T4
0  1-1-2010 00:10  20   0   4   3
1  1-1-2010 00:20  85   0  34  21
2  1-1-2010 00:30   0  22  31  75
3  1-1-2010 00:40   0   5   7   0
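The positional slice iloc[:, 1:] relies on the date column being first. A minimal sketch of the same idea that picks the numeric columns by dtype instead, assuming made-up input values (only the column layout follows the output above):

```python
import pandas as pd

# Hypothetical input; the negative values are invented for illustration.
df = pd.DataFrame({
    "date": ["1-1-2010 00:10", "1-1-2010 00:20", "1-1-2010 00:30", "1-1-2010 00:40"],
    "T1": [20, 85, -6, -4],
    "T2": [-7, -3, 22, 5],
    "T3": [4, 34, 31, 7],
    "T4": [3, 21, 75, -2],
})

# Select numeric columns by dtype rather than by position.
num_cols = df.select_dtypes(include="number").columns

# clip(lower=0) raises every value below 0 up to 0 and leaves the rest alone.
df[num_cols] = df[num_cols].clip(lower=0)
print(df)
```

Spelling out lower=0 makes the direction of the clipping explicit, which the positional form clip(0) leaves implicit.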

clip is faster than mask not only on your sample but also on a larger dataset:

# Your sample -> 3x faster
%timeit df.iloc[:, 1:].clip(0)
# 1.74 ms ± 115 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit df.iloc[:,1:].mask(df.iloc[:,1:] < 0, 0)
# 5.25 ms ± 573 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

# Large Sample -> 1,000,000 elements --> about 30x
# pd.np was removed from pandas, so import NumPy directly
import numpy as np
large_df = pd.DataFrame(np.random.randint(-5, 5, (1000, 1000)))

%timeit large_df.clip(0)
# 17.2 ms ± 2.44 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit large_df.mask(large_df< 0, 0)
# 498 ms ± 47 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
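The %timeit lines above only run inside IPython or Jupyter. A rough sketch of the same comparison as a plain script, assuming the standard-library timeit module and a seeded random frame (the seed and the repeat count of 5 are arbitrary choices); timings will differ by machine and pandas version:

```python
import timeit

import numpy as np
import pandas as pd

# Seeded generator so the frame is reproducible; values lie in [-5, 5).
rng = np.random.default_rng(0)
large_df = pd.DataFrame(rng.integers(-5, 5, size=(1000, 1000)))

# Both approaches should produce the same values.
clipped = large_df.clip(lower=0)
masked = large_df.mask(large_df < 0, 0)
same = bool((clipped == masked).all().all())
print("identical results:", same)

# Time each approach over a few calls and report the average per call.
clip_time = timeit.timeit(lambda: large_df.clip(lower=0), number=5)
mask_time = timeit.timeit(lambda: large_df.mask(large_df < 0, 0), number=5)
print(f"clip: {clip_time / 5:.4f} s per call")
print(f"mask: {mask_time / 5:.4f} s per call")
```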

Changing Negative Values to 0, Without Changing other Columns

The statement transaction_df_clean.loc[transaction_df_clean['customer_price'] < 0] = 0 passes only a row filter to .loc. It selects every row where customer_price is less than 0, and the 0 is then broadcast across all columns of those rows, not just customer_price.

Besides the condition, you also have to select the column (Series) that you want to change.

A way to remember how .loc works: df.loc[row filter/selection, column filter/selection]

With the column added, the assignment becomes:

transaction_df_clean.loc[transaction_df_clean['customer_price'] < 0,'customer_price'] = 0
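The effect of the missing column selector shows up clearly when both forms are run side by side. A minimal sketch, assuming a hypothetical customer_id column and invented prices (only the name customer_price comes from the question):

```python
import pandas as pd

# Hypothetical transactions, for illustration only.
transaction_df_clean = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "customer_price": [9.5, -2.0, 4.0],
})

# Row filter only: every column of the matching rows is overwritten with 0.
broken = transaction_df_clean.copy()
broken.loc[broken["customer_price"] < 0] = 0

# Row filter plus column label: only customer_price is changed.
fixed = transaction_df_clean.copy()
fixed.loc[fixed["customer_price"] < 0, "customer_price"] = 0

print(broken)
print(fixed)
```

In the first result the customer_id of the second row is lost as well, which is the behaviour described in the question.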

The DataFrame.loc reference in the pandas docs has a good set of examples on setting values:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html

Python Pandas: How to replace negative numbers by prior non-negative numbers?

Similar to your last question:

df['Value'] = df['Value'].where(df['Value'].ge(0)).ffill()
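The one-liner works in two steps: where() keeps the non-negative values and turns everything else into NaN, then ffill() copies the last valid value forward into those gaps. A minimal sketch, assuming a hypothetical Value column:

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({"Value": [3, -1, -2, 7, -4]})

# Step 1: keep values >= 0, replace the rest with NaN.
non_negative = df["Value"].where(df["Value"].ge(0))

# Step 2: fill each NaN with the most recent non-negative value before it.
df["Value"] = non_negative.ffill()
print(df)
```

Two side effects are worth knowing: the column becomes float because of the intermediate NaN values, and a negative number at the very start stays NaN because there is no earlier value to copy.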

