Update Row Values Where Certain Condition Is Met in Pandas

Update row values where certain condition is met in pandas

You can use loc if you need to set two columns to the same value:

df1.loc[df1['stream'] == 2, ['feat','another_feat']] = 'aaaa'
print(df1)
   stream        feat another_feat
a       1  some_value   some_value
b       2        aaaa         aaaa
c       2        aaaa         aaaa
d       3  some_value   some_value
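The answer does not show how df1 was built. A minimal runnable sketch, assuming a small frame whose labels mirror the printed output (the construction itself is an assumption, not taken from the original question):

```python
import pandas as pd

# Hypothetical sample data mirroring the printed output above.
df1 = pd.DataFrame(
    {
        "stream": [1, 2, 2, 3],
        "feat": ["some_value"] * 4,
        "another_feat": ["some_value"] * 4,
    },
    index=list("abcd"),
)

# The boolean mask selects the rows; the list selects the columns to overwrite.
mask = df1["stream"] == 2
df1.loc[mask, ["feat", "another_feat"]] = "aaaa"
print(df1)
```

Rows where the mask is False keep their previous values.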

If you need to update only one column, one option is:

df1.loc[df1['stream'] == 2, 'feat'] = 10
print(df1)
   stream        feat another_feat
a       1  some_value   some_value
b       2          10   some_value
c       2          10   some_value
d       3  some_value   some_value

Another common option is numpy.where, which assigns a value to every row (here 10 where the condition is True and 20 everywhere else):

df1['feat'] = np.where(df1['stream'] == 2, 10,20)
print(df1)
   stream  feat another_feat
a       1    20   some_value
b       2    10   some_value
c       2    10   some_value
d       3    20   some_value
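Unlike loc, numpy.where writes a value for every row. If the non-matching rows should keep what they had, one option is to pass the existing column as the "else" branch. A small sketch with made-up numeric data (the values are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data; only the 'stream' values follow the example above.
df1 = pd.DataFrame(
    {"stream": [1, 2, 2, 3], "feat": [7, 8, 9, 6]},
    index=list("abcd"),
)

# Where the condition is True use 10, otherwise keep the current value.
df1["feat"] = np.where(df1["stream"] == 2, 10, df1["feat"])
print(df1)
```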

EDIT: If you need to divide all columns except stream where the condition is True, use:

print(df1)
   stream  feat  another_feat
a       1     4             5
b       2     4             5
c       2     2             9
d       3     1             7

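# select all columns except stream
cols = [col for col in df1.columns if col != 'stream']
print(cols)
['feat', 'another_feat']

# cast to float first; recent pandas versions warn when floats are written into int columns
df1[cols] = df1[cols].astype(float)
# the right-hand side is aligned on index and columns, so only the masked rows change
df1.loc[df1['stream'] == 2, cols] = df1[cols] / 2
print(df1)
   stream  feat  another_feat
a       1   4.0           5.0
b       2   2.0           2.5
c       2   1.0           4.5
d       3   1.0           7.0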
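The same update can be written so that both sides of the assignment cover exactly the same rows and columns, which makes the intent easier to read. A self-contained sketch, assuming the sample values printed above:

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "stream": [1, 2, 2, 3],
        "feat": [4, 4, 2, 1],
        "another_feat": [5, 5, 9, 7],
    },
    index=list("abcd"),
)

# Every column except 'stream'.
cols = [col for col in df1.columns if col != "stream"]

# Cast up front so halves such as 2.5 fit the column dtype.
df1 = df1.astype({col: "float64" for col in cols})

# Halve only the rows where stream == 2.
mask = df1["stream"] == 2
df1.loc[mask, cols] = df1.loc[mask, cols] / 2
print(df1)
```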

When working with multiple conditions, you can nest numpy.where calls
or use numpy.select:

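import numpy as np
import pandas as pd

df0 = pd.DataFrame({'Col':[5,0,-6]})

# nested np.where: the inner call handles the rows the outer condition rejects
df0['New Col1'] = np.where((df0['Col'] > 0), 'Increasing',
                           np.where((df0['Col'] < 0), 'Decreasing', 'No Change'))

# np.select: one condition per choice, plus a default for unmatched rows
df0['New Col2'] = np.select([df0['Col'] > 0, df0['Col'] < 0],
                            ['Increasing', 'Decreasing'],
                            default='No Change')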

print(df0)
   Col    New Col1    New Col2
0    5  Increasing  Increasing
1    0   No Change   No Change
2   -6  Decreasing  Decreasing
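One detail worth knowing with numpy.select: when several conditions are True for the same row, the first one in the list wins, so overlapping conditions should be ordered from most to least specific. A short sketch (the extra threshold and the 'Trend' column name are made up for illustration):

```python
import numpy as np
import pandas as pd

scores = pd.DataFrame({"Col": [15, 5, 0, -6]})

# 15 satisfies both "> 10" and "> 0"; the earlier condition takes priority.
conditions = [scores["Col"] > 10, scores["Col"] > 0, scores["Col"] < 0]
choices = ["Strongly increasing", "Increasing", "Decreasing"]

scores["Trend"] = np.select(conditions, choices, default="No Change")
print(scores)
```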

Updating a column if condition is met with pandas

We can group the dataframe on columns A, B and C together with the series of absolute values of column D, then transform column D using sum. If a group contains a pair of values with the same magnitude but opposite sign, their sum is zero, so comparing the group sum with 0 flags such pairs:

df['E'] = df.groupby(['A', 'B', 'C', df['D'].abs()])['D'].transform('sum').eq(0)

      A    B    C      D      E
0  1111  AAA  123   0.01   True
1  2222  BBB  456   5.00   True
2  3333  CCC  789  10.00  False
3  1111  AAA  123  -0.01   True
4  2222  BBB  456  -5.00   True
5  3333  CCC  789  -9.00  False
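For reference, a self-contained version of the approach above, rebuilding the frame from the printed output. Renaming the absolute-value key to abs_D is an addition here, made only to keep it distinct from column D:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": [1111, 2222, 3333, 1111, 2222, 3333],
        "B": ["AAA", "BBB", "CCC", "AAA", "BBB", "CCC"],
        "C": [123, 456, 789, 123, 456, 789],
        "D": [0.01, 5.0, 10.0, -0.01, -5.0, -9.0],
    }
)

# Extra grouping key: the magnitude of D (hypothetical name 'abs_D').
abs_d = df["D"].abs().rename("abs_D")

# Sum D within each (A, B, C, |D|) group and broadcast it back to every row.
group_sum = df.groupby(["A", "B", "C", abs_d])["D"].transform("sum")

# A zero sum means the group holds values that cancel each other out.
df["E"] = group_sum.eq(0)
print(df)
```

The exact comparison with 0 works here because x and -x cancel exactly; if D comes out of earlier floating-point arithmetic, a tolerance check such as numpy.isclose may be safer.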

How to update columns based on a condition

This should work for you:

df['Label'] = np.where(~df['ID'].isin(ids) & (pd.to_datetime(df['Transaction_Date']) > pd.Timestamp.today()), 'Group5', 'Group6')

Output:

>>> df
    ID Transaction_Date   Label
0  101              NaT  Group6
1  101       2021-12-29  Group6
2  102       2021-01-01  Group6
3  102       2021-11-01  Group6
4  103       2021-11-15  Group6
5  104       2021-12-15  Group6
6  105       2021-01-15  Group6

(Note that in your provided dataset, there are no dates greater than today's date, so there are no Group5's in the sample you provided, but I assume that's not true with your real dataset.)
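To see both labels appear, here is a small sketch that swaps today's date for a fixed cutoff so the result is reproducible. The ids list, the cutoff and the sample rows are all assumptions for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "ID": [101, 102, 103, 104],
        "Transaction_Date": ["2021-12-29", "2021-01-01", "2021-11-15", None],
    }
)

ids = [101, 102]                      # hypothetical list of IDs to exclude
cutoff = pd.Timestamp("2021-06-01")   # fixed date standing in for today()

dates = pd.to_datetime(df["Transaction_Date"])

# True only when the ID is NOT in ids AND the date is after the cutoff.
# Comparisons against NaT are False, so missing dates fall into Group6.
is_group5 = ~df["ID"].isin(ids) & (dates > cutoff)

df["Label"] = np.where(is_group5, "Group5", "Group6")
print(df)
```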

updating column values in pandas based on condition

There is a logic problem in the original code. Start with this sample data:

reviews = pd.DataFrame({'Score':range(6)})
print(reviews)
   Score
0      0
1      1
2      2
3      3
4      4
5      5

Setting all values greater than 3 to 1 works as expected:

reviews.loc[reviews['Score'] > 3, 'Score'] = 1
print(reviews)
   Score
0      0
1      1
2      2
3      3
4      1
5      1

Next, all values less than or equal to 2 are set to 0. This also overwrites the 1s that were just assigned to the rows where reviews['Score'] > 3:

reviews.loc[reviews['Score'] <= 2, 'Score'] = 0
print(reviews)
   Score
0      0
1      0
2      0
3      3
4      0
5      0

Finally, the rows equal to 3 are removed, so only 0 values remain:

reviews.drop(reviews[reviews['Score'] == 3].index, inplace = True)
print(reviews)
   Score
0      0
1      0
2      0
4      0
5      0

You can change the solution as follows:

reviews = pd.DataFrame({'Score':range(6)})
print(reviews)
   Score
0      0
1      1
2      2
3      3
4      4
5      5

First remove the 3s by keeping only the rows not equal to 3, using boolean indexing:

reviews = reviews[reviews['Score'] != 3].copy()

Then set the values to 0 and 1:

reviews['Score'] = (reviews['Score'] > 3).astype(int)
# alternative (use instead of the line above, not in addition to it):
# reviews['Score'] = np.where(reviews['Score'] > 3, 1, 0)
print(reviews)
   Score
0      0
1      0
2      0
4      1
5      1

EDIT1:

Your solution also works if you swap the two lines: set 0 first and then 1, so that the new values are not overwritten:

reviews.loc[reviews['Score'] <= 2, 'Score'] = 0
reviews.loc[reviews['Score'] > 3, 'Score'] = 1

reviews.drop(reviews[reviews['Score'] == 3].index, inplace = True)
print(reviews)
   Score
0      0
1      0
2      0
4      1
5      1
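Another way to sidestep the ordering problem is to compute every mask from the original values before changing anything; the order of the assignments then no longer matters. A sketch of that idea (the names low and high are made up here):

```python
import pandas as pd

reviews = pd.DataFrame({"Score": range(6)})

# Build both masks from the untouched data first.
low = reviews["Score"] <= 2
high = reviews["Score"] > 3

# The assignments can now run in any order without affecting each other.
reviews.loc[high, "Score"] = 1
reviews.loc[low, "Score"] = 0

# Keep only the rows matched by one of the masks (this drops Score == 3).
reviews = reviews[low | high]
print(reviews)
```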

Pandas DataFrame: replace all values in a column, based on condition

You need to select that column:

In [41]:
df.loc[df['First Season'] > 1990, 'First Season'] = 1
df

Out[41]:
                 Team  First Season  Total Games
0      Dallas Cowboys          1960          894
1       Chicago Bears          1920         1357
2   Green Bay Packers          1921         1339
3      Miami Dolphins          1966          792
4    Baltimore Ravens             1          326
5  San Franciso 49ers          1950         1003

So the syntax here is:

df.loc[<mask>, <optional column(s)>]

where the mask is a boolean Series that selects the row labels to index.

You can check the docs and also 10 minutes to pandas, which show the semantics.

EDIT

If you want to generate a boolean indicator, you can use the boolean condition to produce a boolean Series and cast its dtype to int. This converts True and False to 1 and 0 respectively:

In [43]:
df['First Season'] = (df['First Season'] > 1990).astype(int)
df

Out[43]:
                 Team  First Season  Total Games
0      Dallas Cowboys             0          894
1       Chicago Bears             0         1357
2   Green Bay Packers             0         1339
3      Miami Dolphins             0          792
4    Baltimore Ravens             1          326
5  San Franciso 49ers             0         1003
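The two variants above do different things: the loc assignment replaces only the matching values, while the cast turns the whole column into a 0/1 flag. A short side-by-side sketch on two sample rows (the pre-replacement year for the second team is an assumption), using Series.mask as a non-mutating equivalent of the loc assignment:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Team": ["Dallas Cowboys", "Baltimore Ravens"],
        "First Season": [1960, 1996],  # 1996 is an assumed original value
        "Total Games": [894, 326],
    }
)

condition = df["First Season"] > 1990

# Replace only the matching values; other rows keep their year.
replaced = df["First Season"].mask(condition, 1)

# Turn the condition itself into a 0/1 indicator for every row.
indicator = condition.astype(int)

print(replaced.tolist())
print(indicator.tolist())
```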

Pandas - comparing certain columns of two dataframes and updating rows of one if a condition is met

Try this:

idx = ['State', 'Organization', 'Date']
res = df1.set_index(idx).copy()
print(res)
>>>
                               Tag  Fine
State Organization Date
MD    ABC          01/10/2021  901     0
                   01/10/2021  801     0
NJ    DEF          02/10/2021  701     0
                   02/10/2021  601     0
                   02/10/2021  701     0

df2 = df2.set_index(idx)
print(df2)
>>>
                               Fine
State Organization Date
MD    ABC          01/10/2021  1000
                   01/15/2021  6000
NJ    DEF          02/10/2021   900
res.update(df2)
print(res)
>>>

                               Tag    Fine
State Organization Date
MD    ABC          01/10/2021  901  1000.0
                   01/10/2021  801  1000.0
NJ    DEF          02/10/2021  701   900.0
                   02/10/2021  601   900.0
                   02/10/2021  701   900.0

pd.__version__
>>>
'1.4.1'
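The answer above was run on pandas 1.4.1. As an alternative that does not rely on index alignment, the same result can be sketched with a left merge followed by fillna. This is a different technique from the update call above, not a restatement of it, and the sample rows (including the unmatched last row) are assumptions:

```python
import pandas as pd

keys = ["State", "Organization", "Date"]

df1 = pd.DataFrame(
    {
        "State": ["MD", "MD", "NJ", "NJ"],
        "Organization": ["ABC", "ABC", "DEF", "DEF"],
        "Date": ["01/10/2021", "01/10/2021", "02/10/2021", "03/10/2021"],
        "Tag": [901, 801, 701, 601],
        "Fine": [0, 0, 0, 0],
    }
)
df2 = pd.DataFrame(
    {
        "State": ["MD", "MD", "NJ"],
        "Organization": ["ABC", "ABC", "DEF"],
        "Date": ["01/10/2021", "01/15/2021", "02/10/2021"],
        "Fine": [1000, 6000, 900],
    }
)

# Bring the new fines alongside the old ones; unmatched rows get NaN.
merged = df1.merge(df2, on=keys, how="left", suffixes=("", "_new"))

# Prefer the new fine where one exists, otherwise keep the old value.
merged["Fine"] = merged["Fine_new"].fillna(merged["Fine"])
merged = merged.drop(columns="Fine_new")
print(merged)
```

Rows in df2 with no counterpart in df1 (here the 01/15/2021 entry) are ignored, which matches the behavior shown above.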

pandas : update value if condition met in loop

Older pandas versions provide DataFrame.lookup (deprecated in pandas 1.2 and removed in 2.0):

df['newvalue']=df.set_index('mode').lookup(df['mode'],df['mode'])
df
Out[184]:
   mode  car1  car2  bus1  bus2  newcol  newvalue
0  car1    10    20     5     2      10        10
1  car2    11    22     3     1      22        22
2  bus1     4     4     2     2       2         2
3  bus2     3     4     3     5       5         5
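On pandas versions without lookup, the same per-row selection can be sketched with NumPy integer indexing: translate each row's mode into a column position, then pick one cell per row. The frame is rebuilt from the printed output, and the helper names value_cols, row_pos and col_pos are made up for this example:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "mode": ["car1", "car2", "bus1", "bus2"],
        "car1": [10, 11, 4, 3],
        "car2": [20, 22, 4, 4],
        "bus1": [5, 3, 2, 3],
        "bus2": [2, 1, 2, 5],
    }
)

value_cols = ["car1", "car2", "bus1", "bus2"]

# Position of the column named by each row's 'mode' value.
col_pos = pd.Index(value_cols).get_indexer(df["mode"])
row_pos = np.arange(len(df))

# Pick exactly one cell per row: (row i, column named by mode[i]).
df["newvalue"] = df[value_cols].to_numpy()[row_pos, col_pos]
print(df)
```

get_indexer returns -1 for a mode that is not among value_cols, which would silently pick the last column, so it is worth checking for -1 on real data.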

Group by and update based on condition python pandas

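You don't need groupby. Find the IDs that have at least one Percentage whose square root is an integer, then keep every row with those IDs: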

funct1 = lambda pct: pct.pow(0.5) - pct.pow(0.5).astype(int) == 0
out = df[df['ID'].isin(df.loc[funct1(df['Percentage']), 'ID'])]
>>> out
   ID  Percentage
0   1           7
1   1           8
2   1           9
3   1          10
4   1          11
5   2          12
6   2          13
7   2          14
8   2          15
9   2          16
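A self-contained sketch of the same idea, with a third ID added (an assumption, not from the original data) so that something is actually filtered out:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "ID": [1] * 5 + [2] * 5 + [3] * 3,
        "Percentage": [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
    }
)

# A value is a perfect square when its square root has no fractional part.
root = np.sqrt(df["Percentage"])
is_square = root == np.floor(root)

# IDs owning at least one perfect-square row (9 for ID 1, 16 for ID 2).
ids_with_square = df.loc[is_square, "ID"].unique()

# Keep all rows of those IDs, not just the perfect-square rows.
out = df[df["ID"].isin(ids_with_square)]
print(out)
```

The floating-point square root is fine for small integers like these; for very large values an integer-based check would be more reliable.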

Performance

# @BENY
%timeit df.groupby('ID').filter(lambda x : any((x['Percentage']**0.5).map(float.is_integer)))
1.08 ms ± 14.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

# @Corralien
%timeit df[df['ID'].isin(df.loc[funct1(df['Percentage']), 'ID'])]
651 µs ± 2.36 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

