Pandas: How to Assign Values Based on Multiple Conditions for Existing Columns

Pandas: How do I assign values based on multiple conditions for existing columns?

You can do this using np.where, the conditions use bitwise & and | for and and or with parentheses around the multiple conditions due to operator precedence. So where the condition is true 5 is returned and 0 otherwise:

In [29]:
df['points'] = np.where( ( (df['gender'] == 'male') & (df['pet1'] == df['pet2'] ) ) | ( (df['gender'] == 'female') & (df['pet1'].isin(['cat','dog'] ) ) ), 5, 0)
df

Out[29]:
gender pet1 pet2 points
0 male dog dog 5
1 male cat cat 5
2 male dog cat 0
3 female cat squirrel 5
4 female dog dog 5
5 female squirrel cat 0
6 squirrel dog cat 0

Assign value of existing column to new columns in pandas based on multiple conditions

From your DataFrame :

>>> import pandas as pd
>>> from io import StringIO

>>> df = pd.read_csv(StringIO("""
... column1,column2,column3,y1,y2,y3
... 100,200,300,2020,2021,2022
... 100,200,300,2021,2022,2023
... 100,200,300,2019,2020,2021"""))
>>> df
column1 column2 column3 y1 y2 y3
0 100 200 300 2020 2021 2022
1 100 200 300 2021 2022 2023
2 100 200 300 2019 2020 2021

And the function assignvalues, which now return the value from the expected column for each if. We set the currentyear at 2021 for example :

>>> def assignvalues(df):
... if df['y1'] == currentyear:
... return df['column1']
... elif df['y2'] == currentyear:
... return df['column2']
... elif df['y3'] == currentyear:
... return df['column3']

>>> currentyear = 2021

We can assign to df["Vals"] an apply(), as you did, with an axis=1 parameter to get the expected result :

>>> df["Vals"] = df.apply(assignvalues, axis=1)
>>> df
column1 column2 column3 y1 y2 y3 Vals
0 100 200 300 2020 2021 2022 200
1 100 200 300 2021 2022 2023 100
2 100 200 300 2019 2020 2021 300

change column value based on multiple conditions

You are really close, assign value Matt to filtered A by boolean masks:

df.loc[(df['A']=='Harry') & (df['B']=='George') & (df['C']>'2019'),'A'] = 'Matt'

Assign numeric values for multiple columns based on multiple conditions in pandas DataFrame

You could apply pd.cut to the relevant columns:

cols = ['Procedures1', 'Procedures2']
df[cols] = df[cols].apply(lambda col: pd.cut(col, [0,200,500,1000, col.max()], labels=[1,2,3,4]))

Output:

  Therapy_area Procedures1 Procedures2
0 Oncology 2 2
1 Oncology 2 2
2 Oncology 1 1
3 Oncology 3 3
4 Oncology 4 4
5 Oncology 4 4
6 Nononcology 2 2
7 Nononcology 2 2
8 Nononcology 2 2
9 Nononcology 1 1

You could also use np.select:

def encoding(col, labels):
return np.select([col<200, col.between(200,500), col.between(500,1000), col>1000], labels, 0)

onc_labels = [1,2,3,4]
nonc_labels = [11,22,33,44]
msk = df['Therapy_area'] == 'Oncology'

df[cols] = pd.concat((df.loc[msk, cols].apply(encoding, args=(onc_labels,)), df.loc[msk, cols].apply(encoding, args=(nonc_labels,)))).reset_index(drop=True)

Output:

  Therapy_area  Procedures1  Procedures2  Procedures3
0 Oncology 2 2 4
1 Oncology 2 2 2
2 Oncology 1 1 4
3 Oncology 3 3 2
4 Oncology 4 4 1
5 Oncology 4 4 2
6 Nononcology 22 22 44
7 Nononcology 22 22 22
8 Nononcology 11 11 44
9 Nononcology 33 33 22

Pandas - Assign value to subset of dataframe, based on multiple conditions

Use isin and map:

df.loc[df['Market'].isin(['Mk 1', 'Mk1']), 'Sub Market'] = df['Symbol'].isin(dct).map({True:'A', False:'B'})

Output:

>>> df
Market Sub Market Symbol
0 Mk1 A ABC
1 Mk 1 A ABC
2 Mk 1 B 123
3 Mk 2 B 123
4 Mk 3 A XYZ

Pandas - Trying to assign values to dataframe based on multiple conditions

We need two conditions

df.loc[df['field1'].isnull() & df['field3'].isnull(), 'fieldTemp'] = 0

How to set values of a column based on multiple conditions in other columns in python?

You're missing parenthesis when defining the conditions. The reason behind this is that bitwise operators have higher precedence than comparissons. Instead use:

m1 = (df.col1 >= 1) & (df.col2 >= 1) & (df.col3 >= 1) & 
(df.col4 >= 1) & (df.col5 >= 1)
m2 = (df.col2 >= 1) & (df.col3 >= 1) & (df.col4 >= 1) & (df.col5 >= 1)
m3 = (df.col3 >= 1) & (df.col4 >= 1) & (df.col5 >= 1)

df['category'] = np.select([m1, m2, m3], ['certain', 'possible', 'probable'],
default='Other')

Which results in the expected output:

    col1  col2  col3  col4  col5  category
0 1 1 1 4 1 certain
1 0 1 1 1 1 possible
2 0 0 1 1 1 probable

Use multiple conditions on a column to assign values of new column

There's no need for itterrows here, which is bad practice and considered slow.

Method 1 pd.cut

df['B'] = pd.cut(df['A'], [0,1,4,10], labels=['low', 'mid', 'high'])

A B
0 1 low
1 1 low
2 2 mid
3 3 mid
4 5 high
5 4 mid
6 2 mid
7 5 high

Method 2 np.select

conditions = [
df['A'] == 1,
df['A'].isin([2, 3, 4])
]

choices = ['low', 'mid']

df['B'] = np.select(conditions, choices, default='high')

A B
0 1 low
1 1 low
2 2 mid
3 3 mid
4 5 high
5 4 mid
6 2 mid
7 5 high

Assign a dataframe column a value, based on multiple conditions

We can use cut

transform(House, newcol = cut(price, breaks = c(-Inf, 300000, 500000, Inf),
labels = c("red", "blue", "green")))
# price newcol
#1 287655 red
#2 456355 blue
#3 662500 green
#4 597864 green
#5 876545 green

Note that if/else is not vectorized and it expects the input to have length of 1. If we are doing in this a loop with each element having length 1, it works, but it is also inefficient as there is ifelse vectorized version of if/else

House <- transform(House, newcol = ifelse(price < 300000, "red",
ifelse(price > 300000 & price < 500000, "blue", "green")))
House
# price newcol
#1 287655 red
#2 456355 blue
#3 662500 green
#4 597864 green
#5 876545 green

If we look at the results, both of them got the same output, but the difference is in the number of ifelse statements which can increase when there are more number of comparisons. It would be better to use cut or findInterval instead of nested ifelse


if goes with else rather than then

House$newcol <- NA
for(i in seq_len(nrow(House))) {
House$newcol[i] <- if(House$price[i] < 300000) {
'red'
} else if( House$price[i] > 300000 & House$price[i] < 500000) {
'blue'
} else 'green'
}


Related Topics



Leave a reply



Submit