How to Apply a Function to Two Columns of Pandas Dataframe

Pandas : How to apply a function with multiple column inputs and where condition

First, you should only use apply if necessary. Vectorized functions will be much faster, and the way you have it written now in the np.where statement makes use of these. If you really want to make your code more readable (at the (probably small) expense of time and memory) you could make an intermediate column and then use it in the np.where statement.

df["Share"] = ( df.B + df.C ) / ( df.B + df.C + df.D )
df["X"] = ( df.A + df.Share * df.E ).where( df.index >= 2020 )

To answer your question, however, you can create a custom function and then apply it to your DataFrame.

def my_func( year,a,b,c,d,e ):
    # This function can be longer and do more things
    return np.nan if year < 2020 else a + ( ( (b + c) / (b + c + d) ) * e )


df['X'] = df.apply( lambda x: my_func( x.name, x.A, x.B, x.C, x.D, x.E ), axis = 1 )

Note that to access the index of a row when using apply with axis = 1, you need to use the row's name attribute.

Also, since applying a function is relatively slow, it may be worth creating columns that take care of some of the intermediate steps (such as summing several columns, etc.) so that that doesn't need to be done in each iteration.
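As a rough check that the vectorized and row-wise versions agree, here is a minimal sketch. The sample DataFrame (years in the index, numeric columns A to E) is made up for illustration and is not from the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the index holds years, columns A-E hold numbers
df = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0], "B": [1.0, 1.0, 2.0], "C": [1.0, 3.0, 2.0],
     "D": [2.0, 4.0, 4.0], "E": [10.0, 10.0, 10.0]},
    index=[2019, 2020, 2021],
)

def my_func(year, a, b, c, d, e):
    return np.nan if year < 2020 else a + ((b + c) / (b + c + d)) * e

# Row-wise version: row.name is the index label of the current row
by_apply = df.apply(
    lambda row: my_func(row.name, row.A, row.B, row.C, row.D, row.E), axis=1
)

# Vectorized version: compute the intermediate share once for all rows
share = (df.B + df.C) / (df.B + df.C + df.D)
vectorized = (df.A + share * df.E).where(df.index >= 2020)

print(by_apply)
print(vectorized)
```

Both Series should hold NaN for 2019 and the same values for the later years; on a large DataFrame the vectorized form is usually the faster of the two.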

Check out this answer for more examples of applying a custom function.

Apply function on multiple columns and create new column based on condition

I first had to add the columns and fill them with zeros, then apply the function.

def conditions(x, column1, column2):
    # Compare two columns of the same row
    if x[column1] != x[column2]:
        return "incorrect"
    else:
        return "correct"


lst1 = ["col1", "col2", "col3", "col4", "col5"]
lst2 = ["col1_1", "col2_2", "col3_3", "col4_4", "col5_5"]

# Add the result columns first, filled with zeros
for item in lst2:
    df[str(item) + "_2"] = 0

# Fill each result column by comparing the matching pair of source columns
i = 0
for item in df.columns[-5:]:
    df[item] = df.apply(lambda x: conditions(x, column1=lst1[i], column2=lst2[i]), axis=1)
    i = i + 1
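If the comparison is a plain equality check, the same result columns can probably be built without apply. The following is a sketch using np.where; the two column pairs and their values are invented for the example, and the "_2" suffix follows the naming used above.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two pairs of columns to compare
df = pd.DataFrame({
    "col1": [1, 2, 3], "col1_1": [1, 0, 3],
    "col2": ["a", "b", "c"], "col2_2": ["a", "b", "x"],
})

lst1 = ["col1", "col2"]
lst2 = ["col1_1", "col2_2"]

for left, right in zip(lst1, lst2):
    # np.where picks "correct" where the pair matches and "incorrect" elsewhere
    df[right + "_2"] = np.where(df[left] == df[right], "correct", "incorrect")

print(df)
```

Iterating with zip also removes the need for a manual counter and for creating the zero-filled columns beforehand.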

How to apply lambda function on multiple columns using pandas

You're looking for .applymap (renamed to DataFrame.map in pandas 2.1, where .applymap is deprecated):

prices[list_of_columns].applymap(lambda x: float(x))

Also, if you're really trying to just convert the values into floats, just use .astype:

prices[list_of_columns] = prices[list_of_columns].astype(float)
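One caveat: .astype(float) raises if any value cannot be parsed. As a hedged alternative for messy data, pd.to_numeric with errors="coerce" turns unparseable values into NaN. The prices frame and its column names below are assumptions made up for this sketch.

```python
import pandas as pd

# Hypothetical data: numeric strings plus one value that cannot be parsed
prices = pd.DataFrame({
    "open": ["1.5", "2.0"],
    "close": ["1.7", "bad"],
    "ticker": ["X", "Y"],
})
list_of_columns = ["open", "close"]

# Apply pd.to_numeric column by column; "bad" becomes NaN instead of raising
prices[list_of_columns] = prices[list_of_columns].apply(pd.to_numeric, errors="coerce")

print(prices.dtypes)
```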

Apply function to two columns of a Pandas dataframe

Is it possible that you have np.nan/None/null values in your columns? If so, you might be getting an error similar to the one caused by this data:

data = {
    'Column1' : ['1', '2', np.nan, '3']
}
df = pd.DataFrame(data)
# Raises AttributeError: np.nan is a float, and floats have no .lower() method
df['Column1'] = df['Column1'].apply(lambda x : x.lower())
df
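If missing values are indeed the cause, two common workarounds are sketched below. The letter values are made up so that the effect of lower() is visible; the column name follows the example above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Column1": ["A", "b", np.nan, "C"]})

# Option 1: the .str accessor skips missing values and leaves them as NaN
lowered = df["Column1"].str.lower()

# Option 2: keep apply, but only call .lower() on actual strings
guarded = df["Column1"].apply(lambda x: x.lower() if isinstance(x, str) else x)

print(lowered)
print(guarded)
```

Option 1 is generally the more idiomatic choice; Option 2 is handy when the function is more complex than a single string method.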

Pandas apply function to each row by calculating multiple columns

IIUC, you can use:

out = (df
       .groupby('name')
       .apply(lambda g: g['amount'].mul(g['con']).sum()/g['amount'].sum())
      )

output:

name
a 5.842105
b 4.571429
c 10.000000
dtype: float64
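The same weighted average can likely be computed without groupby.apply, by summing the products and the weights per group and then dividing. This is a sketch on invented data (the original input is not shown), so the numbers will not match the output above.

```python
import pandas as pd

# Hypothetical data with the same column names as in the answer
df = pd.DataFrame({
    "name": ["a", "a", "b"],
    "amount": [1.0, 3.0, 2.0],
    "con": [4.0, 8.0, 5.0],
})

# Sum of amount * con per name, divided by the sum of amount per name
weighted = (df["amount"] * df["con"]).groupby(df["name"]).sum()
out = weighted / df.groupby("name")["amount"].sum()

print(out)
```

Here group a gives (1*4 + 3*8) / 4 and group b gives (2*5) / 2.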

pandas apply function to multiple columns with condition and create new columns

First, convert the string representations of lists to real lists with ast.literal_eval, then check their lengths directly instead of casting them to strings. If you need one-element lists instead of scalars, wrap the values in [], as in [fruit[0]] and [fruit[1]]. Finally, move the len(fruit) == 1 condition to the end, and change len(fruit) > 3 to len(fruit) > 2 so that the first row matches:

import ast

def fruits_vegetable(row):
    # Parse the string representations into real lists
    fruit = ast.literal_eval(row['fruit_code'])
    vege = ast.literal_eval(row['vegetable_code'])

    if len(fruit) == 1 and len(vege) > 1: # write "all" in new_col_1
        row['new_col_1'] = 'all'
    elif len(fruit) > 2 and len(vege) == 1: # vegetable_code in new_col_1
        row['new_col_1'] = vege
    elif len(fruit) > 2 and len(vege) > 1: # write "all" in new_col_1
        row['new_col_1'] = 'all'
    elif len(fruit) == 2 and len(vege) >= 0: # fruit 1 new_col_1 & fruit 2 new_col_2
        row['new_col_1'] = [fruit[0]]
        row['new_col_2'] = [fruit[1]]
    elif len(fruit) == 1: # fruit_code in new_col_1
        row['new_col_1'] = fruit
    return row


df = df.apply(fruits_vegetable, axis=1)


print (df)
   ID        date        fruit_code  new_col_1  new_col_2  supermarket  \
0   1  2022-01-01      [100,99,300]        all        NaN           xy
1   2  2022-01-01       [67,200,87]     [5000]        NaN         z, m
2   3  2021-01-01    [100,5,300,78]        all        NaN        wf, z
3   4  2020-01-01              [77]       [77]        NaN          NaN
4   5  2022-15-01  [100,200,546,33]        all        NaN        t, wf
5   6  2002-12-01            [64,2]       [64]        [2]            k
6   7  2018-12-01               [5]        all        NaN            p

   supermarkt    vegetable_code
0         NaN  [1000,2000,3000]
1         NaN            [5000]
2         NaN  [7000,2000,3000]
3          wf            [1000]
4         NaN  [4000,2000,3000]
5         NaN  [6000,8000,1000]
6         NaN  [6000,8000,1000]
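To see why the ast.literal_eval step matters, here is a small sketch on made-up values: before parsing, len() counts the characters of the string; after parsing, it counts the list items.

```python
import ast
import pandas as pd

# Hypothetical column holding lists stored as strings
df = pd.DataFrame({"fruit_code": ["[100,99,300]", "[77]"]})

# Before parsing: length of the text, brackets and commas included
char_lengths = df["fruit_code"].apply(len)

# After parsing: each cell is a real list, so len() counts its items
parsed = df["fruit_code"].apply(ast.literal_eval)
item_lengths = parsed.apply(len)

print(char_lengths.tolist())
print(item_lengths.tolist())
```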

