Pandas : How to apply a function with multiple column inputs and where condition
First, you should only use apply if necessary. Vectorized functions will be much faster, and the way you have it written now in the np.where statement makes use of these. If you really want to make your code more readable (at the (probably small) expense of time and memory) you could make an intermediate column and then use it in the np.where statement.
df["Share"] = ( df.B + df.C ) / ( df.B + df.C + df.D )
df["X"] = ( df.A + df.Share * df.E ).where( df.index >= 2020 )
To answer your question, however, you can create a custom function and then apply it to your DataFrame.
def my_func(year, a, b, c, d, e):
    # This function can be longer and do more things
    return np.nan if year < 2020 else a + (((b + c) / (b + c + d)) * e)
df['X'] = df.apply( lambda x: my_func( x.name, x.A, x.B, x.C, x.D, x.E ), axis = 1 )
Note that to access the index of a row when using apply with axis = 1, you need to use the row's name attribute.
Also, since applying a function is relatively slow, it may be worth creating columns that take care of some of the intermediate steps (such as summing several columns) so that this work isn't repeated for every row.
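As a rough check that the two approaches agree, here is a minimal sketch on made-up data (the values and the year index are assumptions, not taken from the question). It precomputes the intermediate share once, then compares the vectorized result with the row-wise apply result.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the index holds the year, as in the question.
df = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0], "B": [1.0, 1.0, 2.0], "C": [1.0, 3.0, 2.0],
     "D": [2.0, 4.0, 4.0], "E": [10.0, 20.0, 30.0]},
    index=[2019, 2020, 2021],
)

# Vectorized: compute the intermediate share once, then mask by year.
share = (df.B + df.C) / (df.B + df.C + df.D)
vectorized = (df.A + share * df.E).where(df.index >= 2020)

# Row-wise: same formula through apply, reading the year from row.name.
def my_func(year, a, b, c, d, e):
    return np.nan if year < 2020 else a + ((b + c) / (b + c + d)) * e

row_wise = df.apply(lambda x: my_func(x.name, x.A, x.B, x.C, x.D, x.E), axis=1)

# Both approaches should agree, including the NaN for 2019.
pd.testing.assert_series_equal(vectorized, row_wise)
```

On a frame this small the speed difference is negligible; it tends to matter once the frame has many rows.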
Check out this answer for more examples of applying a custom function.
Apply function on multiple columns and create new column based on condition
I first had to add the columns and fill them with zeros, then apply the function.
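def conditions(x, column1, column2):
    # Compare two columns within a single row
    if x[column1] != x[column2]:
        return "incorrect"
    else:
        return "correct"

lst1 = ["col1", "col2", "col3", "col4", "col5"]
lst2 = ["col1_1", "col2_2", "col3_3", "col4_4", "col5_5"]

# Create the result columns first, filled with zeros
for item in lst2:
    df[str(item) + "_2"] = 0

# Fill each result column with the row-wise comparison of one column pair
i = 0
for item in df.columns[-5:]:
    df[item] = df.apply(lambda x: conditions(x, column1=lst1[i], column2=lst2[i]), axis=1)
    i = i + 1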
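If the comparison is as simple as the one above, a vectorized variant with np.where may avoid both the zero-filled placeholder columns and the per-row lambda. This is a sketch on hypothetical data with only two of the column pairs, not the original poster's frame.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two of the column pairs from the lists above.
df = pd.DataFrame({
    "col1": [1, 2, 3], "col1_1": [1, 0, 3],
    "col2": ["a", "b", "c"], "col2_2": ["a", "b", "x"],
})
pairs = [("col1", "col1_1"), ("col2", "col2_2")]

for left, right in pairs:
    # np.where evaluates the whole column at once; the result column
    # is created directly, so no placeholder column is needed.
    df[right + "_2"] = np.where(df[left] != df[right], "incorrect", "correct")
```

As with the apply version, two missing values compare as unequal, so such rows would be labelled "incorrect".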
How to apply lambda function on multiple columns using pandas
You're looking for .applymap:
prices[list_of_columns].applymap(lambda x: float(x))
Also, if you're really just trying to convert the values into floats, use .astype:
prices[list_of_columns] = prices[list_of_columns].astype(float)
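A small self-contained sketch of the conversion, assuming a hypothetical prices frame (the column names and values are made up). The second part shows pd.to_numeric with errors="coerce", which may be useful when some strings cannot be parsed and astype would raise.

```python
import pandas as pd

# Hypothetical frame with numeric strings; column names are illustrative.
prices = pd.DataFrame({
    "open": ["1.5", "2.0"],
    "close": ["1.7", "2.1"],
    "ticker": ["AAA", "BBB"],
})
list_of_columns = ["open", "close"]

# Convert only the selected columns; other columns are left untouched.
prices[list_of_columns] = prices[list_of_columns].astype(float)

# If some values are not parseable, to_numeric can turn them into NaN
# instead of raising, applied column by column.
raw = pd.DataFrame({"open": ["1.5", "n/a"]})
cleaned = raw.apply(pd.to_numeric, errors="coerce")
```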
Apply function to two columns of a Pandas dataframe
Is it possible that you have np.nan/None/null values in your columns? If so, you might be getting an error similar to the one caused by this data:
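import numpy as np
import pandas as pd

data = {
    'Column1': ['1', '2', np.nan, '3']
}
df = pd.DataFrame(data)
# Raises AttributeError: np.nan is a float and has no .lower() method
df['Column1'] = df['Column1'].apply(lambda x: x.lower())
df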
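One possible way around this, assuming missing values are indeed the cause, is the .str accessor, which passes missing values through instead of calling the method on them. The sketch below uses letters rather than digits so the effect of lower() is visible.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Column1": ["A", "B", np.nan, "C"]})

# Calling x.lower() on the float NaN raises AttributeError.
try:
    df["Column1"].apply(lambda x: x.lower())
    raised = False
except AttributeError:
    raised = True

# The .str accessor propagates missing values instead of failing.
lowered = df["Column1"].str.lower()
```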
Pandas apply function to each row by calculating multiple columns
IIUC, you can use:
out = (df
       .groupby('name')
       .apply(lambda g: g['amount'].mul(g['con']).sum() / g['amount'].sum())
      )
output:
name
a 5.842105
b 4.571429
c 10.000000
dtype: float64
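The expression above is an amount-weighted average of con per name. A sketch of the same calculation without apply, on made-up data (the values below are assumptions and are unrelated to the output shown above):

```python
import pandas as pd

# Hypothetical data; the column names follow the answer above.
df = pd.DataFrame({
    "name": ["a", "a", "b"],
    "amount": [1, 3, 2],
    "con": [10, 2, 5],
})

# Weighted mean of 'con' per name, using 'amount' as the weights:
# sum(amount * con) / sum(amount), both computed per group.
weighted = df["amount"].mul(df["con"]).groupby(df["name"]).sum()
out = weighted / df.groupby("name")["amount"].sum()
```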
pandas apply function to multiple columns with condition and create new columns
First, convert the string representations of the lists to actual lists with ast.literal_eval; then, to check their lengths, remove the casts to strings. If you need one-element lists instead of scalars, wrap fruit[0] and fruit[1] in []. Finally, reorder the conditions for len(fruit) == 1, and change len(fruit) > 3 to len(fruit) > 2 so that the first row matches:
import ast

def fruits_vegetable(row):
    # Parse the string representations into real lists
    fruit = ast.literal_eval(row['fruit_code'])
    vege = ast.literal_eval(row['vegetable_code'])
    if len(fruit) == 1 and len(vege) > 1:  # write "all" in new_col_1
        row['new_col_1'] = 'all'
    elif len(fruit) > 2 and len(vege) == 1:  # vegetable_code in new_col_1
        row['new_col_1'] = vege
    elif len(fruit) > 2 and len(vege) > 1:  # write "all" in new_col_1
        row['new_col_1'] = 'all'
    elif len(fruit) == 2 and len(vege) >= 0:  # fruit 1 new_col_1 & fruit 2 new_col_2
        row['new_col_1'] = [fruit[0]]
        row['new_col_2'] = [fruit[1]]
    elif len(fruit) == 1:  # fruit_code in new_col_1
        row['new_col_1'] = fruit
    return row

df = df.apply(fruits_vegetable, axis=1)
print(df)
ID date fruit_code new_col_1 new_col_2 supermarket \
0 1 2022-01-01 [100,99,300] all NaN xy
1 2 2022-01-01 [67,200,87] [5000] NaN z, m
2 3 2021-01-01 [100,5,300,78] all NaN wf, z
3 4 2020-01-01 [77] [77] NaN NaN
4 5 2022-15-01 [100,200,546,33] all NaN t, wf
5 6 2002-12-01 [64,2] [64] [2] k
6 7 2018-12-01 [5] all NaN p
supermarkt vegetable_code
0 NaN [1000,2000,3000]
1 NaN [5000]
2 NaN [7000,2000,3000]
3 wf [1000]
4 NaN [4000,2000,3000]
5 NaN [6000,8000,1000]
6 NaN [6000,8000,1000]
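To illustrate why the parsing step matters, here is a minimal sketch, assuming the codes are stored as text such as "[100,99,300]": len() on the raw string counts characters, while len() on the parsed value counts list elements.

```python
import ast

# The column holds text such as "[100,99,300]", not a list.
raw = "[100,99,300]"
parsed = ast.literal_eval(raw)

# len() on the raw string counts characters; on the parsed value, elements.
lengths = (len(raw), len(parsed))
```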