Apply Function Conditionally

Apply conditional function to a dataframe

You don't need apply() here. ifelse() is vectorized, so it handles the whole column at once:

df$output <- ifelse(df$var3 > df$var1, df$var2*df$var4, df$var2)

Apply function conditionally

There are many ways to do this. If you want a function other than sum, change the FUN argument in the calls that take one, for example FUN=mean, FUN=var or FUN=length; in the other calls, replace sum in the expression itself. Let's explore some alternatives:

The aggregate function in base R:

> aggregate(results ~ experiment, FUN=sum, data=DF)
  experiment results
1          A    86.3
2          B   986.0

Or tapply:

> with(DF, tapply(results, experiment, FUN=sum))
    A     B 
 86.3 986.0

ddply from the plyr package:

> # library(plyr)
> ddply(DF[, -2], .(experiment), numcolwise(sum))
  experiment results
1          A    86.3
2          B   986.0

> ## Alternative syntax
> ddply(DF, .(experiment), summarize, sumResults = sum(results))
  experiment sumResults
1          A       86.3
2          B      986.0

The dplyr package:

> library(dplyr)
> DF %>% group_by(experiment) %>% summarise(sumResults = sum(results))
Source: local data frame [2 x 2]

  experiment  sumResults
1          A        86.3
2          B       986.0

Using sapply and split, which is equivalent to tapply:

> with(DF, sapply(split(results, experiment), sum))
    A     B 
 86.3 986.0

If you are concerned about speed, data.table is your friend:

> # library(data.table)
> DT <- data.table(DF)
> DT[, sum(results), by=experiment]
   experiment    V1
1:          A  86.3
2:          B 986.0

Less well known, but the doBy package is nice (equivalent to aggregate, even in syntax!):

> # library(doBy)
> summaryBy(results~experiment, FUN=sum, data=DF)
  experiment results.sum
1          A        86.3
2          B       986.0

by also helps in this situation:

> (Aggregate.sums <- with(DF, by(results, experiment, sum)))
experiment: A
[1] 86.3
------------------------------------------------------------------------- 
experiment: B
[1] 986

If you want the result as a matrix, use either cbind or rbind:

> cbind(results=Aggregate.sums)
  results
A    86.3
B   986.0

sqldf from the sqldf package could also be a good option:

> library(sqldf)
> sqldf("select experiment, sum(results) `sum.results`
      from DF group by experiment")
  experiment sum.results
1          A        86.3
2          B       986.0

xtabs also works, but only for sums:

> xtabs(results ~ experiment, data=DF)
experiment
    A     B 
 86.3 986.0

Apply a conditional function in a nested dataframe

We filter the data and then use map to loop over the list column 'data':

library(dplyr)
library(purrr)
library(ggplot2)

df2 <- df %>%
   filter(manufacturer %in% manufacturers_vector) %>% 
   mutate(out = map(data,  ~ func(.x$drv, .x$cty)))

Output:

df2
# A tibble: 3 x 3
# Groups:   manufacturer [3]
#  manufacturer data               out       
#  <chr>        <list>             <list>    
#1 audi         <tibble [18 × 10]> <dbl [18]>
#2 chevrolet    <tibble [19 × 10]> <dbl [19]>
#3 jeep         <tibble [8 × 10]>  <dbl [8]>

The out column:

df2$out
#[[1]]
# [1]  0 21 20 21  0  0  0 18 16 20 19 15 17 17 15 15 17 16

#[[2]]
# [1] 14 11 14 13 12 16 15 16 15 15 14 11 11 14  0 22  0  0  0

#[[3]]
#[1] 17 15 15 14  9 14 13 11

To keep the original data without filtering, use map_if. Rows that do not match the condition get NA from .else:

df %>% 
  mutate(out = map_if(data, .f = ~ func(.x$drv, .x$cty),
     .p = manufacturer %in% manufacturers_vector, .else = ~ NA_real_))

Output:

# A tibble: 15 x 3
# Groups:   manufacturer [15]
#   manufacturer data               out       
#   <chr>        <list>             <list>    
# 1 audi         <tibble [18 × 10]> <dbl [18]>
# 2 chevrolet    <tibble [19 × 10]> <dbl [19]>
# 3 dodge        <tibble [37 × 10]> <dbl [1]> 
# 4 ford         <tibble [25 × 10]> <dbl [1]> 
# 5 honda        <tibble [9 × 10]>  <dbl [1]> 
# 6 hyundai      <tibble [14 × 10]> <dbl [1]> 
# 7 jeep         <tibble [8 × 10]>  <dbl [8]> 
# 8 land rover   <tibble [4 × 10]>  <dbl [1]> 
# 9 lincoln      <tibble [3 × 10]>  <dbl [1]> 
#10 mercury      <tibble [4 × 10]>  <dbl [1]> 
#11 nissan       <tibble [13 × 10]> <dbl [1]> 
#12 pontiac      <tibble [5 × 10]>  <dbl [1]> 
#13 subaru       <tibble [14 × 10]> <dbl [1]> 
#14 toyota       <tibble [34 × 10]> <dbl [1]> 
#15 volkswagen   <tibble [27 × 10]> <dbl [1]>

How to conditionally use `pandas.DataFrame.apply` based on values in a certain column?

Filter your dataframe first, then apply my_func. Let's use query. The assignment aligns on the index, so rows that were filtered out get NaN:

df1['new_column'] = df1.query('type == "A"').apply(my_func, axis=1)

Output:

   amount      back       file     front type  \
0       3  21973805  filename2  21889611    A   
1       4  36403870  filename2  36357723    A   
2       5    277500  filename3    196312    A   
3       1        19  filename4        11    B   
4       2       120  filename4        42    B   
5       1      3210  filename3      1992    C   

                                 new_column  
0            [21921030, 21908574, 21971743]  
1  [36391053, 36371413, 36394390, 36376405]  
2  [198648, 263355, 197017, 261666, 260815]  
3                                       NaN  
4                                       NaN  
5                                       NaN
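
For readers who want to run the pattern end to end, here is a minimal sketch. The data and the body of my_func are made up for illustration (only the column names type, front and back come from the output above); the real my_func in the question does something different.

```python
import pandas as pd

# Hypothetical data: column names mirror the output above, values are invented.
df1 = pd.DataFrame({
    "type": ["A", "A", "B", "C"],
    "front": [10, 20, 30, 40],
    "back": [15, 26, 31, 48],
})

def my_func(row):
    # Stand-in for the real my_func: width of the [front, back] interval.
    return row["back"] - row["front"]

# apply() only sees the rows returned by query(). The assignment aligns on
# the index, so every row that was filtered out is left as NaN.
df1["new_column"] = df1.query('type == "A"').apply(my_func, axis=1)

print(df1)
```

Because only the type "A" rows are passed to apply, my_func is never called on the other rows, which matters when the function is slow or would fail on them.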

Conditional apply() in r

I think you're making it too complicated. Just calculate the ratio for every row, then zero out the rows you don't want:

DT$xp_ratio_y <- DT$driv_y_experience/DT$driv_y_age
DT$xp_ratio_y[DT$driv_y_add_flg != 1] <- 0

Pandas apply but only for rows where a condition is met

The other answers are excellent, but I thought I'd add one other approach that can be faster in some circumstances: using vectorized operations and a boolean mask to achieve the same result:

import numpy as np

mask = (z['b'] != 0)
z_valid = z[mask]

z['c'] = 0.0  # float default, so the column dtype matches the ratios assigned below
z.loc[mask, 'c'] = z_valid['a'] / np.log(z_valid['b'])

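Especially with very large dataframes, this approach will generally be faster than solutions based on apply(), because the arithmetic runs in vectorized NumPy code instead of one Python-level function call per row.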
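
A self-contained sketch of the mask approach next to the row-wise apply() version, so the two can be checked against each other. The frame z and its values are hypothetical; only the column names a, b and c follow the snippet above.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; b is 0 in the rows that should be skipped.
z = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [0.0, np.e, 0.0, np.e ** 2],
})

mask = z["b"] != 0

# Vectorized: default every row to 0.0, then compute only on the masked rows.
z["c"] = 0.0
z.loc[mask, "c"] = z.loc[mask, "a"] / np.log(z.loc[mask, "b"])

# Row-wise apply() version of the same rule, for comparison.
c_apply = z.apply(
    lambda r: r["a"] / np.log(r["b"]) if r["b"] != 0 else 0.0,
    axis=1,
)

print(np.allclose(z["c"], c_apply))
```

Note that the mask only excludes b == 0. A row with b == 1 would still divide by log(1) = 0, so extend the mask if such values can occur.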

Pandas .apply with conditional if in different columns

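Use the code below. Note that it compares Liq_Factor against the literal string 'Nan'; if the column holds real missing values (NaN), test with pd.isna(x['Liq_Factor']) instead: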

df['Testing'] = df.apply(lambda x: 1 if x['Liq_Factor'] == 'Nan' else x['Use'] / x['Tw'], axis=1)

Updated based on the comments, to cap the ratio at 1:

df['Testing'] = df.apply(lambda x: 1 if x['Liq_Factor'] == 'Nan' else min(x['Use'] / x['Tw'], 1), axis=1)
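
The same rule can also be written without apply(). The following is a sketch of one vectorized equivalent, not the answer's own method. It assumes, as the code above does, that Liq_Factor stores the string 'Nan'; the sample data is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names follow the snippet above.
df = pd.DataFrame({
    "Liq_Factor": ["Nan", "0.5", "Nan", "0.8"],
    "Use": [10.0, 30.0, 5.0, 8.0],
    "Tw": [20.0, 10.0, 5.0, 16.0],
})

# Ratio capped at 1, computed for every row at once.
ratio = (df["Use"] / df["Tw"]).clip(upper=1)

# Rows flagged with the string 'Nan' get 1; all other rows get the capped ratio.
df["Testing"] = np.where(df["Liq_Factor"] == "Nan", 1, ratio)

print(df)
```

np.where picks per row between the constant and the precomputed ratio, so the division runs once over the whole column rather than once per row.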

Use an 'apply' function to perform code with conditional statements in R

Sure you can! I would first define a helper function that says what to do with a single column, then call that function within apply:

    HelperFun <- function(x) {
      # your code from above, replacing 'Seq1' by x
    }
    apply(First, 2, HelperFun)