Dplyr Mutate Rowsums Calculations or Custom Functions

dplyr mutate rowSums calculations or custom functions

You can use the rowwise() function:

iris %>% 
  rowwise() %>% 
  mutate(sumVar = sum(c_across(Sepal.Length:Petal.Width)))

#> # A tibble: 150 x 6
#> # Rowwise: 
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species sumVar
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>    <dbl>
#>  1          5.1         3.5          1.4         0.2 setosa    10.2
#>  2          4.9         3            1.4         0.2 setosa     9.5
#>  3          4.7         3.2          1.3         0.2 setosa     9.4
#>  4          4.6         3.1          1.5         0.2 setosa     9.4
#>  5          5           3.6          1.4         0.2 setosa    10.2
#>  6          5.4         3.9          1.7         0.4 setosa    11.4
#>  7          4.6         3.4          1.4         0.3 setosa     9.7
#>  8          5           3.4          1.5         0.2 setosa    10.1
#>  9          4.4         2.9          1.4         0.2 setosa     8.9
#> 10          4.9         3.1          1.5         0.1 setosa     9.6
#> # ... with 140 more rows

"c_across() uses tidy selection syntax so you can succinctly select many variables."

Finally, if you want, you can add %>% ungroup() at the end to leave row-wise mode.
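Since the title mentions rowSums, here is a minimal sketch of the vectorised alternative, assuming dplyr >= 1.0. It relies on across() returning a data frame of the selected columns, which rowSums() can then add up row by row. The names iris_sums and iris_rowwise are only for illustration.

```r
library(dplyr)

# Vectorised alternative: across() returns the selected columns as a
# data frame, and rowSums() adds them up row by row.
iris_sums <- iris %>%
  mutate(sumVar = rowSums(across(Sepal.Length:Petal.Width)))

# The rowwise() version, followed by ungroup() to drop the row-wise grouping.
iris_rowwise <- iris %>%
  rowwise() %>%
  mutate(sumVar = sum(c_across(Sepal.Length:Petal.Width))) %>%
  ungroup()

# Both approaches should agree up to floating-point tolerance.
all.equal(iris_sums$sumVar, iris_rowwise$sumVar)
```

On larger data the rowSums() version is usually faster, because it avoids evaluating the expression once per row.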

How do I use custom functions after the dplyr %>% operator?

I've adjusted the function to take the data frame as an argument, prefixed the calls inside the function with their package names, and passed the inputs as character strings rather than numbers. The conversion to character could also be done inside the function definition if required.

library(dplyr)
library(stringr)

State <- function(df, x, y){
  dplyr::mutate(
    df,
    Account = stringr::str_remove_all(Account, "-"),
    Account = case_when(
      startsWith(Account, y) ~ stringr::str_c(stringr::str_c("WC", x), "Policy #"),
      startsWith(Account, x) ~ stringr::str_c("WC", "Policy #")
    )
  )
}

df %>% State("32", "90")
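To see what the function returns, here is a small run on made-up data. The question's data frame is not shown, so the Account values below are hypothetical; the function is repeated only so the example runs on its own.

```r
library(dplyr)
library(stringr)

# Same function as in the answer, repeated so this example is self-contained.
State <- function(df, x, y) {
  mutate(
    df,
    Account = str_remove_all(Account, "-"),
    Account = case_when(
      startsWith(Account, y) ~ str_c(str_c("WC", x), "Policy #"),
      startsWith(Account, x) ~ str_c("WC", "Policy #")
    )
  )
}

# Hypothetical input: one account per case_when() branch, plus one that
# matches neither branch.
df <- tibble(Account = c("90-1234", "32-5678", "11-0000"))

res <- df %>% State("32", "90")
res
```

Rows that match neither prefix come back as NA, because case_when() has no fallback branch here.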

Custom function with dplyr mutate or summarise for different levels within a factor?

Your example code is most of the way there. You can do:

df1 %>% 
    mutate(Diff = newvar[gear == "3"] - newvar[gear == "5"])

Or:

df1 %>% 
    summarise(Diff = newvar[gear == "3"] - newvar[gear == "5"])

Logical subsetting still works in mutate() and summarise() calls like with any other vector.

Note that this works because df1 is still grouped by cyl after the summarise() call in your example code. Otherwise you would need a group_by() call to create the correct grouping.
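A sketch of the whole flow on mtcars, assuming df1 was built roughly like this (the original construction of df1 and newvar is not shown, so this is a guess). In mtcars gear is numeric, hence gear == 3 rather than gear == "3".

```r
library(dplyr)

# Assumed reconstruction of df1: one mean per cyl/gear pair.
# .groups = "drop_last" keeps the result grouped by cyl.
df1 <- mtcars %>%
  group_by(cyl, gear) %>%
  summarise(newvar = mean(mpg), .groups = "drop_last")

# Within each cyl group, pick the gear-3 and gear-5 values and subtract them.
diffs <- df1 %>%
  summarise(Diff = newvar[gear == 3] - newvar[gear == 5])

diffs
```

This only gives one value per group because every cyl group in mtcars has exactly one row for gear 3 and one for gear 5.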

Using mutate in custom function with mutation condition as argument

If your formula is always of the form original = do_something(original), this may help (for dplyr version >= 1.0).

library(dplyr)
library(stringr)
library(gapminder)  # provides the gapminder data set used below

update_mut <- function(df, mutation){
  xx <- word(mutation, 1)
  df %>% 
    mutate("{xx}" := eval(parse(text = mutation)))
}
update_mut(gapminder, "year = 2*year")

   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <dbl>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       3904    28.8  8425333      779.
 2 Afghanistan Asia       3914    30.3  9240934      821.
 3 Afghanistan Asia       3924    32.0 10267083      853.
 4 Afghanistan Asia       3934    34.0 11537966      836.
 5 Afghanistan Asia       3944    36.1 13079460      740.
 6 Afghanistan Asia       3954    38.4 14880372      786.
 7 Afghanistan Asia       3964    39.9 12881816      978.
 8 Afghanistan Asia       3974    40.8 13867957      852.
 9 Afghanistan Asia       3984    41.7 16317921      649.
10 Afghanistan Asia       3994    41.8 22227415      635.
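If you are free to change the function's interface, a sketch that avoids eval(parse()) is to pass the target column and the expression as two unquoted arguments and embrace them with {{ }}. The function name update_col and the small years tibble are made up for illustration.

```r
library(dplyr)

# Hypothetical alternative: take the column and the expression separately.
# {{ col }} on the left of := names the column; {{ expr }} is evaluated
# against the columns of df.
update_col <- function(df, col, expr) {
  df %>% mutate({{ col }} := {{ expr }})
}

# Small stand-in for gapminder so the example has no extra dependencies.
years <- tibble(year = c(1952, 1957, 1962))

doubled <- update_col(years, year, 2 * year)
doubled
```

The trade-off is that the condition is no longer a single string, so this only helps when the caller can supply code rather than text.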

Conditional mutate in a custom function to change a character column in R

Try the function below:

library(dplyr)
library(lubridate)  # ymd()
library(bizdays)    # bizdays(); assumed to come from the bizdays package

my_function <- function(date1, date2, variable, quota, monthly_business_days) {
  value <- deparse(substitute(variable))
  
  my_data %>%
    filter(between(DATE, ymd(date1), ymd(date2))) %>% 
    summarize(total = sum({{variable}})) %>%
    add_row(total = quota, .before = 1) %>% 
    rbind(.$total[[2]]/bizdays(date1, date2)*monthly_business_days) %>%
    mutate(indicator = if(value == 'UNITS') c("Quota (Units)", "Sales (Units)", "Forecast (Units)")
                       else c("Quota (USD)", "Sales (USD)", "Forecast (USD)"))
           
}
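The two pieces doing the work here are deparse(substitute(variable)), which turns the unquoted argument into a string for the if() test, and {{variable}}, which passes the same argument on to summarize(). A stripped-down sketch of just that mechanism, with a made-up function name and made-up data:

```r
library(dplyr)

# Hypothetical reduced version: no dates, quota or forecast rows.
label_total <- function(data, variable) {
  # Name of the column the caller passed, as a string.
  value <- deparse(substitute(variable))

  data %>%
    summarise(total = sum({{ variable }})) %>%
    mutate(indicator = if (value == "UNITS") "Sales (Units)" else "Sales (USD)")
}

# Made-up sales data.
sales <- tibble(UNITS = c(3, 5), USD = c(30, 55))

units_row <- label_total(sales, UNITS)
usd_row <- label_total(sales, USD)
```

Because the if() test runs on a single string, a plain if/else is enough and no vectorised if_else() is needed.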

R mutate() with rowSums()

The difference in results might be due to the fact that part_langs is a grouped data frame, as can be seen from the output of str() shown in your post:

grouped_df [7 x 15] (S3: grouped_df/tbl_df/tbl/data.frame).

If this is the reason, then ungroup first and rerun your code:

library(dplyr)
part_langs <- part_langs %>% ungroup()
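A minimal sketch of checking for grouping and then summing across columns after ungrouping. The tibble part is a made-up stand-in for part_langs, whose columns are not shown in full.

```r
library(dplyr)

# Hypothetical stand-in for part_langs: a small grouped tibble.
part <- tibble(lang = c("a", "b", "a"), x = 1:3, y = 4:6) %>%
  group_by(lang)

# TRUE while the grouping is still attached.
grouped_before <- is_grouped_df(part)

# Drop the grouping, then compute the row totals over x and y.
part_totals <- part %>%
  ungroup() %>%
  mutate(total = rowSums(across(x:y)))

part_totals
```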

Writing a custom function that works inside dplyr::mutate()

We can place the ... at the end:

rowwise_sum <- function(data, na.rm = FALSE,...) {
  columns <- rlang::enquos(...)
  data %>%
     select(!!!columns) %>%
     rowSums(na.rm = na.rm)
}

cars %>% 
     mutate(sum = rowwise_sum(., na.rm = TRUE, speed, dist))
# A tibble: 50 x 3
#   speed  dist   sum
#   <dbl> <dbl> <dbl>
# 1     4     2     6
# 2     4    10    14
# 3     7     4    11
# 4     7    22    29
# 5     8    16    24
# 6     9    10    19
# 7    10    18    28
# 8    10    26    36
# 9    10    34    44
#10    11    17    28
# ... with 40 more rows

It would also work without changing the position of ... (though placing it last is generally recommended). The main issue here is that the data (which is .) was not passed as an argument to the function inside mutate.

It would be easier to put the whole flow inside the function instead of only part of it:

rowwise_sum2 <- function(data, na.rm = FALSE, ...) {
  columns <- rlang::enquos(...)
  data %>%
      select(!!! columns) %>%
      # pass the na.rm argument through instead of hard-coding TRUE
      transmute(sum = rowSums(., na.rm = na.rm)) %>%
      bind_cols(data, .)
}

rowwise_sum2(cars, na.rm = TRUE, speed, dist)
# A tibble: 50 x 3
#   speed  dist   sum
#   <dbl> <dbl> <dbl>
# 1     4     2     6
# 2     4    10    14
# 3     7     4    11
# 4     7    22    29
# 5     8    16    24
# 6     9    10    19
# 7    10    18    28
# 8    10    26    36
# 9    10    34    44
#10    11    17    28
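Another option, if the goal is a helper that is called inside mutate() without passing . at all, is to have it take the column vectors themselves. This is only a sketch with a made-up name (row_total); it binds the vectors into a matrix and lets rowSums() do the work.

```r
library(dplyr)

# Hypothetical helper: takes column vectors, not a data frame.
# cbind() turns them into a matrix with one column per input.
row_total <- function(..., na.rm = FALSE) {
  rowSums(cbind(...), na.rm = na.rm)
}

cars_total <- cars %>%
  mutate(sum = row_total(speed, dist, na.rm = TRUE))

head(cars_total, 3)
```

Inside mutate() the bare names speed and dist already resolve to the columns, so no tidy evaluation is needed in the helper.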

How to use custom functions in mutate (dplyr)?

Your problem seems to be binom.test rather than dplyr: binom.test is not vectorized, so you cannot expect it to work on whole columns. You can use mapply on the two columns inside mutate:

table %>% 
    mutate(Ratio = mapply(function(x, y) binom.test.p(c(x,y)), 
                          ref_SG1_E2_1_R1_Sum, 
                          alt_SG1_E2_1_R1_Sum))

#  geneId ref_SG1_E2_1_R1_Sum alt_SG1_E2_1_R1_Sum Ratio
#1      a                  10                  10     1
#2      b                  20                  20     1
#3      c                  10                  10     1
#4      d                  15                  15     1

As for the last one, you need mutate_at instead of mutate (note that in dplyr 1.0 and later funs() is deprecated and across() supersedes mutate_at()):

table %>%
      mutate_at(.vars=c(2:3), .funs=funs(sum=sum(.)))
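For current dplyr versions, both steps can be sketched as follows. The data is rebuilt from the printed output above, and binom_p is a hypothetical stand-in for the asker's binom.test.p(), which is not shown; here it is assumed to return the two-sided p-value.

```r
library(dplyr)

# Hypothetical data rebuilt from the printed output in the answer.
counts <- tibble(
  geneId = c("a", "b", "c", "d"),
  ref_SG1_E2_1_R1_Sum = c(10, 20, 10, 15),
  alt_SG1_E2_1_R1_Sum = c(10, 20, 10, 15)
)

# Assumed stand-in for binom.test.p(): p-value of an exact binomial test.
binom_p <- function(x) binom.test(x)$p.value

result <- counts %>%
  # binom.test() is not vectorised, so call it once per row with mapply().
  mutate(Ratio = mapply(function(x, y) binom_p(c(x, y)),
                        ref_SG1_E2_1_R1_Sum, alt_SG1_E2_1_R1_Sum)) %>%
  # across() replacement for mutate_at(2:3, funs(sum = sum(.))).
  mutate(across(2:3, ~ sum(.x), .names = "{.col}_sum"))

result
```

The .names argument reproduces the _sum suffix that funs(sum = ...) used to add.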

mutate with across, apply two functions in a row

You can supply custom functions as well as built-ins to across:

diamonds %>% 
   group_by(cut) %>% 
   summarise(across(x:z, function(x) round(mean(x))), .groups = 'drop')
# A tibble: 5 x 4
  cut           x     y     z
* <ord>     <dbl> <dbl> <dbl>
1 Fair          6     6     4
2 Good          6     6     4
3 Very Good     6     6     4
4 Premium       6     6     4
5 Ideal         6     6     3
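The same idea can be written with a formula-style lambda, which reads naturally when two functions are applied one after the other. This sketch uses mtcars instead of diamonds so it needs no extra package; the choice of columns is arbitrary.

```r
library(dplyr)

# mean() is applied first, then round(), to each selected column per group.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(across(c(mpg, hp), ~ round(mean(.x))), .groups = "drop")

by_cyl
```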

R: Custom Function - Mutate Existing Column

I think you can achieve this more simply with the following:

library(dplyr)

clean_func <- function(df){
    df %>% mutate(across(everything(), ~gsub(" & ", " and ", .) %>% 
                    gsub("[[:punct:]]$", "", .))) 
    }

df1 <- clean_func(df1)
df2 <- clean_func(df2)

You can make updates to the function by adding additional gsub, str_replace, or other calls as needed.

Edit:

Based on your update, you can do something like this to target your variables specifically:

add_symbol <- function(col.name){
  gsub(" & ", " and ", col.name)
}

rm_trail_punc <- function(col.name){
  gsub("[[:punct:]]$", "", col.name)
}

standardise_col <- function(df, col.name){
  
    col.name <- enquo(col.name)
    
  df %>% 
    mutate(!!col.name := add_symbol(!!col.name),
           !!col.name := rm_trail_punc(!!col.name))
}

Your code won't ever work as written, but you could do something like this:

new_df <- standardise_col(df1, a) %>% 
  left_join(., standardise_col(df2, c), by = c("a" = "c"))

Which gives us:

# A tibble: 3 x 3
  a              b     d    
  <chr>          <chr> <chr>
1 apple and pear cat   car  
2 kiwi           dog   bike 
3 plum           cow   truck

You can read up on tidy evaluation here: https://tidyeval.tidyverse.org/dplyr.html
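With rlang >= 0.4 the enquo() and !! pair can also be written with the embrace operator {{ }}. A minimal sketch of that variant follows; the function name standardise_col2 and both input tibbles are made up, chosen so that the join gives the same three rows as above.

```r
library(dplyr)

# Hypothetical variant of standardise_col() using {{ }} instead of enquo()/!!.
standardise_col2 <- function(df, col) {
  df %>%
    mutate({{ col }} := gsub(" & ", " and ", {{ col }}),
           {{ col }} := gsub("[[:punct:]]$", "", {{ col }}))
}

# Made-up inputs with an ampersand and some trailing punctuation to clean up.
df1 <- tibble(a = c("apple & pear", "kiwi.", "plum!"),
              b = c("cat", "dog", "cow"))
df2 <- tibble(c = c("apple and pear.", "kiwi", "plum"),
              d = c("car", "bike", "truck"))

new_df <- standardise_col2(df1, a) %>%
  left_join(standardise_col2(df2, c), by = c("a" = "c"))

new_df
```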
