Dplyr Mutate Rowsums Calculations or Custom Functions

You can use the rowwise() function:

iris %>% 
rowwise() %>%
mutate(sumVar = sum(c_across(Sepal.Length:Petal.Width)))

#> # A tibble: 150 x 6
#> # Rowwise:
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species sumVar
#> <dbl> <dbl> <dbl> <dbl> <fct> <dbl>
#> 1 5.1 3.5 1.4 0.2 setosa 10.2
#> 2 4.9 3 1.4 0.2 setosa 9.5
#> 3 4.7 3.2 1.3 0.2 setosa 9.4
#> 4 4.6 3.1 1.5 0.2 setosa 9.4
#> 5 5 3.6 1.4 0.2 setosa 10.2
#> 6 5.4 3.9 1.7 0.4 setosa 11.4
#> 7 4.6 3.4 1.4 0.3 setosa 9.7
#> 8 5 3.4 1.5 0.2 setosa 10.1
#> 9 4.4 2.9 1.4 0.2 setosa 8.9
#> 10 4.9 3.1 1.5 0.1 setosa 9.6
#> # ... with 140 more rows

"c_across() uses tidy selection syntax so you can succinctly select many variables"

Finally, if you want, you can add %>% ungroup() at the end to leave rowwise mode.
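
For a plain sum, a vectorized call can replace rowwise() entirely. The following is a minimal sketch, assuming dplyr >= 1.0 for across(); the object names by_row and vectorised are only illustrative. Both approaches should produce the same totals.

```r
library(dplyr)

# Row-wise approach from the answer, with ungroup() to leave rowwise mode
by_row <- iris %>%
  rowwise() %>%
  mutate(sumVar = sum(c_across(Sepal.Length:Petal.Width))) %>%
  ungroup()

# Vectorized alternative: across() with no function returns the selected
# columns as a data frame, which rowSums() totals in one call.
# (In dplyr >= 1.1.0, pick() is intended for this kind of selection.)
vectorised <- iris %>%
  mutate(sumVar = rowSums(across(Sepal.Length:Petal.Width)))

head(vectorised$sumVar)
```

On larger data the vectorized form is usually faster, because rowwise() evaluates the expression once per row.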

How do I use custom functions after the dplyr %>% operator?

I've adjusted the function to take the data frame as an argument, namespaced the package calls inside the function definition, and passed the inputs as character strings rather than numbers. The conversion to character could also be done inside the function if required.

library(dplyr)
library(stringr)

State <- function(df, x, y){
dplyr::mutate(
df,
Account = stringr::str_remove_all(Account, "-"),
Account = case_when(
startsWith(Account, y) ~ stringr::str_c(stringr::str_c("WC", x), "Policy #"),
startsWith(Account, x) ~ stringr::str_c("WC", "Policy #")
)
)
}

df %>% State("32", "90")

Custom function with dplyr mutate or summarise for different levels within a factor?

Your example code is most of the way there. You can do:

df1 %>% 
mutate(Diff = newvar[gear == "3"] - newvar[gear == "5"])

Or:

df1 %>% 
summarise(Diff = newvar[gear == "3"] - newvar[gear == "5"])

Logical subsetting still works in mutate() and summarise() calls like with any other vector.

Note that this works because df1 is still grouped by cyl after the summarise() call in your example code. Otherwise, you would need a group_by() call to create the correct grouping first.
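
The grouping behaviour can be checked on a built-in dataset. This is a sketch only: the df1 below is a hypothetical stand-in built from mtcars, and newvar is taken to be a per-group mean of mpg, which is an assumption rather than the original poster's definition.

```r
library(dplyr)

# Hypothetical stand-in for df1: one value per cyl/gear combination.
# "drop_last" removes only the gear level, so the result stays grouped by cyl.
df1 <- mtcars %>%
  group_by(cyl, gear) %>%
  summarise(newvar = mean(mpg), .groups = "drop_last")

# Within each cyl group, subtract the gear-5 value from the gear-3 value
diffs <- df1 %>%
  summarise(Diff = newvar[gear == 3] - newvar[gear == 5])

diffs
```

This relies on every cyl group containing exactly one row for gear 3 and one for gear 5; a group missing either level would yield a zero-length result for that group.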

Using mutate in custom function with mutation condition as argument

If your formula always has the form original = do_something_original(), this may help (requires dplyr version >= 1.0).

library(dplyr)
library(stringr)
library(gapminder)  # provides the gapminder data used below

update_mut <- function(df, mutation){
xx <- word(mutation, 1)
df %>%
mutate("{xx}" := eval(parse(text = mutation)))
}
update_mut(gapminder, "year = 2*year")

country continent year lifeExp pop gdpPercap
<fct> <fct> <dbl> <dbl> <int> <dbl>
1 Afghanistan Asia 3904 28.8 8425333 779.
2 Afghanistan Asia 3914 30.3 9240934 821.
3 Afghanistan Asia 3924 32.0 10267083 853.
4 Afghanistan Asia 3934 34.0 11537966 836.
5 Afghanistan Asia 3944 36.1 13079460 740.
6 Afghanistan Asia 3954 38.4 14880372 786.
7 Afghanistan Asia 3964 39.9 12881816 978.
8 Afghanistan Asia 3974 40.8 13867957 852.
9 Afghanistan Asia 3984 41.7 16317921 649.
10 Afghanistan Asia 3994 41.8 22227415 635.
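
If the mutation does not have to arrive as a string, tidy evaluation avoids eval(parse()) altogether. Below is a minimal sketch, assuming dplyr >= 1.0; update_mut2 is a hypothetical name, and mtcars is used in place of gapminder so the example runs without extra packages.

```r
library(dplyr)

# Hypothetical variant: the target column and the expression are passed
# unquoted and forwarded to mutate() with the embrace operator {{ }}
update_mut2 <- function(df, col, expr) {
  df %>%
    mutate({{ col }} := {{ expr }})
}

doubled <- update_mut2(mtcars, mpg, 2 * mpg)
head(doubled$mpg)
```

The trade-off is that the caller supplies two arguments instead of one string, but the expression is checked by R's parser at the call site rather than at run time.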

Conditional mutate in a custom function to change a character column in R

Try the function below:

library(dplyr)
library(lubridate)  # ymd()
library(bizdays)    # bizdays()

my_function <- function(date1, date2, variable, quota, monthly_business_days) {
value <- deparse(substitute(variable))

my_data %>%
filter(between(DATE, ymd(date1), ymd(date2))) %>%
summarize(total = sum({{variable}})) %>%
add_row(total = quota, .before = 1) %>%
rbind(.$total[[2]]/bizdays(date1, date2)*monthly_business_days) %>%
mutate(indicator = if(value == 'UNITS') c("Quota (Units)", "Sales (Units)", "Forecast (Units)")
else c("Quota (USD)", "Sales (USD)", "Forecast (USD)"))

}

R mutate() with rowSums()

The difference in results might be because part_langs is a grouped data frame, as can be seen in the output of str() shown in your post:

grouped_df [7 x 15] (S3: grouped_df/tbl_df/tbl/data.frame). 

If this is the reason, then ungroup first and rerun your code:

library(dplyr)
part_langs <- part_langs %>% ungroup()
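
Whether a data frame is grouped can be checked before summing. The following is a small sketch with made-up data: scores, x, and y are hypothetical names standing in for part_langs and its columns.

```r
library(dplyr)

# Hypothetical data standing in for part_langs
scores <- tibble(
  id = c("a", "a", "b"),
  x  = c(1, 2, 3),
  y  = c(10, 20, 30)
) %>%
  group_by(id)

# TRUE here: mutate() would evaluate its expressions once per group
is_grouped_df(scores)

# After ungroup(), `.` and the data that mutate() works on have the same rows
totals <- scores %>%
  ungroup() %>%
  mutate(total = rowSums(select(., x, y)))

totals
```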

Writing a custom function that works inside dplyr::mutate()

We can place the ... at the end of the argument list:

rowwise_sum <- function(data, na.rm = FALSE,...) {
columns <- rlang::enquos(...)
data %>%
select(!!!columns) %>%
rowSums(na.rm = na.rm)
}

cars %>%
mutate(sum = rowwise_sum(., na.rm = TRUE, speed, dist))
# A tibble: 50 x 3
# speed dist sum
# <dbl> <dbl> <dbl>
# 1 4 2 6
# 2 4 10 14
# 3 7 4 11
# 4 7 22 29
# 5 8 16 24
# 6 9 10 19
# 7 10 18 28
# 8 10 26 36
# 9 10 34 44
#10 11 17 28
# ... with 40 more rows

It would also work without moving the ... to the end (although placing it last is generally recommended). The main issue is that the data (which is .) was not passed to the function in the argument list inside mutate.


It would be easier to put the whole flow inside the function instead of only part of it:

rowwise_sum2 <- function(data, na.rm = FALSE, ...) {
  columns <- rlang::enquos(...)
  data %>%
    select(!!!columns) %>%
    # pass the na.rm argument through instead of hard-coding TRUE
    transmute(sum = rowSums(., na.rm = na.rm)) %>%
    bind_cols(data, .)
}

rowwise_sum2(cars, na.rm = TRUE, speed, dist)
# A tibble: 50 x 3
# speed dist sum
# <dbl> <dbl> <dbl>
# 1 4 2 6
# 2 4 10 14
# 3 7 4 11
# 4 7 22 29
# 5 8 16 24
# 6 9 10 19
# 7 10 18 28
# 8 10 26 36
# 9 10 34 44
#10 11 17 28
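
With dplyr >= 1.0, the same result can be sketched without enquos() or bind_cols() by embracing a column selection inside across(). The name add_row_sum is hypothetical, and this is an alternative to the functions above, not a reproduction of them.

```r
library(dplyr)

# Hypothetical helper: `cols` is any tidyselect expression, e.g. c(speed, dist)
add_row_sum <- function(data, cols, na.rm = FALSE) {
  data %>%
    mutate(sum = rowSums(across({{ cols }}), na.rm = na.rm))
}

cars_sum <- add_row_sum(cars, c(speed, dist), na.rm = TRUE)
head(cars_sum, 3)
```

Because the selection is a single argument, helpers such as starts_with() or where(is.numeric) can be passed in the same position.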

How to use custom functions in mutate (dplyr)?

Your problem seems to be with binom.test rather than dplyr. binom.test is not vectorized, so you cannot expect it to work on whole columns. You can use mapply over the two columns inside mutate:

table %>% 
mutate(Ratio = mapply(function(x, y) binom.test.p(c(x,y)),
ref_SG1_E2_1_R1_Sum,
alt_SG1_E2_1_R1_Sum))

# geneId ref_SG1_E2_1_R1_Sum alt_SG1_E2_1_R1_Sum Ratio
#1 a 10 10 1
#2 b 20 20 1
#3 c 10 10 1
#4 d 15 15 1
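
For reference, here is a self-contained sketch of the same pattern. The data and the short column names ref and alt are made up to mirror the table above, and binom.test(...)$p.value stands in for the poster's binom.test.p helper, whose definition is not shown.

```r
library(dplyr)

# Hypothetical counts mirroring the table in the answer
counts <- tibble(
  geneId = c("a", "b", "c", "d"),
  ref    = c(10, 20, 10, 15),
  alt    = c(10, 20, 10, 15)
)

# binom.test() handles one pair of counts at a time,
# so mapply() calls it once per row and collects the p-values
result <- counts %>%
  mutate(Ratio = mapply(function(x, y) binom.test(c(x, y))$p.value, ref, alt))

result
```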

As for the last one, you need mutate_at() instead of mutate():

table %>%
mutate_at(.vars=c(2:3), .funs=funs(sum=sum(.)))
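
In current dplyr, funs() is deprecated and mutate_at() is superseded by across(). A rough equivalent is sketched below, assuming dplyr >= 1.0; the data and the column names ref and alt are hypothetical.

```r
library(dplyr)

# Hypothetical counts standing in for the poster's table
counts <- tibble(
  geneId = c("a", "b", "c", "d"),
  ref    = c(10, 20, 10, 15),
  alt    = c(10, 20, 10, 15)
)

# across() applies sum() to each selected column;
# .names appends "_sum" so the original columns are kept
with_sums <- counts %>%
  mutate(across(c(ref, alt), sum, .names = "{.col}_sum"))

with_sums
```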

mutate with across, apply two functions in a row

You can supply custom functions as well as built-ins to across:

diamonds %>% 
group_by(cut) %>%
summarise(across(x:z, function(x) round(mean(x))), .groups = 'drop')
# A tibble: 5 x 4
cut x y z
* <ord> <dbl> <dbl> <dbl>
1 Fair 6 6 4
2 Good 6 6 4
3 Very Good 6 6 4
4 Premium 6 6 4
5 Ideal 6 6 3

R: Custom Function - Mutate Existing Column

I think you can achieve this more simply with the following:

library(dplyr)

clean_func <- function(df){
df %>% mutate(across(everything(), ~gsub(" & ", " and ", .) %>%
gsub("[[:punct:]]$", "", .)))
}

df1 <- clean_func(df1)
df2 <- clean_func(df2)

You can make updates to the function by adding additional gsub, str_replace, or other calls as needed.

Edit:

Based on your update, you can do something like this to target specific variables:

add_symbol <- function(col.name){
gsub(" & ", " and ", col.name)
}

rm_trail_punc <- function(col.name){
gsub("[[:punct:]]$", "", col.name)
}

standardise_col <- function(df, col.name){

col.name <- enquo(col.name)

df %>%
mutate(!!col.name := add_symbol(!!col.name),
!!col.name := rm_trail_punc(!!col.name))
}

Your code won't ever work as written, but you could do something like this:

new_df <- standardise_col(df1, a) %>% 
left_join(., standardise_col(df2, c), by = c("a"="c"))

Which gives us:

# A tibble: 3 x 3
a b d
<chr> <chr> <chr>
1 apple and pear cat car
2 kiwi dog bike
3 plum cow truck

You can read up on tidy evaluation here: https://tidyeval.tidyverse.org/dplyr.html
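
With dplyr >= 1.0, the enquo()/!! pattern can also be written with the embrace operator {{ }} inside across(). This is a sketch under that assumption; standardise_col2 and the sample data are hypothetical.

```r
library(dplyr)

# Hypothetical helper: cleans whichever column(s) the caller names
standardise_col2 <- function(df, col) {
  df %>%
    mutate(across(
      {{ col }},
      # replace " & " with " and ", then strip one trailing punctuation mark
      ~ gsub("[[:punct:]]$", "", gsub(" & ", " and ", .x))
    ))
}

# Hypothetical input
fruit <- tibble(
  a = c("apple & pear.", "kiwi", "plum!"),
  b = c("cat", "dog", "cow")
)

cleaned <- standardise_col2(fruit, a)
cleaned$a
```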


