How to Use Custom Functions in Mutate (Dplyr)

mutate/transform in R dplyr (Pass custom function)

With transform (and mutate) your function has to operate on the whole column vector, whereas if() only tests a single value. You can use ifelse instead, which is vectorized:

 isOdd <- function(x){ ifelse(x %% 2 == 0, "even", "odd") }

Alternatively you can apply the function to every value in the column with one of the apply functions:

isOdd <- function(x) {
  sapply(x, function(x) {
    if (x %% 2 == 0) {
      return("even")
    } else {
      return("odd")
    }
  })
}
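As a minimal sketch of how both approaches behave inside mutate, the example below labels a small, made-up column. The names parity_label, parity_scalar and numbers are hypothetical and only serve the illustration. It also shows base R's Vectorize(), which wraps a scalar function with mapply so it can be applied to a column.

```r
library(dplyr)

# Vectorized helper: ifelse() tests every element of x
parity_label <- function(x) {
  ifelse(x %% 2 == 0, "even", "odd")
}

# Scalar helper: if() only handles a single value at a time
parity_scalar <- function(x) {
  if (x %% 2 == 0) "even" else "odd"
}

# Vectorize() wraps the scalar helper so it accepts a whole vector
parity_vectorized <- Vectorize(parity_scalar)

# Hypothetical example data
numbers <- data.frame(n = 1:4)

labelled <- numbers %>%
  mutate(
    parity_a = parity_label(n),
    parity_b = parity_vectorized(n)
  )
```

Both new columns should hold the same labels; the ifelse version is usually the simpler choice when the logic is a plain condition.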

How to use custom functions in mutate (dplyr)?

Your problem seems to be binom.test rather than dplyr: binom.test is not vectorized, so you cannot expect it to work on whole columns. You can use mapply over the two columns inside mutate:

table %>%
  mutate(Ratio = mapply(function(x, y) binom.test.p(c(x, y)),
                        ref_SG1_E2_1_R1_Sum,
                        alt_SG1_E2_1_R1_Sum))

# geneId ref_SG1_E2_1_R1_Sum alt_SG1_E2_1_R1_Sum Ratio
#1 a 10 10 1
#2 b 20 20 1
#3 c 10 10 1
#4 d 15 15 1
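The snippet above relies on binom.test.p, which is not defined in this answer. As a self-contained sketch, the version below assumes it simply returns the two-sided p-value from binom.test; the helper binom_p, the data frame counts and its column names are hypothetical stand-ins.

```r
library(dplyr)

# Hypothetical stand-in for binom.test.p(): two-sided p-value for a pair
# of counts (first count = successes, second = failures)
binom_p <- function(x, y) binom.test(c(x, y))$p.value

# Made-up example data
counts <- data.frame(
  geneId = c("a", "b"),
  ref = c(10, 20),
  alt = c(10, 5)
)

# mapply() calls binom_p once per row, pairing ref[i] with alt[i]
with_p <- counts %>%
  mutate(p_value = mapply(binom_p, ref, alt))
```

The balanced row (10 vs 10) should give a p-value of 1, matching the output shown above, while the unbalanced row gives a small p-value.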

As for the last one, you need mutate_at instead of mutate. Since funs() has been deprecated as of dplyr 0.8.0, a named list of formulas is used here:

table %>%
  mutate_at(.vars = 2:3, .funs = list(sum = ~ sum(.)))
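With dplyr 1.0 or later, the same idea can probably be written with across(). This is a sketch on made-up data (the scores data frame is hypothetical), not the original poster's table.

```r
library(dplyr)

# Made-up example data
scores <- data.frame(
  id = c("a", "b", "c"),
  x = c(1, 2, 3),
  y = c(10, 20, 30)
)

# Apply sum() to columns 2 and 3 and store the results in new columns
# named <column>_sum; the single total is recycled to every row
totals <- scores %>%
  mutate(across(2:3, sum, .names = "{.col}_sum"))
```

The .names argument controls how the new columns are named, so the original columns are kept alongside the sums.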

Calling user defined functions from dplyr::mutate

The function does not know which object you want to modify unless you pass it in. Give the function a period argument and call it like this:

period_to_date <- function(period) {
  lubridate::ymd(stringr::str_c(period, "01"))
  # Can also use
  # as.Date(paste0(period, "01"), "%Y%m%d")
}

tibble_1 %>%
  dplyr::mutate(date = period_to_date(period))

# period var_1 var_2 date
# <dbl> <dbl> <dbl> <date>
#1 201901 -0.476 -0.456 2019-01-01
#2 201912 -0.645 1.45 2019-12-01
#3 201902 -0.0939 -0.982 2019-02-01
#4 201903 0.410 0.954 2019-03-01

How do I use custom functions after the dplyr %>% operator?

I've adjusted the function to take the data frame as its first argument, namespace-qualified the package functions inside the function definition, and passed the inputs as character strings rather than numbers. The conversion to character could also be done inside the function definition if required.

library(dplyr)
library(stringr)

State <- function(df, x, y) {
  dplyr::mutate(
    df,
    Account = stringr::str_remove_all(Account, "-"),
    Account = case_when(
      startsWith(Account, y) ~ stringr::str_c(stringr::str_c("WC", x), "Policy #"),
      startsWith(Account, x) ~ stringr::str_c("WC", "Policy #")
    )
  )
}

df %>% State("32", "90")
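The general pattern is that any function whose first argument is a data frame can follow the pipe. A minimal sketch, assuming dplyr 1.0 or later for the {{ }} (embrace) syntax; add_ratio and its arguments are hypothetical names used only for illustration.

```r
library(dplyr)

# The data frame comes first so the pipe can supply it;
# {{ }} forwards the unquoted column names into mutate()
add_ratio <- function(df, num, den) {
  dplyr::mutate(df, ratio = {{ num }} / {{ den }})
}

result <- mtcars %>%
  add_ratio(mpg, cyl)
```

Because the function returns a data frame, further dplyr verbs can be chained after it.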

dplyr mutate apply a custom function

You can group_by() on 'group' and then apply your function to each subgroup with do(). This replicates the 'RTfiltered' you provided in the example. Is this what you are looking for?

df1 %>%
  group_by(group) %>%
  do(mutate(., effects = correctOrderEffects(.)))
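In recent dplyr versions do() is marked as superseded, and group_modify() is one possible replacement. The sketch below assumes the custom function receives the rows of one group as a data frame and returns one value per row; center_rt is a hypothetical stand-in for correctOrderEffects, and the data is made up.

```r
library(dplyr)

# Hypothetical stand-in for correctOrderEffects(): takes the rows of one
# group and returns one value per row
center_rt <- function(d) d$rt - mean(d$rt)

# Made-up example data
df1 <- data.frame(
  group = c("a", "a", "b", "b"),
  rt = c(1, 3, 10, 20)
)

# .x is the data frame for the current group (without the grouping column)
result <- df1 %>%
  group_by(group) %>%
  group_modify(~ mutate(.x, effects = center_rt(.x))) %>%
  ungroup()
```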

How to use mutate across with a custom function with multiple arguments

The (quasi-)functions in across(..., ***) iterate over column vectors, so they never see the whole frame. I suggest you modify your function to deal with vectors, not frames.

my_func2 <- function(x, prop) {
  replace(x, sample(length(x), size = ceiling(prop * length(x)), replace = FALSE), NA)
}
set.seed(42)
out <- mtcars %>%
  mutate(across(1:3, ~ my_func2(., 0.3)))
out
# mpg cyl disp hp drat wt qsec vs am gear carb
# Mazda RX4 NA 6 160.0 110 3.90 2.620 16.46 0 1 4 4
# Mazda RX4 Wag 21.0 6 NA 110 3.90 2.875 17.02 0 1 4 4
# Datsun 710 22.8 NA NA 93 3.85 2.320 18.61 1 1 4 1
# Hornet 4 Drive NA NA NA 110 3.08 3.215 19.44 1 0 3 1
# Hornet Sportabout NA NA NA 175 3.15 3.440 17.02 0 0 3 2
# Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
# Duster 360 NA 8 360.0 245 3.21 3.570 15.84 0 0 3 4
# Merc 240D 24.4 4 NA 62 3.69 3.190 20.00 1 0 4 2
# Merc 230 22.8 NA 140.8 95 3.92 3.150 22.90 1 0 4 2
# Merc 280 NA 6 167.6 123 3.92 3.440 18.30 1 0 4 4
# Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
# Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
# Merc 450SL 17.3 8 NA 180 3.07 3.730 17.60 0 0 3 3
# Merc 450SLC 15.2 NA 275.8 180 3.07 3.780 18.00 0 0 3 3
# Cadillac Fleetwood NA NA 472.0 205 2.93 5.250 17.98 0 0 3 4
# Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
# Chrysler Imperial NA 8 440.0 230 3.23 5.345 17.42 0 0 3 4
# Fiat 128 NA NA 78.7 66 4.08 2.200 19.47 1 1 4 1
# Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
# Toyota Corolla 33.9 NA NA 65 4.22 1.835 19.90 1 1 4 1
# Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
# Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
# AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
# Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
# Pontiac Firebird NA NA NA 175 3.08 3.845 17.05 0 0 3 2
# Fiat X1-9 27.3 NA 79.0 66 4.08 1.935 18.90 1 1 4 1
# Porsche 914-2 26.0 4 NA 91 4.43 2.140 16.70 0 1 5 2
# Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
# Ford Pantera L 15.8 8 NA 264 4.22 3.170 14.50 0 1 5 4
# Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
# Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
# Volvo 142E NA 4 121.0 109 4.11 2.780 18.60 1 1 4 2

sapply(out, function(z) sum(is.na(z)) / length(z))
# mpg cyl disp hp drat wt qsec vs am gear carb
# 0.3125 0.3125 0.3125 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
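The same pattern applies to any vector function with several arguments: the lambda receives one column at a time as .x, and the remaining arguments are supplied by name. A deterministic sketch, where rescale is a hypothetical helper introduced only for this example:

```r
library(dplyr)

# Hypothetical vector function with two extra arguments
rescale <- function(x, factor, offset = 0) {
  x * factor + offset
}

# .x is the current column; factor and offset are fixed for every column
rescaled <- mtcars %>%
  mutate(across(c(mpg, disp),
                ~ rescale(.x, factor = 2, offset = 1),
                .names = "{.col}_scaled"))
```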

Using mutate in custom function with mutation condition as argument

If your formula is always of the form original = do_something_original(), this may help (requires dplyr version >= 1.0):

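library(dplyr)
library(stringr)
library(gapminder)  # assumed source of the gapminder data set used below

update_mut <- function(df, mutation) {
  # The first word of the string is the name of the column to overwrite
  xx <- word(mutation, 1)
  df %>%
    mutate("{xx}" := eval(parse(text = mutation)))
}
update_mut(gapminder, "year = 2*year")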

# country continent year lifeExp pop gdpPercap
# <fct> <fct> <dbl> <dbl> <int> <dbl>
# 1 Afghanistan Asia 3904 28.8 8425333 779.
# 2 Afghanistan Asia 3914 30.3 9240934 821.
# 3 Afghanistan Asia 3924 32.0 10267083 853.
# 4 Afghanistan Asia 3934 34.0 11537966 836.
# 5 Afghanistan Asia 3944 36.1 13079460 740.
# 6 Afghanistan Asia 3954 38.4 14880372 786.
# 7 Afghanistan Asia 3964 39.9 12881816 978.
# 8 Afghanistan Asia 3974 40.8 13867957 852.
# 9 Afghanistan Asia 3984 41.7 16317921 649.
# 10 Afghanistan Asia 3994 41.8 22227415 635.
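An alternative to eval(parse(text = ...)) is to parse the right-hand side with rlang::parse_expr() and inject it with !!. This is only a sketch under the assumption that the target column and the expression can be passed as two separate strings; update_col is a hypothetical name, and mtcars is used so the example runs without extra packages.

```r
library(dplyr)

# col: name of the column to create or overwrite (string)
# rhs: expression to evaluate against the data (string)
update_col <- function(df, col, rhs) {
  df %>%
    mutate("{col}" := !!rlang::parse_expr(rhs))
}

doubled <- update_col(mtcars, "mpg", "2 * mpg")
```

Splitting the name from the expression avoids having to extract the first word of the string, at the cost of a slightly different call signature.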

Writing a custom function that works inside dplyr::mutate()

We can place the ... at the end of the argument list:

rowwise_sum <- function(data, na.rm = FALSE, ...) {
  columns <- rlang::enquos(...)
  data %>%
    select(!!!columns) %>%
    rowSums(na.rm = na.rm)
}

cars %>%
  mutate(sum = rowwise_sum(., na.rm = TRUE, speed, dist))
# A tibble: 50 x 3
# speed dist sum
# <dbl> <dbl> <dbl>
# 1 4 2 6
# 2 4 10 14
# 3 7 4 11
# 4 7 22 29
# 5 8 16 24
# 6 9 10 19
# 7 10 18 28
# 8 10 26 36
# 9 10 34 44
#10 11 17 28
# ... with 40 more rows

It would also work without changing the position of ... (though placing it last is generally recommended). The main issue here is that the data (which is .) was not passed in the argument list inside mutate.


It would be easier to put the whole flow inside the function instead of only a part of it:

rowwise_sum2 <- function(data, na.rm = FALSE, ...) {
  columns <- rlang::enquos(...)
  data %>%
    select(!!!columns) %>%
    # Use the na.rm argument rather than a hard-coded TRUE
    transmute(sum = rowSums(., na.rm = na.rm)) %>%
    bind_cols(data, .)
}

rowwise_sum2(cars, na.rm = TRUE, speed, dist)
# A tibble: 50 x 3
# speed dist sum
# <dbl> <dbl> <dbl>
# 1 4 2 6
# 2 4 10 14
# 3 7 4 11
# 4 7 22 29
# 5 8 16 24
# 6 9 10 19
# 7 10 18 28
# 8 10 26 36
# 9 10 34 44
#10 11 17 28
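The dots can also be forwarded straight to select() without capturing them with enquos first. The sketch below assumes an ungrouped data frame; add_row_sum is a hypothetical name, and na.rm is placed after the dots, so it has to be passed by name.

```r
library(dplyr)

# ... holds the columns to add up; select() accepts them directly
add_row_sum <- function(data, ..., na.rm = FALSE) {
  selected <- dplyr::select(data, ...)
  dplyr::mutate(data, sum = rowSums(selected, na.rm = na.rm))
}

summed <- cars %>%
  add_row_sum(speed, dist, na.rm = TRUE)
```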

