Dplyr - Mutate Dynamically Named Variables Using Other Dynamically Named Variables

Use dynamic name for new column/variable in `dplyr`

Since you are dynamically building a variable name as a character value, it makes more sense to assign with standard data.frame indexing ([[ ]]), which accepts character values for column names. For example:

multipetal <- function(df, n) {
  varname <- paste("petal", n, sep = ".")
  df[[varname]] <- with(df, Petal.Width * n)
  df
}
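As a quick check of the base-indexing version, here is a small sketch that applies it in a loop to the built-in iris data. The loop range 2:3 and the copy name iris2 are illustrative choices, not part of the original answer.

```r
# Base R only: build the column name as a string and assign with [[ ]]
multipetal <- function(df, n) {
  varname <- paste("petal", n, sep = ".")
  df[[varname]] <- with(df, Petal.Width * n)
  df
}

iris2 <- iris                    # work on a copy of the built-in data
for (i in 2:3) {
  iris2 <- multipetal(iris2, i)  # adds petal.2, then petal.3
}
names(iris2)
```

Because the function returns the modified data frame, the result has to be assigned back on each iteration.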

The mutate function makes it very easy to name new columns via named parameters. But that assumes you know the name when you type the command. If you want to dynamically specify the column name, then you need to also build the named argument.



dplyr version >= 1.0

With dplyr 1.0 or later you can use glue-style syntax in the name on the left-hand side of :=. The {} in the name string is replaced by the value of the expression inside it.

multipetal <- function(df, n) {
  mutate(df, "petal.{n}" := Petal.Width * n)
}
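To see what the glue-style name expands to, a minimal sketch (assuming dplyr 1.0 or later is installed; head(iris, 3) is just a small illustrative input):

```r
library(dplyr)

multipetal <- function(df, n) {
  # "{n}" is replaced by the value of n when the column name is built
  mutate(df, "petal.{n}" := Petal.Width * n)
}

out <- multipetal(head(iris, 3), 2)
names(out)  # the new column is named after the value of n
```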

If you pass an unquoted column name to your function, you can use {{ }} both inside the name string and in the expression that refers to the column:

meanofcol <- function(df, col) {
  mutate(df, "Mean of {{col}}" := mean({{col}}))
}
meanofcol(iris, Petal.Width)
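If the column name arrives as a character string instead of a bare name, one possible variant combines a single-brace "{col}" in the name with the .data pronoun in the expression. The function name meanofcol_chr is hypothetical and only used for this sketch.

```r
library(dplyr)

# Variant for a column name passed as a string (hypothetical helper)
meanofcol_chr <- function(df, col) {
  # "{col}" inserts the string into the new name;
  # .data[[col]] looks the column up by that string
  mutate(df, "Mean of {col}" := mean(.data[[col]]))
}

out <- meanofcol_chr(iris, "Petal.Width")
tail(names(out), 1)
```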



dplyr version >= 0.7

dplyr starting with version 0.7 allows you to use := to dynamically assign parameter names. You can write your function as:

# --- dplyr version 0.7+ ---
multipetal <- function(df, n) {
  varname <- paste("petal", n, sep = ".")
  mutate(df, !!varname := Petal.Width * n)
}

For more information, see the documentation available from vignette("programming", "dplyr").



dplyr (>=0.3 & <0.7)

Slightly earlier versions of dplyr (>=0.3, <0.7) encouraged the use of "standard evaluation" alternatives to many of the functions. See the Non-standard evaluation vignette for more information (vignette("nse")).

So here, the answer is to use mutate_() rather than mutate() and do:

# --- dplyr version 0.3-0.5 ---
multipetal <- function(df, n) {
  varname <- paste("petal", n, sep = ".")
  varval <- lazyeval::interp(~Petal.Width * n, n = n)
  mutate_(df, .dots = setNames(list(varval), varname))
}


dplyr < 0.3

Note that this is also possible in older versions of dplyr that existed when the question was originally posed. It requires careful use of quote and setNames:

# --- dplyr versions < 0.3 ---
multipetal <- function(df, n) {
  varname <- paste("petal", n, sep = ".")
  pp <- c(quote(df), setNames(list(quote(Petal.Width * n)), varname))
  do.call("mutate", pp)
}

Dplyr - Mutate dynamically named variables using other dynamically named variables

Here, we don't need enquo/quo_name for 'year' because we are passing a numeric value. The output of paste is a character string; using sym from rlang (as @joran mentioned), it can be converted to a symbol and evaluated with !!. Make sure to put parentheses around '!!calc1_header' and '!!calc2_header' so that each one is unquoted on its own.

my_fun <- function(df, year) {

  total_header <- paste("total", year, sep = "_")
  calc1_header <- rlang::sym(paste("value1", year, sep = "_"))
  calc2_header <- rlang::sym(paste("value2", year, sep = "_"))

  df %>%
    mutate(!!total_header := multiplier * (!!calc1_header) + (!!calc2_header))

}

my_fun(df1, 2016)
# ID multiplier value1_2015 value2_2015 value1_2016 value2_2016 total_2016
#1 1 0.5 2 3 1 4 4.5
#2 2 1.0 2 4 4 5 9.0
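The input df1 is not shown here, so the sketch below rebuilds it from the printed output (an assumption about the original data) and reruns the same function to make the example reproducible.

```r
library(dplyr)

# df1 reconstructed from the printed output above (assumed input)
df1 <- data.frame(
  ID          = 1:2,
  multiplier  = c(0.5, 1.0),
  value1_2015 = c(2, 2),
  value2_2015 = c(3, 4),
  value1_2016 = c(1, 4),
  value2_2016 = c(4, 5)
)

my_fun <- function(df, year) {
  total_header <- paste("total", year, sep = "_")
  # strings -> symbols, so they can be unquoted as column references
  calc1_header <- rlang::sym(paste("value1", year, sep = "_"))
  calc2_header <- rlang::sym(paste("value2", year, sep = "_"))

  df %>%
    mutate(!!total_header := multiplier * (!!calc1_header) + (!!calc2_header))
}

res <- my_fun(df1, 2016)
res$total_2016
```

Calling my_fun(df1, 2015) with the same data would use the 2015 columns instead and create total_2015.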

Mutate a dynamic column name with conditions using other dynamic column names

Use get() to retrieve the column values instead:

library(tidyverse)

d <- mtcars %>% tibble
fld_name <- "mpg"
other_fld_name <- "cyl"

d %>% mutate(!!fld_name := ifelse(get(other_fld_name) < 5 ,NA, get(fld_name)))

#> # A tibble: 32 x 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
#> 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
#> 3 NA 4 108 93 3.85 2.32 18.6 1 1 4 1
#> 4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
#> 5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
#> 6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
#> 7 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4
#> 8 NA 4 147. 62 3.69 3.19 20 1 0 4 2
#> 9 NA 4 141. 95 3.92 3.15 22.9 1 0 4 2
#> 10 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4
#> # ... with 22 more rows

Created on 2021-06-22 by the reprex package (v2.0.0)
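An alternative to get() is the .data pronoun, which only looks inside the data frame and raises an error if the column is missing, whereas get() can fall back to an object of the same name in the calling environment. A minimal sketch on mtcars (using a plain data frame rather than a tibble to keep it dependency-light):

```r
library(dplyr)

fld_name <- "mpg"
other_fld_name <- "cyl"

res <- mtcars %>%
  mutate(
    # "{fld_name}" builds the target name; .data[[...]] reads columns by string
    "{fld_name}" := ifelse(.data[[other_fld_name]] < 5, NA, .data[[fld_name]])
  )

head(res[, c("mpg", "cyl")])
```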

Dynamic variable names to mutate variables in for-loop

As we are passing a string, convert it to a symbol and evaluate it with !!:

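func <- function(i) {
  # i holds the column name as a string, so use sym() to turn its value into a
  # symbol; ensym(i) would capture the argument expression (the symbol `i`
  # when called as func(i) in the loop), not the column name
  col <- rlang::sym(i)

  mutate(df1, !!i := case_when(!is.na(!!col) ~ as.character(!!col),
                               is.na(!!col) & var0 != '1' ~ '4444',
                               TRUE ~ '0'))
}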

Testing:

for(i in vars) {
df1 <- func(i)
}
df1
var0 var1 var2 var3
1 1 0 1 NA
2 2 1 4444 1
3 2 0 0 0
4 1 1 0 4444
5 1 0 1 1
6 2 4444 4444 NA
7 2 4444 4444 1

We may do this with across as well

df1 %>%
  mutate(across(all_of(vars),
                ~ case_when(!is.na(.) ~ as.character(.),
                            is.na(.) & var0 != '1' ~ '4444',
                            TRUE ~ '0')))
var0 var1 var2 var3
1 1 0 1 NA
2 2 1 4444 1
3 2 0 0 0
4 1 1 0 4444
5 1 0 1 1
6 2 4444 4444 NA
7 2 4444 4444 1
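Since df1 and vars are not shown, here is a self-contained sketch of the across() version on made-up data. The three-row data frame and the choice vars <- c("var1", "var2") are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical input; the original df1 is not shown
df1 <- data.frame(
  var0 = c("1", "2", "1"),
  var1 = c(0, NA, NA),
  var2 = c(1, NA, 3)
)
vars <- c("var1", "var2")   # columns to recode, given as strings

res <- df1 %>%
  mutate(across(all_of(vars),
                ~ case_when(!is.na(.x) ~ as.character(.x),      # keep existing values
                            is.na(.x) & var0 != "1" ~ "4444",   # missing, var0 not "1"
                            TRUE ~ "0")))                       # missing, var0 is "1"
res
```

The across() form avoids the explicit loop and the repeated reassignment of df1.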

R mutate across and using two dynamically named columns to calculate result

  1. You're missing the ~ to mark the ifelse(..) as a function of sorts.
  2. cur_col() not found (for me), should likely be . or .x
  3. You are str_detecting in the name of the _Kenn-equivalent column, not the values in that column; we need to add cur_data()[[..]] as well.

I tend to not use stringr for straight-forward replacements like this, preferring base R:

library(dplyr)
Test %>%
  mutate(
    across(
      paste0(Param, "_Konz"),
      ~ if_else( grepl("[XF]", cur_data()[[ gsub("_Konz", "_Kenn", cur_column()) ]] ),
                 .[NA], . )
    )
  )
# # A tibble: 6 x 5
# Date HCl_Konz HCl_Kenn CO_Konz CO_Kenn
# <dbl> <dbl> <chr> <dbl> <chr>
# 1 1 4 "" 4 ""
# 2 2 5 "" 1 ""
# 3 3 NA "X" NA "BX"
# 4 4 5 "" 4 ""
# 5 5 NA "F" 4 ""
# 6 6 5 "" NA "EXr"

I recommend dplyr::if_else in place of ifelse for several reasons, but it comes with the strict (and safe!) requirement that the true= and false= arguments be precisely the same type. You recognize at least most of this by your use of NA_real_; my use of .[NA] is another way of ensuring that we get the correct NA-variant based on the actual data, allowing this method to work if some of your Params are integer and some are numeric, for example.

An alternative approach (which may help later) is to pivot the data and work with just two columns at a time.

library(tidyr) # pivot_longer
Test %>%
  pivot_longer(
    matches("_(Konz|Kenn)$"),
    names_pattern = "(.*)_(.*)", names_to = c("elem", ".value")
  ) %>%
  mutate(
    Konz = if_else(grepl("[XF]", Kenn), Konz[NA], Konz)
  )
# # A tibble: 12 x 4
# Date elem Konz Kenn
# <dbl> <chr> <dbl> <chr>
# 1 1 HCl 4 ""
# 2 1 CO 4 ""
# 3 2 HCl 5 ""
# 4 2 CO 1 ""
# 5 3 HCl NA "X"
# 6 3 CO NA "BX"
# 7 4 HCl 5 ""
# 8 4 CO 4 ""
# 9 5 HCl NA "F"
# 10 5 CO 4 ""
# 11 6 HCl 5 ""
# 12 6 CO NA "EXr"

This pivoted format has the advantage of allowing simpler calls to mutate, and (if you plan on plotting this) playing much better with ggplot2's preference for long data.

Dynamically name a new variable / column within a custom function using dplyr mutate and paste

We may pass the arguments unquoted and use {{ }} both in the new column name and in the expressions that refer to the columns:

my_fun <- function(dataf, V1, V2) {
  dataf %>%
    dplyr::mutate("{{V1}}_{{V2}}" := paste0(format({{V1}}, big.mark = ","),
                                            '\n(', format({{V2}}, big.mark = ","), ')'))
}

Testing:

my_fun(df, speed1, n1)
string speed1 speed2 n1 n2 speed1_n1
1 car 7886.962 3218.585 37 83 7,886.962\n(37)
2 train 9534.978 5524.649 98 34 9,534.978\n(98)
3 bike 6984.790 9476.838 60 55 6,984.790\n(60)
4 plain 6543.198 2638.609 9 53 6,543.198\n( 9)

R/dplyr: Mutate based on multiple dynamic variable names

Great question. Below is a base R solution. I am sure it can be adapted to a tidyverse solution (e.g., with purrr::map2()). Here I built a function that does a basic test and then used it with lapply(). Note: the answer is tailored for your example, so you'll need to adapt it if you have different column names for the value / units. Hope this helps!!

val_by_unit <- function(data) {

  # Sort columns by name so each area column lines up with its unit column
  df <- data[order(names(data))]

  # Selecting columns for values and units
  val <- df[endsWith(names(df), "area")]
  unit <- df[endsWith(names(df), "unit")]

  # Check that every area column has a matching "<area>_unit" column
  if (!all(names(val) == sub("_unit", "", names(unit)))) {
    stop("Not all areas have a corresponding unit")
  }

  # Multiplying corresponding columns
  output <- Map(`*`, val, unit)

  # Renaming output and adding columns
  data[paste0(names(output), "_ha")] <- output
  data
}

Results:

lapply(ab_list, val_by_unit)

$a
# A tibble: 3 x 7
a1_area a2_area_unit a2_area a1_area_unit abc a1_area_ha a2_area_ha
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 1 1 1 1 1
2 1 1 1 0.5 2 0.5 1
3 1 0.5 1 0.5 3 0.5 0.5

$b
# A tibble: 3 x 7
b1_area b1_area_unit b2_area b2_area_unit abc b1_area_ha b2_area_ha
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 1 1 1 1 1
2 1 1 1 0.5 2 1 0.5
3 1 0.5 1 0.5 3 0.5 0.5
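For a single data frame, a tidyverse-flavoured adaptation could look like the sketch below. It is not the answer's method, just one possible way to do it with glue-style names and the .data pronoun. The data frame a is rebuilt from the printed $a output above, and the "_area$" pattern assumes the same naming scheme as in the question.

```r
library(dplyr)

# Element "a" rebuilt from the printed output above (assumed input)
a <- data.frame(
  a1_area      = c(1, 1, 1),
  a2_area_unit = c(1, 1, 0.5),
  a2_area      = c(1, 1, 1),
  a1_area_unit = c(1, 0.5, 0.5),
  abc          = c(1, 2, 3)
)

# Every column ending in "_area" is assumed to have a matching "<name>_unit"
area_cols <- grep("_area$", names(a), value = TRUE)

for (col in area_cols) {
  unit_col <- paste0(col, "_unit")
  # New name built from the string; both inputs looked up by string
  a <- mutate(a, "{col}_ha" := .data[[col]] * .data[[unit_col]])
}
a
```

If a unit column is missing, .data[[unit_col]] raises an error, which plays roughly the same role as the explicit stop() check in the base R function.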

