Dplyr String as Column Reference

dplyr string as column reference

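Here's an option that uses interp() from the lazyeval package, which was installed alongside older versions of dplyr. Inside your function(s), you'll need to use the standard-evaluation version of the dplyr functions; in this case that is mutate_(). Note that mutate_() and the other underscore verbs have been deprecated since dplyr 0.7.0, so this approach is mainly relevant to legacy code.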

Note that the new column position will be identical to the Cost column here because of how you've set up the grouping in machines. The second call to my_fun() shows it working on a different set of grouping variables.

library(dplyr)
library(lazyeval)

my_fun <- function(data, col) {
    mutate_(data, position = interp(~ cumsum(x), x = as.name(col)))
}

my_fun(machines, "Cost")
#        Date Model.Num Cost position
# 1 1/31/2014       123  200      200
# 2 1/31/2014       456  300      300
# 3 2/28/2014       123  250      250
# 4 2/28/2014       456  350      350
# 5 3/31/2014       123  300      300
# 6 3/31/2014       456  400      400

## second example - different grouping
my_fun(group_by(machines, Model.Num), "Cost")
#        Date Model.Num Cost position
# 1 1/31/2014       123  200      200
# 2 1/31/2014       456  300      300
# 3 2/28/2014       123  250      450
# 4 2/28/2014       456  350      650
# 5 3/31/2014       123  300      750
# 6 3/31/2014       456  400     1050
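
For current dplyr versions, the same function can be sketched with the .data pronoun instead of interp() and mutate_(). This is only an illustration: machines_demo is rebuilt from the printed output above (the original machines object is not shown), and my_fun_data is a hypothetical name.

```r
library(dplyr)

# Assumption: rebuilt from the printed output; the original `machines` is not shown.
machines_demo <- data.frame(
  Date      = rep(c("1/31/2014", "2/28/2014", "3/31/2014"), each = 2),
  Model.Num = rep(c(123, 456), times = 3),
  Cost      = c(200, 300, 250, 350, 300, 400)
)

# Hypothetical counterpart of my_fun(): .data[[col]] looks the column up
# by its string name, so no lazyeval or underscore verb is needed.
my_fun_data <- function(data, col) {
  mutate(data, position = cumsum(.data[[col]]))
}

# Same grouping as the second example above: cumulative cost per model.
res <- my_fun_data(group_by(machines_demo, Model.Num), "Cost")
res$position
```

With this grouping the position column should match the second output above.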

In R, dplyr mutate referencing column names by string

We can convert the string to a symbol and evaluate it with !!:

library(dplyr)
mydf %>% 
  mutate(newCol = !! rlang::sym(var1) + !! rlang::sym(var2))

Another option is to subset the column with the .data pronoun:

mydf %>%
   mutate(newCol = .data[[var1]] + .data[[var2]])

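Or use rowSums():

# pick() replaces cur_data(), which is deprecated as of dplyr 1.1.0;
# on older versions use select(cur_data(), all_of(c(var1, var2))) instead
mydf %>% 
   mutate(newCol = rowSums(pick(all_of(c(var1, var2)))))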
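
To make the snippets above easier to try, here is a small self-contained sketch. The data frame mydf and the values of var1 and var2 are made up for illustration; the original question's data is not shown.

```r
library(dplyr)

# Hypothetical data and column names, not from the original question.
mydf <- data.frame(a = 1:3, b = 4:6)
var1 <- "a"
var2 <- "b"

# Route 1: turn each string into a symbol and unquote it.
# The parentheses are optional here; they just make the precedence explicit.
out_sym <- mydf %>%
  mutate(newCol = (!!rlang::sym(var1)) + (!!rlang::sym(var2)))

# Route 2: subset the data mask with the .data pronoun.
out_data <- mydf %>%
  mutate(newCol = .data[[var1]] + .data[[var2]])

identical(out_sym$newCol, out_data$newCol)
```

Both routes should give the same newCol; the .data form tends to be easier to read when the column name arrives as a function argument.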

refer to column name from variable in across in dplyr

Using the .data pronoun from rlang, you could do:

library(dplyr)

m <- data.frame(x = 1:5, y = 11:15, z = 21:25)
denom <- "z"

m %>% mutate(across(
  x:z,
  list(~ log(.) - log(.data[[denom]]))
))
#>   x  y  z       x_1        y_1 z_1
#> 1 1 11 21 -3.044522 -0.6466272   0
#> 2 2 12 22 -2.397895 -0.6061358   0
#> 3 3 13 23 -2.036882 -0.5705449   0
#> 4 4 14 24 -1.791759 -0.5389965   0
#> 5 5 15 25 -1.609438 -0.5108256   0
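
If the automatic _1 suffixes are not wanted, one possible variation is to pass the lambda directly and set the output names with the .names argument of across(). The "{.col}_logratio" pattern and the narrower x:y selection are choices made here for illustration, not part of the original answer.

```r
library(dplyr)

m <- data.frame(x = 1:5, y = 11:15, z = 21:25)
denom <- "z"

out <- m %>%
  mutate(across(
    x:y,                              # leave the denominator column itself alone
    ~ log(.x) - log(.data[[denom]]),  # .data[[denom]] resolves the string to column z
    .names = "{.col}_logratio"        # assumed naming pattern
  ))

names(out)
```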

Parsing string as column name in dplyr

I would use a named vector instead of trying to mess around with the dplyr programming nuances. A benefit is that this method is already vectorized.

rename_cols <- function(col) {
  
  name <- paste0(col, "_new")  # new names, one per element of col
  
  mtcars %>% 
    rename(setNames(col, name))
}

rename_cols(colnames(mtcars))
#                     mpg_new cyl_new disp_new hp_new drat_new wt_new qsec_new vs_new am_new gear_new carb_new
# Mazda RX4              21.0       6    160.0    110     3.90  2.620    16.46      0      1        4        4
# Mazda RX4 Wag          21.0       6    160.0    110     3.90  2.875    17.02      0      1        4        4
# Datsun 710             22.8       4    108.0     93     3.85  2.320    18.61      1      1        4        1
# Hornet 4 Drive         21.4       6    258.0    110     3.08  3.215    19.44      1      0        3        1
# Hornet Sportabout      18.7       8    360.0    175     3.15  3.440    17.02      0      0        3        2
# Valiant                18.1       6    225.0    105     2.76  3.460    20.22      1      0        3        1
# ...

Edit

In this case, you might also find rename_with() to be what you need.

library(dplyr)

colnames(mtcars) -> cols

mtcars %>% 
  rename_with(~ paste0(., "_new"), any_of(cols))

# which is the same as the more concise but maybe less clear...
mtcars %>% 
  rename_with(paste0, any_of(cols), "_new")
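
The named-vector idea can also be combined with all_of(), which makes it explicit that the names come from an external character vector. A minimal sketch, where cols and lookup are illustrative names and only two columns are renamed:

```r
library(dplyr)

cols <- c("mpg", "cyl")

# Names are the new column names, values are the existing ones.
lookup <- setNames(cols, paste0(cols, "_new"))

renamed <- mtcars %>%
  rename(all_of(lookup))

names(renamed)[1:2]
```

Because all_of() is strict, this errors if any of the old names is missing from the data; any_of() would silently skip them instead.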

Pass a string as variable name in dplyr::filter

!! (or UQ()) evaluates the variable, so mtcars %>% filter(!!var == 4) is the same as mtcars %>% filter('cyl' == 4), where the condition always evaluates to FALSE. You can verify this by printing !!var inside filter() (with var <- 'cyl'):

mtcars %>% filter({ print(!!var); (!!var) == 4 })
# [1] "cyl"
#  [1] mpg  cyl  disp hp   drat wt   qsec vs   am   gear carb
# <0 rows> (or 0-length row.names)

To make var refer to the cyl column, first convert the string to the symbol cyl, then unquote that symbol so that it is evaluated as a column:

Using rlang:

library(rlang)
var <- 'cyl'
mtcars %>% filter((!!sym(var)) == 4)

#    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#1  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
#2  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
#3  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
# ...

Or use as.symbol()/as.name() from base R:

mtcars %>% filter((!!as.symbol(var)) == 4)

mtcars %>% filter((!!as.name(var)) == 4)
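
As a further alternative that avoids symbols altogether, the .data pronoun can look the column up by its string name. The sketch below also shows why the bare string comparison matches nothing; literal_cmp and four_cyl are names introduced here for illustration.

```r
library(dplyr)

var <- "cyl"

# Comparing the literal string to 4 is a single FALSE, so filter() keeps no rows.
literal_cmp <- "cyl" == 4

# .data[[var]] resolves the string to the cyl column inside filter().
four_cyl <- mtcars %>%
  filter(.data[[var]] == 4)

nrow(four_cyl)
```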

R dplyr operate on a column known only by its string name

If you have a column name in a string (i.e. a character vector) and you want to use it with tidy evaluation, you can convert it with rlang::sym(). Just change the filter call to

dplyr::filter( mpg > !!rlang::sym(probColName) )

and it should work. This follows the recommendation in this GitHub issue: https://github.com/tidyverse/rlang/issues/116

It's still fine to use

dplyr::summarize( !!probColName := quantile(mpg, pctCutoff) )

because when dynamically setting a parameter name, you just need the string and not an unquoted symbol.
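
Putting the two pieces together, a rough end-to-end sketch might look like the following. The values of probColName and pctCutoff are invented for the example, and mutate() is used instead of summarize() so that the cutoff stays alongside each row; the original pipeline is not shown in full.

```r
library(dplyr)

probColName <- "mpg_cutoff"  # hypothetical column name
pctCutoff   <- 0.5           # hypothetical quantile

above <- mtcars %>%
  # Left-hand side of := only needs the string, so !! on the string is enough.
  mutate(!!probColName := quantile(mpg, pctCutoff)) %>%
  # To read the column back, the string has to become a symbol first.
  filter(mpg > !!rlang::sym(probColName))

head(above[, c("mpg", probColName)])
```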

Pass character string of column names (e.g. "c(speed, dist)") to `across` function in R

You can't use substitute() or eval() on character vectors directly. You first need to parse those character vectors into language objects; otherwise, when you eval() a string, you just get that string back. It's not like eval in other languages. One way to do the parsing is str2lang(). Then you can inject that expression into across() using tidy evaluation's !!. For example:

mtcars_2 %>% 
  mutate(across(.cols = !!str2lang(.$cols_to_modify),.fns = round))
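
When the column names are available as a plain character vector rather than as a string of code, all_of() may be simpler than parsing. The sketch below uses the built-in cars data set and made-up variable names (cols_to_modify, cols_string) to show both routes side by side.

```r
library(dplyr)

# Route 1: a character vector of names, selected with all_of().
cols_to_modify <- c("speed", "dist")
halved <- cars %>%
  mutate(across(all_of(cols_to_modify), ~ .x / 2))

# Route 2: a string containing R code, parsed with str2lang() and injected with !!.
cols_string <- "c(speed, dist)"
halved_parsed <- cars %>%
  mutate(across(!!str2lang(cols_string), ~ .x / 2))

all.equal(halved, halved_parsed)
```

Route 1 avoids evaluating arbitrary text as code, which is usually the safer choice when the names come from user input.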

Is it possible to name a column of a tibble using a variable containing a character vector (string)?

You can use the following solution:

To use column names that are stored as strings, unquote them with the bang-bang operator !!, which forces evaluation of the name that follows it.
The walrus operator := must be used instead of =. The two are otherwise equivalent, but := allows an unquoted expression (here, our variable name) on its left-hand side (LHS).

CLADE_FIELD = "Clade"
LINEAGE_FIELD = "Lineage"

metaDF = tibble(!!CLADE_FIELD := c("G"), 
                !!LINEAGE_FIELD := c("B.666"), 
                "Submission date" = c("2020-03"))

# A tibble: 1 x 3
  Clade Lineage `Submission date`
  <chr> <chr>   <chr>            
1 G     B.666   2020-03

Or we can use double braces {{}} as follows:

metaDF = tibble({{CLADE_FIELD}} := c("G"), 
                {{LINEAGE_FIELD}} := c("B.666"), 
                "Submission date" = c("2020-03"))

# A tibble: 1 x 3
  Clade Lineage `Submission date`
  <chr> <chr>   <chr>            
1 G     B.666   2020-03

Or we can use glue syntax: put the variable name inside a pair of braces {} and pass the whole thing as a string. When glue syntax is used on the LHS of :=, whatever is inside the curly braces (here, your variable names) is evaluated as R code:

metaDF = tibble("{CLADE_FIELD}" := c("G"), 
                "{LINEAGE_FIELD}" := c("B.666"), 
                "Submission date" = c("2020-03"))

# A tibble: 1 x 3
  Clade Lineage `Submission date`
  <chr> <chr>   <chr>            
1 G     B.666   2020-03
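
One practical advantage of the glue form is that the name can be composed from several pieces. A small sketch under assumed values (the suffix variable and the second data row are invented here):

```r
library(dplyr)

CLADE_FIELD <- "Clade"
suffix <- "len"  # hypothetical suffix

# Build the column from the string with !! and :=, as in the examples above.
metaDF <- tibble(!!CLADE_FIELD := c("G", "GH"))

# Glue syntax on the LHS of := assembles the new name "Clade_len".
out <- metaDF %>%
  mutate("{CLADE_FIELD}_{suffix}" := nchar(.data[[CLADE_FIELD]]))

names(out)
```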
