dplyr string as column reference
Here's an option that uses interp() from the lazyeval package, which came with your dplyr install. Inside your function(s), you'll need to use the standard evaluation version of the dplyr functions. In this case that would be mutate_(). (The underscore-suffixed verbs such as mutate_() have since been deprecated in dplyr, so treat this as a solution for older versions.)
Note that the new column position will be identical to the Cost column here because of how you've set up the grouping in machines. The second call to my_fun() shows it working on a different set of grouping variables.
library(dplyr)
library(lazyeval)
my_fun <- function(data, col) {
  mutate_(data, position = interp(~ cumsum(x), x = as.name(col)))
}
my_fun(machines, "Cost")
# Date Model.Num Cost position
# 1 1/31/2014 123 200 200
# 2 1/31/2014 456 300 300
# 3 2/28/2014 123 250 250
# 4 2/28/2014 456 350 350
# 5 3/31/2014 123 300 300
# 6 3/31/2014 456 400 400
## second example - different grouping
my_fun(group_by(machines, Model.Num), "Cost")
# Date Model.Num Cost position
# 1 1/31/2014 123 200 200
# 2 1/31/2014 456 300 300
# 3 2/28/2014 123 250 450
# 4 2/28/2014 456 350 650
# 5 3/31/2014 123 300 750
# 6 3/31/2014 456 400 1050
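For current dplyr versions, the same function can probably be sketched with the .data pronoun instead of lazyeval. The name my_fun2 is hypothetical, and the machines data frame is an assumption reconstructed from the printed output above. It is ungrouped here, so the grouping is applied explicitly in the call.

```r
library(dplyr)

# Assumption: data rebuilt from the printed output above (ungrouped)
machines <- data.frame(
  Date = rep(c("1/31/2014", "2/28/2014", "3/31/2014"), each = 2),
  Model.Num = rep(c(123, 456), times = 3),
  Cost = c(200, 300, 250, 350, 300, 400)
)

# Hypothetical variant of my_fun(): .data[[col]] looks up the column named by the string
my_fun2 <- function(data, col) {
  mutate(data, position = cumsum(.data[[col]]))
}

res <- my_fun2(group_by(machines, Model.Num), "Cost")
res$position
```

Because .data[[col]] is resolved inside the data frame, no extra package or symbol conversion is needed.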
In R, dplyr mutate referencing column names by string
We can convert the string to a symbol with sym() and evaluate it with !!:
library(dplyr)
mydf %>%
  mutate(newCol = !! rlang::sym(var1) + !! rlang::sym(var2))
Another option is to subset the column with the .data pronoun:
mydf %>%
  mutate(newCol = .data[[var1]] + .data[[var2]])
Or we may use rowSums():
mydf %>%
  # cur_data() is deprecated as of dplyr 1.1.0; pick(all_of(c(var1, var2))) replaces it there
  mutate(newCol = rowSums(select(cur_data(), all_of(c(var1, var2)))))
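The snippets above assume that mydf, var1 and var2 already exist. A minimal self-contained sketch, with a made-up mydf and made-up column names, suggests the symbol and .data approaches give the same result:

```r
library(dplyr)

# Hypothetical data and column names, for illustration only
mydf <- data.frame(a = 1:3, b = 4:6)
var1 <- "a"
var2 <- "b"

# Approach 1: turn each string into a symbol and unquote it
out_sym <- mydf %>%
  mutate(newCol = !!rlang::sym(var1) + !!rlang::sym(var2))

# Approach 2: look the columns up by name through the .data pronoun
out_data <- mydf %>%
  mutate(newCol = .data[[var1]] + .data[[var2]])

identical(out_sym$newCol, out_data$newCol)
```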
refer to column name from variable in across in dplyr
Making use of the .data pronoun from rlang, you could do:
library(dplyr)
m <- data.frame(x = 1:5, y = 11:15, z = 21:25)
denom <- "z"
m %>% mutate(across(
  x:z,
  list(~ log(.) - log(.data[[denom]]))
))
#> x y z x_1 y_1 z_1
#> 1 1 11 21 -3.044522 -0.6466272 0
#> 2 2 12 22 -2.397895 -0.6061358 0
#> 3 3 13 23 -2.036882 -0.5705449 0
#> 4 4 14 24 -1.791759 -0.5389965 0
#> 5 5 15 25 -1.609438 -0.5108256 0
Parsing string as column name in dplyr
I would use a named vector instead of trying to mess around with the dplyr programming nuances. A benefit is that this method is already vectorized.
rename_cols <- function(col) {
  name <- paste0(col, "_new")  # new names, one per input column
  mtcars %>%
    rename(setNames(col, name))  # named vector: names are new names, values are old names
}
rename_cols(colnames(mtcars))
# mpg_new cyl_new disp_new hp_new drat_new wt_new qsec_new vs_new am_new gear_new carb_new
# Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
# Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
# Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
# Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
# Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
# Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
# ...
Edit: In this case, you might also find rename_with() to be what you need.
library(dplyr)
cols <- colnames(mtcars)
mtcars %>%
  rename_with(~ paste0(., "_new"), any_of(cols))
# which is the same as the more concise but maybe less clear...
mtcars %>%
  rename_with(paste0, any_of(cols), "_new")
Pass a string as variable name in dplyr::filter
!! or UQ() evaluates the variable, so mtcars %>% filter(!!var == 4) is the same as mtcars %>% filter('cyl' == 4), where the condition always evaluates to FALSE. You can verify this by printing !!var inside filter():
mtcars %>% filter({ print(!!var); (!!var) == 4 })
# [1] "cyl"
# [1] mpg cyl disp hp drat wt qsec vs am gear carb
# <0 rows> (or 0-length row.names)
To evaluate var to the cyl column, you need to first convert var to the symbol cyl, then evaluate that symbol to a column.
Using rlang:
library(rlang)
var <- 'cyl'
mtcars %>% filter((!!sym(var)) == 4)
# mpg cyl disp hp drat wt qsec vs am gear carb
#1 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
#2 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
#3 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
# ...
Or use as.symbol()/as.name() from base R:
mtcars %>% filter((!!as.symbol(var)) == 4)
mtcars %>% filter((!!as.name(var)) == 4)
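As a hedged alternative to converting the string into a symbol, the .data pronoun can look the column up by name directly inside filter(). This sketch reuses var from above. The result name four_cyl is only for illustration.

```r
library(dplyr)

var <- "cyl"

# .data[[var]] refers to the column whose name is stored in var
four_cyl <- mtcars %>% filter(.data[[var]] == 4)

nrow(four_cyl)
```

This avoids the string-versus-symbol question altogether, since the lookup is by name rather than by unquoting.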
R dplyr operate on a column known only by its string name
If you have a column name in a string (aka character vector) and you want to use it with tidyeval, then you can convert it with rlang::sym(). Just change the filter() call to
dplyr::filter( mpg > !!rlang::sym(probColName) )
and it should work. This is taken from the recommendation at this github issue: https://github.com/tidyverse/rlang/issues/116
It's still fine to use
dplyr::summarize( !!probColName := quantile(mpg, pctCutoff) )
because when dynamically setting a parameter name, you just need the string and not an unquoted symbol.
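To see both uses side by side, here is a minimal sketch that uses the string once as a new column name (with :=) and once as a column reference (with sym()). The values given to probColName and pctCutoff are assumptions for illustration, and mutate() is used instead of summarize() so that the new column can be filtered on in the same pipeline.

```r
library(dplyr)

# Assumed values, for illustration only
probColName <- "mpg_cutoff"
pctCutoff <- 0.5

above <- mtcars %>%
  # Naming a column: the plain string is enough on the left of :=
  mutate(!!probColName := quantile(mpg, pctCutoff)) %>%
  # Referring to a column: the string must become a symbol first
  filter(mpg > !!rlang::sym(probColName))

head(above[, c("mpg", probColName)])
```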
Pass character string of column names (e.g. c(speed, dist ) to `across` function in R
You can't use substitute() or eval() on character vectors. You need to parse those character vectors into language objects; otherwise, when you eval() a string, you just get that string back. It's not like eval in other languages. One way to do the parsing is str2lang(). Then you can inject that expression into across() using tidy evaluation's !!. For example:
mtcars_2 %>%
mutate(across(.cols = !!str2lang(.$cols_to_modify),.fns = round))
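The example above depends on a mtcars_2 object that is not shown here. As a self-contained sketch of the same idea, assume the selection is held in a single string called cols_to_modify (a hypothetical variable, not a column). For comparison, the sketch also shows the plain character-vector route through all_of(), which needs no parsing at all.

```r
library(dplyr)

# Assumption: the selection arrives as one string of R code
cols_to_modify <- "c(mpg, wt)"

# Parse the string into a call, then inject it into across() with !!
rounded <- mtcars %>%
  mutate(across(.cols = !!str2lang(cols_to_modify), .fns = round))

# If the names are available as a character vector, all_of() is simpler
cols_chr <- c("mpg", "wt")
rounded2 <- mtcars %>%
  mutate(across(all_of(cols_chr), round))

head(rounded[, c("mpg", "wt", "hp")])
```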
Is it possible to name a column of a tibble using a variable containing a character vector (string)?
You can use the following solution:
- To use column names that are stored as strings, we apply the bang-bang operator !!, which forces evaluation of the name that follows it.
- We also need to use the walrus operator := instead of =. The two are equivalent, but := lets you supply the name programmatically (as with our variable names) on its LHS (left-hand side).
CLADE_FIELD = "Clade"
LINEAGE_FIELD = "Lineage"
metaDF = tibble(!!CLADE_FIELD := c("G"),
!!LINEAGE_FIELD := c("B.666"),
"Submission date" = c("2020-03"))
# A tibble: 1 x 3
Clade Lineage `Submission date`
<chr> <chr> <chr>
1 G B.666 2020-03
Or we can use double braces {{}} as follows:
metaDF = tibble({{CLADE_FIELD}} := c("G"),
{{LINEAGE_FIELD}} := c("B.666"),
"Submission date" = c("2020-03"))
# A tibble: 1 x 3
Clade Lineage `Submission date`
<chr> <chr> <chr>
1 G B.666 2020-03
Or we can make use of glue syntax: put the variable name within a pair of braces {} and pass the result as a string. Since glue syntax became available on the LHS of :=, whatever object (here your variable names) you put within the curly braces is evaluated as R code:
metaDF = tibble("{CLADE_FIELD}" := c("G"),
"{LINEAGE_FIELD}" := c("B.666"),
"Submission date" = c("2020-03"))
# A tibble: 1 x 3
Clade Lineage `Submission date`
<chr> <chr> <chr>
1 G B.666 2020-03
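The glue form also lets you build a new name around the stored one, which the other two forms do not do directly. A minimal sketch, assuming a made-up second row and a hypothetical _lower suffix:

```r
library(dplyr)

CLADE_FIELD <- "Clade"

# Name the column from the string, as above (the second value is made up)
metaDF <- tibble("{CLADE_FIELD}" := c("G", "GR"))

# Build a derived name by mixing literal text with the stored name
out <- metaDF %>%
  mutate("{CLADE_FIELD}_lower" := tolower(.data[[CLADE_FIELD]]))

names(out)
```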