Dynamic string in R
We can use paste:
Df <- sqlQuery(ch, paste("SELECT * FROM tblTest WHERE Id =", Id))
c concatenates into a vector; paste is for string concatenation.
Or we can use sprintf:
sprintf("SELECT * FROM tblTest WHERE Id = %s", Id)
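To see what the two calls actually produce, here is a minimal sketch that only builds the query strings and does not touch a database. The value 42 for Id and the Name column are assumptions made up for illustration; they are not part of the original answer.

```r
# Assumed example value; in the original, Id is defined elsewhere
Id <- 42

# paste() joins its arguments with a single space by default
q1 <- paste("SELECT * FROM tblTest WHERE Id =", Id)

# sprintf() substitutes the value at the %s placeholder
q2 <- sprintf("SELECT * FROM tblTest WHERE Id = %s", Id)

print(q1)
print(identical(q1, q2))

# Character values need their own quotes inside the SQL text
# (the Name column is hypothetical)
Name <- "abc"
q3 <- sprintf("SELECT * FROM tblTest WHERE Name = '%s'", Name)
print(q3)
```

Building SQL by pasting strings is fine for trusted values; for user-supplied input, a parameterized query is usually the safer choice.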
R connect string and dynamic variables
Try this:
e <- .GlobalEnv
i <- 1
xi.name <- paste0("x", i)
# assign
e[[xi.name]] <- 3
# add
e[[xi.name]] <- e[[xi.name]] + 1
# display
e[[xi.name]]
## [1] 4
Or, using assign and get, the above could be done like this:
i <- 1
xi.name <- paste0("x", i)
# assign
assign(xi.name, 3)
# add
assign(xi.name, get(xi.name) + 1)
# display
get(xi.name)
## [1] 4
Note that normally one does not generate dynamic variables but rather puts them into a list.
L <- list()
i <- 1
xi.name <- paste0("x", i)
# assign
L[[xi.name]] <- 3
# add
L[[xi.name]] <- L[[xi.name]] + 1
# display
L[[xi.name]]
## [1] 4
or simply:
L <- list()
i <- 1
# assign
L[[i]] <- 3
# add
L[[i]] <- L[[i]] + 1
# display
L[[i]]
## [1] 4
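The list approach pays off once there is more than one value. A small sketch, assuming three entries named x1 to x3 whose values are just the squares of the index (both choices are made up for illustration):

```r
# Collect several "dynamic variables" in one named list
L <- list()
for (i in 1:3) {
  xi.name <- paste0("x", i)
  L[[xi.name]] <- i^2
}

# The generated names are ordinary list names
print(names(L))

# Elements can be reached by a constructed name or with $
print(L[[paste0("x", 2)]])
print(L$x3)

# Operating on all of them at once needs no name juggling
doubled <- lapply(L, function(v) v * 2)
print(doubled$x3)
```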
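Note that both approaches also modify a variable that already exists under that name, first via the environment and then via assign and get: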
e <- .GlobalEnv
i <- 1
xi.name <- paste0("x", i)
x1 <- 3
e[[xi.name]] <- c(e[[xi.name]], 99)
x1
## [1] 3 99
e <- .GlobalEnv
i <- 1
xi.name <- paste0("x", i)
x1 <- 3
assign(xi.name, c(get(xi.name), 99))
x1
## [1] 3 99
R: Dynamically referencing and operating on variables in data frame
If you want to pass unquoted column names, you could use deparse and substitute like this:
add5 <- function(data, var) {
  out <- data[deparse(substitute(var))] + 5
  return(out)
}
add5(mydat, x)
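For a version that runs on its own, here is a minimal sketch with a small stand-in data frame. The contents of mydat are an assumption, since the original data is not shown, and the intermediate variable col is only there to make the captured name visible.

```r
# Stand-in data; the original mydat is not shown
mydat <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

add5 <- function(data, var) {
  # substitute() captures the unevaluated argument,
  # deparse() turns it into the string "x"
  col <- deparse(substitute(var))
  data[col] + 5
}

res <- add5(mydat, x)
print(res)
```

Because the column is selected with single brackets, the result is still a data frame with one column.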
Using dplyr and some non-standard evaluation with curly-curly ({{ }}), we can do:
library(dplyr)
library(rlang)
add5 <- function(data, var) {
  data %>% mutate(out = {{ var }} + 5)
}
add5(mydat, x)
# x out
#1 1.1604 6.16
#2 0.7002 5.70
#3 1.5868 6.59
#4 0.5585 5.56
#5 -1.2766 3.72
#6 -0.5733 4.43
#7 -1.2246 3.78
#8 -0.4734 4.53
#9 -0.6204 4.38
#10 0.0421 5.04
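If the column name arrives as a string rather than as an unquoted name, the .data pronoun is one way to handle it. This is a sketch, not part of the original answer; the function name add5_chr and the stand-in data are made up for illustration.

```r
library(dplyr)

# Hypothetical variant that takes the column name as a string
add5_chr <- function(data, var) {
  data %>% mutate(out = .data[[var]] + 5)
}

# Stand-in data; the original mydat is not shown
mydat <- data.frame(x = c(1, 2, 3))

res <- add5_chr(mydat, "x")
print(res)
```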
How can I dynamically build a string and pass it to dplyr's mutate() function in R?
Generating and evaluating the string
Q1 = 1, Q2 = 2, Q3 = 3, Q4 = 4 is not a string in the same way that "Q1 = 1, Q2 = 2, Q3 = 3, Q4 = 4" is a string: the first is code, the second is text. There are some R functions that will take a string object and evaluate it as code. For example:
eval(parse(text = "print('hello world')"))
#> [1] "hello world"
However, this may not play nicely inside dbplyr translation. If you manage to get something like this approach working, it would be good to see it posted as an answer.
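To make the string-versus-code distinction concrete, here is a sketch in plain R, outside of dplyr and dbplyr. It builds the text programmatically and evaluates it as the arguments of list(); n = 4 and the use of list() as the target function are assumptions chosen for illustration.

```r
n <- 4

# Build the text "Q1 = 1, Q2 = 2, ..." from the index
args_txt <- paste0("Q", seq_len(n), " = ", seq_len(n), collapse = ", ")
print(args_txt)

# Wrap it in a call and evaluate the text as code
code_txt <- paste0("list(", args_txt, ")")
res <- eval(parse(text = code_txt))
str(res)
```

This shows the mechanism only; it says nothing about whether the same trick survives dbplyr's SQL translation.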
Using a loop
Instead of doing it as a single string, an alternative is to use a loop:
db <- tbl(con, "mtcars") %>%
  select(carb) %>%
  distinct(carb) %>%
  arrange(carb)
# n is the number of columns to add (Q1 ... Qn)
for (i in seq_len(n)) {
  var <- paste0("Q", i)
  db <- db %>%
    mutate(!!sym(var) := i)
}
db <- collect(db)
The !!sym() is required to tell dplyr that you want the text argument treated as a variable. Lazy evaluation can give you odd results without it. The := assignment is required because the LHS needs to be evaluated.
This approach is roughly equivalent to one mutate statement for each variable (example below), but the dbplyr translation might not look as elegant as doing it all within a single mutate statement.
db <- tbl(con, "mtcars") %>%
select(carb) %>%
distinct(carb) %>%
arrange(carb) %>%
mutate(Q1 = 1) %>%
mutate(Q2 = 2) %>%
...
mutate(Qn = n) %>%
collect()
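One possible way to get everything into a single mutate call is to splice a named list with !!!. The sketch below runs on a local data frame, with n = 4 and the carb values assumed for illustration; whether dbplyr translates it as cleanly for a given backend is not verified here.

```r
library(dplyr)

n <- 4

# Named list of constants: Q1 = 1, Q2 = 2, ...
new_cols <- setNames(as.list(seq_len(n)), paste0("Q", seq_len(n)))

# Local stand-in for the remote table
local_df <- data.frame(carb = c(1, 2, 3))

# !!! splices each list element in as a separate named argument
out <- local_df %>% mutate(!!!new_cols)
print(out)
```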
Use dynamic (variable) string as regex pattern in R
If your words are indeed just separated by a space, I would split them into columns, convert to long format, and then run a binary join combined with by = .EACHI, e.g., using your data:
library(data.table)
library(magrittr)
DT_strings[, tstrsplit(string, " ", fixed = TRUE)] %>%
melt(., measure.vars = names(.), na.rm = TRUE) %>%
.[DT_words, on = .(value = word), .N, by = .EACHI]
# value N
# 1: word1 2
# 2: word2 3
# 3: word3 1
# 4: word4 0
P.S. I've used fixed = TRUE for speed, as I assumed there is always one space between each word. If the number of spaces varies, you'll need to use tstrsplit(string, "\\s+") instead, which will probably be slower.
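The difference between the two split patterns can be seen with base strsplit, which tstrsplit wraps. A small sketch with a made-up input string containing a double space:

```r
# Made-up input with two spaces between the first two words
s <- "word1  word2 word3"

# Literal single-space split: the double space yields an empty piece
fixed_split <- strsplit(s, " ", fixed = TRUE)[[1]]
print(fixed_split)

# Regex split on runs of whitespace: no empty piece
regex_split <- strsplit(s, "\\s+")[[1]]
print(regex_split)
```

The empty string from the fixed split would never match a word in the join, so it is harmless for counting but does add noise to the long table.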
How to dynamically name variables in formula in lm() function?
The core issue to understand here is that lm() takes an object of type formula as its first parameter, which specifies the regression.
You've created a vector of strings (characters), but R won't dynamically generate the formula for you in the function call. The ability to just type variable names as a formula is a convenience, but it is not practical when you are attempting to be dynamic.
To simplify your example, start with:
y1 <- (rnorm(n = 10, mean = 0, sd = 1))
x1 <- (rnorm(n = 10, mean = 0, sd = 1))
x2 <- (rnorm(n = 10, mean = 0, sd = 1))
x3 <- (rnorm(n = 10, mean = 0, sd = 1))
df <- as.data.frame(cbind(y1,x1,x2,x3))
predictors = c("x1", "x2", "x3")
Now you can dynamically create a formula as a concatenated string (paste0) and convert it to a formula. Then pass this formula to your lm() call:
form1 = as.formula(paste0("y1~", predictors[1]))
lm(form1, data = df)
As akrun pointed out, you can then start doing things like create loops to dynamically generate these.
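Such a loop could look like the sketch below, which fits one single-predictor model per entry. The data is regenerated here with an assumed seed so the block runs on its own, and the name fits is made up for illustration.

```r
set.seed(1)  # assumed seed, only for reproducibility of the sketch
df <- data.frame(
  y1 = rnorm(10),
  x1 = rnorm(10),
  x2 = rnorm(10),
  x3 = rnorm(10)
)
predictors <- c("x1", "x2", "x3")

# One model per predictor, each formula built from a string
fits <- lapply(predictors, function(p) {
  lm(as.formula(paste0("y1 ~ ", p)), data = df)
})
names(fits) <- predictors

# Each model has an intercept and one slope named after its predictor
print(sapply(fits, function(m) names(coef(m))[2]))
```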
You can also do things like:
my_formula = as.formula(paste0("y1~", paste0(predictors, collapse="+")))
## generates y1 ~ x1 + x2 + x3
lm(my_formula, data = df)
See also: Formula with dynamic number of variables
One of the answers on that page also mentions akrun's alternative way of doing this, using the function reformulate. From ?reformulate:
reformulate creates a formula from a character vector. If length(termlabels) > 1, its elements are concatenated with +. Non-syntactic names (e.g. containing spaces or special characters; see make.names) must be protected with backticks (see examples). A non-parseable response still works for now, back compatibly, with a deprecation warning.
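A minimal sketch of reformulate with the predictor names used above; the non-syntactic column name my var in the second call is made up to illustrate the backtick rule from the help text.

```r
predictors <- c("x1", "x2", "x3")

# Term labels are joined with +, the response goes on the left
f <- reformulate(predictors, response = "y1")
print(f)

# A name containing a space must be protected with backticks
f2 <- reformulate(c("x1", "`my var`"), response = "y1")
print(all.vars(f2))
```

The result is an ordinary formula object, so it can be passed to lm() in the same way as the as.formula version.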
Dynamically create and evaluate function in R
One option is to use get() to retrieve the appropriate function:
join <- function(JOINTYPE) {
  get(paste0(JOINTYPE, "_join"))
}
join("inner")(penguins, penguin_colors, by = "species")
If using rlang, the more appropriate function here is rlang::exec:
join2 <- function(JOINTYPE, ...) {
  rlang::exec(paste0(JOINTYPE, "_join"), ...)
}
join2("inner", penguins, penguin_colors, by = "species")
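The same lookup-by-name idea can be tried without any packages or data sets. The sketch below uses the base functions colSums and colMeans as stand-ins for the join functions; the wrapper name col_stat and the match.arg whitelist are additions for illustration, not part of the original answer.

```r
# Hypothetical wrapper: picks colSums or colMeans by name
col_stat <- function(x, stat = c("Sums", "Means")) {
  # match.arg() restricts the input to the listed choices
  stat <- match.arg(stat)
  # mode = "function" skips non-function objects of the same name
  f <- get(paste0("col", stat), mode = "function")
  f(x)
}

m <- matrix(1:6, nrow = 2)
sums <- col_stat(m, "Sums")
means <- col_stat(m, "Means")
print(sums)
print(means)
```

Restricting the allowed names up front means a typo fails with a clear error instead of fetching some unrelated object.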
How to get dynamic variable in string in r?
fn$ is discussed in Example 6 on the sqldf home page. Here is a self-contained, minimal reproducible example using the iris data frame that comes with R. (In the future, please ensure all code is minimal and reproducible and, in particular, includes all inputs.)
library(sqldf)
# retrieve records for specified Species and Petal.Length above minPetalLength
f <- function(Species, minPetalLength) {
fn$sqldf("SELECT *
FROM iris
WHERE Species = '$Species' and [Petal.Length] > $minPetalLength")
}
f("virginica", 6)
giving:
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 7.6 3.0 6.6 2.1 virginica
2 7.3 2.9 6.3 1.8 virginica
3 7.2 3.6 6.1 2.5 virginica
4 7.7 3.8 6.7 2.2 virginica
5 7.7 2.6 6.9 2.3 virginica
6 7.7 2.8 6.7 2.0 virginica
7 7.4 2.8 6.1 1.9 virginica
8 7.9 3.8 6.4 2.0 virginica
9 7.7 3.0 6.1 2.3 virginica