Dynamic "String" in R

Dynamic string in R

We can use paste:

Df <- sqlQuery(ch, paste("SELECT * FROM tblTest WHERE Id =", Id))

c() combines its arguments into a vector; paste() is for string concatenation.
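To make the difference concrete, here is a minimal sketch. The value of Id is made up for illustration; only the query text comes from the example above.

```r
Id <- 42  # hypothetical value for illustration

# c() returns a character vector with two elements
parts <- c("SELECT * FROM tblTest WHERE Id =", Id)
length(parts)

# paste() returns a single string, inserting sep = " " by default
query <- paste("SELECT * FROM tblTest WHERE Id =", Id)
query

# paste0() is paste() with sep = ""
paste0("x", 1)
```

Passing parts to sqlQuery would hand it two separate strings rather than one query, which is why paste is the right tool here.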

Or we can use sprintf:

sprintf("SELECT * FROM tblTest WHERE Id = %s", Id)
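A short sketch of how the format specifiers behave may help. The values and the Name column are assumptions for illustration and are not part of the original question.

```r
Id <- 7L             # hypothetical integer key
name <- "virginica"  # hypothetical character value

# %d formats an integer; %s formats any value as a string
q1 <- sprintf("SELECT * FROM tblTest WHERE Id = %d", Id)

# character values need their own quotes in the SQL text
# (Name is an assumed column, used only for this sketch)
q2 <- sprintf("SELECT * FROM tblTest WHERE Name = '%s'", name)

# sprintf() is vectorized: one query string per element of the input
q3 <- sprintf("SELECT * FROM tblTest WHERE Id = %s", 1:3)
length(q3)
```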

R connect string and dynamic variables

Try this:

e <- .GlobalEnv
i <- 1
xi.name <- paste0("x", i)

# assign
e[[xi.name]] <- 3

# add
e[[xi.name]] <- e[[xi.name]] + 1

# display
e[[xi.name]]
## [1] 4

Or, using assign and get, the above could be done like this:

i <- 1
xi.name <- paste0("x", i)

# assign
assign(xi.name, 3)

# add
assign(xi.name, get(xi.name) + 1)

# display
get(xi.name)
## [1] 4

Note that normally one does not generate dynamic variables but rather puts them into a list.

L <- list()
i <- 1
xi.name <- paste0("x", i)

# assign
L[[xi.name]] <- 3

# add
L[[xi.name]] <- L[[xi.name]] + 1

# display
L[[xi.name]]
## [1] 4

or simply:

L <- list()
i <- 1

# assign
L[[i]] <- 3

# add
L[[i]] <- L[[i]] + 1

# display
L[[i]]
## [1] 4
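The list approach also scales naturally to several values. The following is a minimal sketch with made-up values (i * 10) that builds x1, x2, x3 as list elements instead of separate variables:

```r
# build a named list instead of creating variables x1, x2, x3
L <- list()
for (i in 1:3) {
  L[[paste0("x", i)]] <- i * 10  # values are arbitrary, for illustration
}

names(L)   # "x1" "x2" "x3"
L[["x2"]]  # 20

# the same list without an explicit loop
L2 <- setNames(lapply(1:3, function(i) i * 10), paste0("x", 1:3))
```

Keeping the values in one list means they can be passed around, looped over with lapply, or converted to a data frame without any calls to get or assign.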

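Note that both approaches modify the existing variable x1 itself, for example when appending a value to it: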

e <- .GlobalEnv
i <- 1
xi.name <- paste0("x", i)
x1 <- 3

e[[xi.name]] <- c(e[[xi.name]], 99)
x1
## [1] 3 99

e <- .GlobalEnv
i <- 1
xi.name <- paste0("x", i)
x1 <- 3
assign(xi.name, c(get(xi.name), 99))
x1
## [1] 3 99

R: Dynamically referencing and operating on variables in data frame

If you want to pass unquoted column names, you could use deparse(substitute()) like this:

add5 <- function(data, var){
  out <- data[deparse(substitute(var))] + 5
  return(out)
}

add5(mydat,x)
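Since mydat is not shown, here is a self-contained sketch with a made-up data frame. The second helper, add5_chr, is a hypothetical variant that takes the column name as a string, which avoids capturing the argument at all.

```r
# hypothetical data frame standing in for mydat
mydat <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# unquoted column name, captured with deparse(substitute())
add5 <- function(data, var) {
  col <- deparse(substitute(var))  # "x" when called as add5(mydat, x)
  data[col] + 5                    # returns a one-column data frame
}

# column name passed as a string; no capture needed
add5_chr <- function(data, var) {
  data[[var]] + 5                  # returns a plain numeric vector
}

add5(mydat, x)
add5_chr(mydat, "y")
```

The string version is usually easier to call from loops or other functions, because the column name can itself be stored in a variable.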

Using dplyr and some non-standard evaluation with curly-curly ({{ }}), we can do:

library(dplyr)
library(rlang)

add5 <- function(data, var){
  data %>% mutate(out = {{var}} + 5)
}

add5(mydat, x)
# x out
#1 1.1604 6.16
#2 0.7002 5.70
#3 1.5868 6.59
#4 0.5585 5.56
#5 -1.2766 3.72
#6 -0.5733 4.43
#7 -1.2246 3.78
#8 -0.4734 4.53
#9 -0.6204 4.38
#10 0.0421 5.04

How can I dynamically build a string and pass it to dplyr's mutate() function in R?

Generating and evaluating the string

Q1 = 1, Q2 = 2, Q3 = 3, Q4 = 4 is not a string in the same way that "Q1 = 1, Q2 = 2, Q3 = 3, Q4 = 4" is a string. There are some R functions that will take a string object and evaluate it as code. For example:

eval(parse(text="print('hello world')"))

#> [1] "hello world"

However, this may not play nicely inside dbplyr translation. If you manage to get something like this approach working it would be good to see it posted as an answer.
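As a rough illustration of the difference between the text and the code it represents, the sketch below builds the string from the question and evaluates it in base R by wrapping it in a list() call. The use of list() is an assumption made to keep the example independent of dplyr and of any database.

```r
n <- 4  # number of Q columns, as in the question's example

# the arguments as text: "Q1 = 1, Q2 = 2, Q3 = 3, Q4 = 4"
args_string <- paste0("Q", 1:n, " = ", 1:n, collapse = ", ")

# wrap the text in a call and evaluate it as code
res_eval <- eval(parse(text = paste0("list(", args_string, ")")))

# the same list built without parsing any text
res_plain <- setNames(as.list(as.numeric(1:n)), paste0("Q", 1:n))
```

Both results are the same named list, which suggests that building the names and values directly is often simpler than generating code as text.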

Using a loop

Instead of doing it as a single string, an alternative is to use a loop:

db <- tbl(con, "mtcars") %>%
  select(carb) %>%
  distinct(carb) %>%
  arrange(carb)

# n is the number of columns to add (4 in the question's example)
for (i in 1:n) {
  var <- paste0("Q", i)
  db <- db %>%
    mutate(!!sym(var) := i)
}

db <- collect(db)

The !!sym() is required to tell dplyr that you want the text argument treated as a variable. Lazy evaluation can give you odd results without it. The := assignment is required because the LHS needs to be evaluated.

This approach is roughly equivalent to one mutate statement for each variable (example below), but the dbplyr translation might not look as elegant as doing it all within a single mutate statement.

db <- tbl(con, "mtcars") %>%
  select(carb) %>%
  distinct(carb) %>%
  arrange(carb) %>%
  mutate(Q1 = 1) %>%
  mutate(Q2 = 2) %>%
  ...
  mutate(Qn = n) %>%
  collect()
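For completeness, one way to get everything into a single mutate statement is to splice a named list with !!!. This is only a sketch on a local data frame standing in for the remote table; whether dbplyr translates it as cleanly has not been checked here.

```r
library(dplyr)

# local data frame standing in for the remote table (assumption)
db <- data.frame(carb = c(1, 2, 3))
n <- 4

# named list of values: Q1 = 1, ..., Q4 = 4
new_cols <- setNames(as.list(1:n), paste0("Q", 1:n))

# !!! splices the list into mutate() as if each element were typed out
out <- db %>% mutate(!!!new_cols)
names(out)
```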

Use dynamic (variable) string as regex pattern in R

If your words are indeed separated by just a space, I would split them into columns, convert to long format, and then run a binary join combined with by = .EACHI. For example, using your data:

library(data.table)
library(magrittr)
DT_strings[, tstrsplit(string, " ", fixed = TRUE)] %>%
melt(., measure.vars = names(.), na.rm = TRUE) %>%
.[DT_words, on = .(value = word), .N, by = .EACHI]
# value N
# 1: word1 2
# 2: word2 3
# 3: word3 1
# 4: word4 0

P.S.

I've used fixed = TRUE for speed, as I assumed there is always exactly one space between words. If the number of spaces varies, you'll need to use tstrsplit(string, "\\s+") instead, which will probably be slower.
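The effect of the two split patterns can be seen with base R's strsplit(), which tstrsplit wraps. The input string below is made up for illustration.

```r
x <- "word1  word2 word3"  # note the double space after word1

# fixed = TRUE splits on every single space, leaving an empty piece
fixed_split <- strsplit(x, " ", fixed = TRUE)[[1]]
length(fixed_split)

# the regex "\\s+" treats a run of whitespace as one separator
regex_split <- strsplit(x, "\\s+")[[1]]
length(regex_split)
```

With the fixed pattern, the empty piece would end up as a spurious "word" in the long format, so the regex version is the safer choice when spacing is irregular.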

How to dynamically name variables in formula in lm() function?

The core issue to understand here is that lm() takes an object of class formula as its first parameter, which specifies the regression.

You've created a vector of strings (characters), but R won't dynamically generate the formula for you in the function call. Being able to type variable names directly as a formula is a convenience, but it is not practical when you are trying to be dynamic.

To simplify your example, start with:

y1 <- (rnorm(n = 10, mean = 0, sd = 1))
x1 <- (rnorm(n = 10, mean = 0, sd = 1))
x2 <- (rnorm(n = 10, mean = 0, sd = 1))
x3 <- (rnorm(n = 10, mean = 0, sd = 1))

df <- as.data.frame(cbind(y1,x1,x2,x3))

predictors = c("x1", "x2", "x3")

Now you can dynamically create a formula as a concatenated string (paste0) and convert it to a formula. Then pass this formula to your lm() call:

form1 = as.formula(paste0("y1~", predictors[1]))

lm(form1, data = df)

As akrun pointed out, you can then start doing things like creating loops to dynamically generate these.
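The loop idea could look something like the sketch below, which fits one single-predictor model per entry of predictors. The seed and the use of lapply are choices made here for illustration, not part of the original answer.

```r
set.seed(1)  # seed chosen only to make the sketch reproducible
df <- data.frame(y1 = rnorm(10), x1 = rnorm(10),
                 x2 = rnorm(10), x3 = rnorm(10))
predictors <- c("x1", "x2", "x3")

# fit one single-predictor model per element of predictors
models <- lapply(predictors, function(p) {
  lm(as.formula(paste0("y1 ~ ", p)), data = df)
})
names(models) <- predictors

# coefficients of the model y1 ~ x2
coef(models[["x2"]])
```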

You can also do things like:

my_formula = as.formula(paste0("y1~", paste0(predictors, collapse="+")))

## generates y1 ~ x1 + x2 + x3
lm(my_formula, data = df)

See also: Formula with dynamic number of variables

One of the answers on that page also mentions akrun's alternative way of doing this, using the function reformulate. From ?reformulate:

reformulate creates a formula from a character vector. If length(termlabels) > 1, its elements are concatenated with +. Non-syntactic names (e.g. containing spaces or special characters; see make.names) must be protected with backticks (see examples). A non-parseable response still works for now, back compatibly, with a deprecation warning.
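A minimal sketch of reformulate with the predictors vector from above. The name "my var" is a made-up non-syntactic name used to show the backtick requirement.

```r
predictors <- c("x1", "x2", "x3")

# build y1 ~ x1 + x2 + x3 without pasting strings by hand
f <- reformulate(predictors, response = "y1")
f

# a non-syntactic name must be wrapped in backticks
g <- reformulate(c("x1", "`my var`"), response = "y1")
all.vars(g)
```

The resulting formula can be passed to lm() in the same way as the one created with as.formula.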

Dynamically create and evaluate function in R

One option is to use get() to retrieve the appropriate function:

join <- function(JOINTYPE) {
  get(paste0(JOINTYPE, "_join"))
}

join("inner")(penguins, penguin_colors, by="species")

If using rlang, the more appropriate function here is rlang::exec:

join2 <- function(JOINTYPE, ...) {
  rlang::exec(paste0(JOINTYPE, "_join"), ...)
}

join2("inner", penguins, penguin_colors, by="species")
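The same pattern can be tried without dplyr or the penguins data. In the sketch below, the helpers case_fun and case_apply are hypothetical names; they pick between base R's toupper() and tolower(), and do.call() plays a role similar to rlang::exec.

```r
# look up a function by a name assembled at run time;
# "upper" / "lower" select base R's toupper() / tolower()
case_fun <- function(kind) {
  kind <- match.arg(kind, c("upper", "lower"))  # reject unexpected names
  get(paste0("to", kind), mode = "function")
}
case_fun("upper")("abc")

# do.call() accepts the function name as a string
case_apply <- function(kind, ...) {
  do.call(paste0("to", kind), list(...))
}
case_apply("lower", "ABC")
```

Validating the name with match.arg before the lookup keeps a typo from silently resolving to some unrelated function.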

How to get dynamic variable in string in r?

fn$ is discussed in Example 6 on the sqldf home page. Here is a self-contained, minimal reproducible example using the iris data frame that comes with R. (In the future, please ensure all code is minimal and reproducible and, in particular, includes all inputs.)

library(sqldf)

# retrieve records for specified Species and Petal.Length above minPetalLength
f <- function(Species, minPetalLength) {
  fn$sqldf("SELECT *
            FROM iris
            WHERE Species = '$Species' and [Petal.Length] > $minPetalLength")
}

f("virginica", 6)

giving:

  Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
1          7.6         3.0          6.6         2.1 virginica
2          7.3         2.9          6.3         1.8 virginica
3          7.2         3.6          6.1         2.5 virginica
4          7.7         3.8          6.7         2.2 virginica
5          7.7         2.6          6.9         2.3 virginica
6          7.7         2.8          6.7         2.0 virginica
7          7.4         2.8          6.1         1.9 virginica
8          7.9         3.8          6.4         2.0 virginica
9          7.7         3.0          6.1         2.3 virginica

