How to create an R function programmatically?
This is an expansion on the discussion here.
Our three pieces need to be an argument list, a body and an environment.
For the environment, we will simply use env = parent.frame()
by default.
We do not really want a regular old list for the arguments, so instead we use alist
which has some different behavior:
"...values are not evaluated, and tagged arguments with no value are allowed"
args <- alist(a = 1, b = 2)
For the body, we quote
our expression to get a call
:
body <- quote(a + b)
One option is to convert args
to a pairlist and then simply call the function function
using eval
:
make_function1 <- function(args, body, env = parent.frame()) {
args <- as.pairlist(args)
eval(call("function", args, body), env)
}
Another option is to create an empty function, and then fill it with the desired values:
make_function2 <- function(args, body, env = parent.frame()) {
f <- function() {}
formals(f) <- args
body(f) <- body
environment(f) <- env
f
}
A third option is to simply use as.function
:
make_function3 <- function(args, body, env = parent.frame()) {
as.function(c(args, body), env)
}
And finally, this seems very similar to the first method to me, except
we are using a somewhat different idiom to create the function call, usingsubstitute
rather than call
:
make_function4 <- function(args, body, env = parent.frame()) {
subs <- list(args = as.pairlist(args), body = body)
eval(substitute(`function`(args, body), subs), env)
}
library(microbenchmark)
microbenchmark(
make_function1(args, body),
make_function2(args, body),
make_function3(args, body),
make_function4(args, body),
function(a = 1, b = 2) a + b
)
Unit: nanoseconds
expr min lq median uq max
1 function(a = 1, b = 2) a + b 187 273.5 309.0 363.0 673
2 make_function1(args, body) 4123 4729.5 5236.0 5864.0 13449
3 make_function2(args, body) 50695 52296.0 53423.0 54782.5 147062
4 make_function3(args, body) 8427 8992.0 9618.5 9957.0 14857
5 make_function4(args, body) 5339 6089.5 6867.5 7301.5 55137
R: programmatically create a function call
Update 4.29.17 -- Soon to be released dplyr 0.6.0 will address these issues. New answers have been added to questions below. For more on programming with dplyr, see this vignette.
You have the right idea. You may be able to shorten the code a bit with ?filter_
and the dots argument ...
:
select_data <- function(x, ...) {
kall <- list(...)
filter_(.data=x, paste0(names(kall), "==", unlist(kall), collapse="&"))
}
select_data(DF, A = 2, C = 2)
# A B C
# 1 2 2 2
update
Programming with dplyr
can be very challenging even for intermediate coders. The author has confessed that the advantages of non-standard evaluation come with the cost of difficulty with respect to functional programming. There are a number of SO users running into the same problem:
standard evaluation in dplyr
dplyr function does not work
Using dplyr functions within another function
Major dplyr functions in a function
Pass arguments to dplyr functions
dplyr: filter where two columns in data.frame are equal
Steps have been taken to solve these problems. There is a vignette to outline the basic fixes. In my humble opinion though, the vignette is deficient of explanation of functional programming. There isn't one function written as an example. Nor does it address any confounding examples that typically show up. Hopefully, with the increase in calls to fix NSE, we may eventually get a sufficient response.
As a final example of the mess that non-standard evaluation can cause in programming, I tried to work on a solution for this user for awhile to no avail. It is simply asking to use summarise
programmatically:
Sub-function in grouping function using dplyr
How to programmatically create and assign variables inside an R function
In what follows I've commented out the Quandl
related stuff just to test the save
part of the code. So one of the arguments passed on to the function is a dummy argument. For your real use case remove the comment characters and it will do what you need. A file is created in the working directory and the test dataset data
is saved under the argument Quandl_FRED_identifier = "x"
and returned by the function.
retrieve_FRED_data <-
function(dataID = Quandl_FRED_Identifier, filename = assigned_name) {
# require(Quandl)
# Quandl.api_key("xxxxxxxxxxxx")
# data <- Quandl(paste(c("FRED/", dataID), collapse = ""))
assign(dataID, data)
save(list = dataID, file = paste0(filename, ".Rdata"))
return(get(dataID))
}
data <- data.frame(X = 1:5, Y = 6:10)
Quandl_FRED_Identifier <- "x" # name of the object (df) to be saved
assigned_name <- "so" # so = stackoverflow, obviously
retrieve_FRED_data(Quandl_FRED_Identifier, assigned_name)
# X Y
#1 1 6
#2 2 7
#3 3 8
#4 4 9
#5 5 10
Note also that you can use paste0(filename, ".RData")
which is simpler but I haven't changed that part, I only did the changes strictly needed to make the function save the file.
EDIT.
There was a missing argument list
to save
. It seems to be working correctly now.
x
# X Y
#1 1 6
#2 2 7
#3 3 8
#4 4 9
#5 5 10
rm(x)
load("so.RData")
x
# X Y
#1 1 6
#2 2 7
#3 3 8
#4 4 9
#5 5 10
EDIT 2.
The function is creating files with objects named according to the argument passed to it. Considering the mess in the comments, I will repeat part of the code above, namely, the first call to the function. It will create a file so.RData
with a df named x
. The second call creates a file so2.RData
with df y
. Then, I remove both x
and y
. When the files are load
ed the df's are in the globalenv
as expected.
data <- data.frame(X = 1:5, Y = 6:10)
Quandl_FRED_Identifier <- "x" #
assigned_name <- "so" # so = stackoverflow, obviously
retrieve_FRED_data(Quandl_FRED_Identifier, assigned_name)
set.seed(1)
data <- data.frame(A = rnorm(5), B = rnorm(5))
Quandl_FRED_Identifier <- "y" #
assigned_name <- "so2" # so = stackoverflow, obviously
retrieve_FRED_data(Quandl_FRED_Identifier, assigned_name)
rm(x, y)
load("so.RData")
load("so2.RData")
x
y
Create an R function programmatically with non-fixed body
Since you have a string, you need to parse that string. You can do
assign("Function_1",
as.function(c(args, parse(text=body1)[[1]])),
env = parent.frame())
Though I would strongly discourage the use of assign
for filling your global environment with a bunch of variables with indexes in their name. In general that makes things much tougher to program with. It would be much easier to collect all your functions in a list. For example
funs <- lapply(1:dim(A.m)[1], function(i) {
body <- ""
for(l in 1:dim(A.m)[2]) {
body <- paste0(body,"A.m[",i,",",l,"]","*x[",l,"]+")
}
body <- substr(body,1,nchar(body)-1)
body <- parse(text=body)[[1]]
as.function(c(alist(x=), body), env=parent.frame())
})
And then you can call the different functions by extracting them with [[]]
funs[[1]](x=c(1,1,1))
# [1] 9
funs[[2]](x=c(1,1,1))
# [1] 4
Or you can ever call all the functions with an lapply
lapply(funs, function(f, ...) f(...), x=c(1,1,1))
# [[1]]
# [1] 9
# [[2]]
# [1] 4
# [[3]]
# [1] 6
Although if this is actually what your function is doing, there are easier ways to do this in R using matrix multiplication %*%
. Your Function_1 is the same as A.m[1,] %*% c(1,1,1)
. You could make a generator funciton like
colmult <- function(mat, row) {
function(x) {
as.numeric(mat[row,] %*% x)
}
}
And then create the same list of functions with
funs <- lapply(1:3, function(i) colmult(A.m, i))
Then you don't need any string building or parsing which tends to be error prone.
Automatic creation and use of custom made function in R
You can use a combination of eval
and parse
function to call (evaluate) any string as code. First, you have to construct such a string. For this, you can specify your indexes as character strings. For example: IDX1 = "a + b"
. Then, you can get that value by name with get("IDX1")
.
Try this code:
# Your preparations
a <- 1
b <- 2
c <- 5
d <- 8
IDX1 <- "a + b"
IDX2 <- "c + b"
IDX3 <- "d + b - c"
IDX4 <- "a + d + b + c"
head = c("iter", "IndexA", "IndexB")
r1 = c(1, "IDX1", "IDX2")
r2 = c(2, "IDX3", "IDX4")
r3 = c(3, "IDX1", "IDX4")
df = as.data.frame(rbind(r1, r2, r3))
colnames(df) <- head
# Loop over df with apply
result <- apply(df, 1, function(x){
# Construct call string for further evaluation
IndexA_call <- paste("ia <- ", get(x[2]), sep = "")
IndexB_call <- paste("ib <- ", get(x[3]), sep = "")
# eval each call string
eval(parse(text = IndexA_call))
eval(parse(text = IndexB_call))
x[2:3] <- c(ia, ib)
return(as.numeric(x))
})
result <- t(result)
colnames(result) <- head
print(result)
This gives:
iter IndexA IndexB
r1 1 3 7
r2 2 5 16
r3 3 3 16
R - Creating a function from a string
You should generally avoid parsing arbitrary string input from users. Usually, this can be avoided by appropriate software design.
Anyway, just extract the language from the expression:
make_function(alist(a = 1, b = 2), parse(text = 'a^b')[[1]])
Edit:
Just to show how you can check against a whitelist (no regex involved):
whitelist <- c("+", "*", "-", "/", "^", "**", "%%", "%/%", "sin", "cos", "tan", "abs") #etc.
expr <- parse(text = "cos(x)^sin(x)*abs(x)")
foo <- function(e) if (length(e) > 1) lapply(as.list(e), foo) else return(e)
funs <- unlist(foo(expr[[1]]))
funs <- funs[vapply(funs, function(x) {x <- eval(x); is.function(x) | is.primitive(x)}, FUN.VALUE = TRUE)]
all(vapply(funs, function(x) as.character(x) %in% whitelist, FUN.VALUE = TRUE))
Parsing and evaluating arbitrary code in a public shiny app is a security risk. This check ensures that only a predefined set of functions can be used. If you don't shy away from smaller risks, you could just use a blacklist instead (prohibiting functions like system
, system2
, shell
etc.).
How do I call the `function` function?
This is because function
is a special primitive:
typeof(`function`)
#> [1] "special"
The arguments are not evaluated, so you have actually passed quote(formals(mean))
instead of the value of formals(mean)
. I don't think there's a way of calling function
directly without evaluation tricks, except with an empty formals list which is just NULL
.
Related Topics
Pasting Two Vectors With Combinations of All Vectors' Elements
How to Use Facets With a Dual Y-Axis Ggplot
Dplyr::Select Function Clashes With Mass::Select
Applying a Function to Every Row of a Table Using Dplyr
Read All Files in a Folder and Apply a Function to Each Data Frame
R Conditional Evaluation When Using the Pipe Operator %≫%
Gradient of N Colors Ranging from Color 1 and Color 2
Proper/Fastest Way to Reshape a Data.Table
R.Exe, Rcmd.Exe, Rscript.Exe and Rterm.Exe: What's the Difference
Dplyr Join on By=(A = B), Where a and B Are Variables Containing Strings
How to Use Grep()/Gsub() to Find Exact Match
How to Tell What Is in One Vector and Not Another
Merging Two Data Frames Using Fuzzy/Approximate String Matching in R
Intelligent Point Label Placement in R
How to Quickly Form Groups (Quartiles, Deciles, etc) by Ordering Column(S) in a Data Frame
Use a Value from the Previous Row in an R Data.Table Calculation