Programming-Safe Version of Subset - to Evaluate Its Condition While Called from Another Function

Just because it's such mind-bending fun (??), here is a slightly different solution that addresses a problem Hadley pointed to in comments to my accepted solution.

Hadley posted a gist demonstrating a situation in which my accepted function goes awry. The twist in that example (copied below) is that a symbol passed to SUBSET() is defined in the body (rather than the arguments) of one of the calling functions; it thus gets captured by substitute() instead of the intended global variable. Confusing stuff, I know.

f <- function() {
  cyl <- 4
  g()
}

g <- function() {
  SUBSET(mtcars, cyl == 4)$cyl
}
f()
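
To make the failure concrete, here is a minimal sketch of what goes wrong. It does not reproduce the earlier accepted function; it only mimics by hand the step in which the local value of cyl is substituted into the condition. The names intended and captured are mine, for illustration only.

```r
# Illustration only: substitute the local value of 'cyl' into the condition
# by hand, mimicking what happens when a calling function's body defines cyl.
intended <- quote(cyl == 4)                      # what the user meant
captured <- substitute(cyl == 4, list(cyl = 4))  # 'cyl' replaced by the value 4

# The intended condition is evaluated against the mtcars column 'cyl',
# so only the four-cylinder rows are kept.
n_intended <- nrow(mtcars[eval(intended, mtcars), ])

# The captured condition no longer mentions the column at all: it is the
# constant comparison 4 == 4, which is TRUE, so every row is kept.
deparse(captured)
n_captured <- nrow(mtcars[eval(captured, mtcars), ])

c(intended = n_intended, captured = n_captured)
```

The second count equals nrow(mtcars), which is why f() returns cyl values for every car rather than only the four-cylinder ones.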

Here is a better function that will only substitute the values of symbols found in calling functions' argument lists. It works in all of the situations that Hadley or I have so far proposed.

SUBSET <- function(`_dat`, expr) {
  ff <- sys.frames()
  n <- length(ff)
  ex <- substitute(expr)
  ii <- seq_len(n)
  for (i in ii) {
    ## In sys.function() and sys.call(), 'which' is the frame number;
    ## in sys.parent(), 'n' is the number of generations to go back.
    margs <- as.list(match.call(definition = sys.function(n - i),
                                call = sys.call(sys.parent(i))))[-1]
    ex <- eval(substitute(substitute(x, env = ll),
                          env = list(x = ex, ll = margs)))
  }
  `_dat`[eval(ex, envir = `_dat`), ]
}

## Works in Hadley's counterexample ...
f()
# [1] 4 4 4 4 4 4 4 4 4 4 4

## ... and in my original test cases.
sub <- function(x, condition) SUBSET(x, condition)
sub2 <- function(AA, BB) sub(AA, BB)

a <- SUBSET(mtcars, cyl == 4)  ## Direct call to SUBSET()
b <- sub(mtcars, cyl == 4)     ## SUBSET() called one level down
c <- sub2(mtcars, cyl == 4)    ## SUBSET() called two levels down
all(identical(a, b), identical(b, c))
# [1] TRUE

IMPORTANT: Please note that this still is not (nor can it be made into) a generally useful function. There's simply no way for the function to know which symbols you want it to use in all of the substitutions it performs as it works up the call stack. There are many situations in which users would want it to use the values of symbols assigned to within function bodies, but this function will always ignore those.

How to pass an expression through a function for the subset function to evaluate in R

What about:

lsubset <- function(x, ...) {
  lapply(x, subset, ...)
}

lsubset(mtlist, gear == 4)
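
Since mtlist is not defined in this answer, here is a hedged, self-contained sketch. It assumes mtlist is a list of data frames and builds a stand-in by splitting mtcars on cyl; that choice is mine, not the original poster's.

```r
# Hypothetical input: the original mtlist is not shown, so build a list of
# data frames by splitting mtcars on the number of cylinders.
mtlist <- split(mtcars, mtcars$cyl)

# The unevaluated condition travels through '...' down to subset().
lsubset <- function(x, ...) {
  lapply(x, subset, ...)
}

res <- lsubset(mtlist, gear == 4)

# Rows kept in each list element
sapply(res, nrow)
```

The condition is never evaluated in lsubset() itself; it is forwarded untouched via ... so that subset() can evaluate it inside each data frame.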

Declaring a function so that one of its arguments determines the stopping condition of a while loop (R)

You could use an expression, and evaluate it in the while condition:

my_function = function(stop_at, units) {
  starting_time <- Sys.time()  # assumed: the clock starts when the function is called
  i <- NA                      # stays NA unless we are counting iterations
  if (units == "iterations") {
    i <- 0
    condition = expression(i < stop_at)
  } else {
    condition = expression(difftime(Sys.time(), starting_time, units = units) < stop_at)
  }

  while (eval(condition)) {
    #Main Act#
    if (!is.na(i)) i <- i + 1
  }
}

Or one step further to handle i+1 in the condition:

my_function = function(stop_at, units) {
  starting_time <- Sys.time()  # assumed: the clock starts when the function is called
  if (units == "iterations") {
    i <- 0
    condition = expression({i <- i + 1; i < stop_at + 1})
  } else {
    condition = expression(difftime(Sys.time(), starting_time, units = units) < stop_at)
  }

  while (eval(condition)) {
    #Main Act#
  }
}
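
As a quick check of the idea, here is a stripped-down sketch of the iteration branch only. The function name count_until is hypothetical, and it returns the counter so the effect of the stored condition is visible.

```r
# Hypothetical, reduced version of the iteration branch: the stopping rule
# is stored unevaluated and re-evaluated on every pass of the loop.
count_until <- function(stop_at) {
  i <- 0
  condition <- expression(i < stop_at)  # not evaluated yet
  while (eval(condition)) {             # evaluated in this function's frame
    i <- i + 1
  }
  i
}

count_until(5)
```

Because eval() is called from inside the function, the expression sees the current local values of i and stop_at each time it is evaluated.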

Avoiding redundancy when selecting rows in a data frame

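You can use with(), which evaluates the condition inside the data frame so that its name does not have to be repeated for each column. For instance: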

sel.ID <- with(long_data_frame_name, col1==2 & col2<0.5 & col3>0.2)
selected <- long_data_frame_name[sel.ID, selected_columns]
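
For a runnable version of the same pattern, the sketch below uses the built-in mtcars in place of long_data_frame_name; the columns and thresholds are arbitrary choices of mine.

```r
# Built-in mtcars stands in for long_data_frame_name; columns are illustrative.
selected_columns <- c("mpg", "cyl", "wt")

# with() looks up cyl, mpg and wt inside mtcars, so the name appears once.
sel <- with(mtcars, cyl == 4 & mpg > 30 & wt < 2)
selected <- mtcars[sel, selected_columns]

# The more repetitive spelling selects exactly the same rows.
verbose <- mtcars[mtcars$cyl == 4 & mtcars$mpg > 30 & mtcars$wt < 2,
                  selected_columns]
identical(selected, verbose)
```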

Determine whether evaluation of an argument will fail due to non-existence

I like Josh O'Brien's approach. However, using f as defined there, the "non-existence" error condition can be triggered even when the passed name does in fact have an active binding, if it was created local to some function within the call stack (rather than in the global environment). Also, though perhaps not specifically relevant to Ben's intended usage, the error will be triggered if the argument is an expression, not just a name.

A simple fix is to tweak Josh's function to include an is.symbol test:

f <- function(d) {
  ## test here
  ff <- sys.frames()
  ex <- substitute(d)
  ii <- rev(seq_along(ff))
  for (i in ii) {
    ex <- eval(substitute(substitute(x, env = sys.frames()[[n]]),
                          env = list(x = ex, n = i)))
  }
  if (is.symbol(ex) && !exists(deparse(ex))) {
    stop("Substitute real error action here")
  }
  eval(d)
}

The desired check still works:

f2 <- function(ddd) {
  f(ddd)
}
f2(junk)
## Error in f(ddd) : Substitute real error action here

But the following two cases now pass through rather than yielding the error:

# case 1: argument to f is local to a calling function
f3 <- function() {
  notjunk <- 999
  f(notjunk)
}
f3()
## [1] 999

# case 2: argument to f is an expression
f2(5+5)
## [1] 10

What's happening in f is that after the repeated application of call-substitute-substitute, ex is set to the evaluated argument itself in case 1 above, and to the passed (albeit still unevaluated) call in case 2. In both cases, the exists test alone would fail because ex is not actually a name (aka symbol), but clearly non-existence is not a concern if we've been able to resolve beyond the name.
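
The behaviour of substitute() that this relies on can be seen in isolation. The sketch below is only an illustration; peek is a hypothetical helper, not part of f.

```r
# Hypothetical helper showing what substitute() returns inside a function:
# a function argument yields the caller's unevaluated expression, while an
# ordinary local variable yields its value.
peek <- function(arg) {
  local_val <- 999
  list(
    from_argument = substitute(arg),
    from_local    = substitute(local_val)
  )
}

res_call <- peek(5 + 5)  # an expression was passed
res_name <- peek(junk)   # a bare name was passed; it is never evaluated

is.call(res_call$from_argument)    # the call 5 + 5, still unevaluated
is.symbol(res_name$from_argument)  # the name junk
is.symbol(res_call$from_local)     # FALSE: already resolved to the value 999
```

Only the bare-name case leaves a symbol behind, which is why the is.symbol test is enough to separate it from the other two.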

What is the purpose of the dot before variables (i.e. .variables) in the R plyr package?

There may be two things going on that are confusing you.

One is the . function in the 'plyr' package. The . function quotes its arguments, so you can use a variable as a link, by name, rather than referring to the value(s) the variable contains. For instance, in some functions, we want to refer to the object x rather than the value(s) stored in x. The 'plyr' package gives a concise way of saying this: .(x). The 'plyr' functions themselves use this a lot, like so:

ddply(data, .(row_1), summarize, total=sum(row_1))

If we didn't use the . function, 'ddply' would complain, because 'row_1' contains many values, when we really just want to refer to the object.
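
To illustrate the difference between a name and its values without loading 'plyr', here is a rough base-R sketch. quote_vars is a hypothetical helper that loosely imitates a quoting function; it is not plyr's implementation of the . function.

```r
# Hypothetical helper: capture the arguments as unevaluated expressions.
# This only imitates the idea behind .(); it is not plyr's code.
quote_vars <- function(...) {
  as.list(substitute(list(...)))[-1]
}

row_1 <- c(10, 20, 30)       # the values stored in the variable
quoted <- quote_vars(row_1)  # a reference to the variable itself

quoted[[1]]                  # the name row_1, not its values
is.symbol(quoted[[1]])
eval(quoted[[1]])            # the values are looked up only on request
```

A function that receives the quoted name can decide later where to evaluate it, for example inside each piece of a split data frame.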

The other "." in action here is the way people use it as a character in the function arguments' names. I'm not sure what the origin is, but a lot of people seem to do it just to highlight which variables are function arguments and which variables are only part of the function's internal code. The "." is just another character, in this case.

How do I replace while loops with a functional programming alternative without tail call optimization?

An example in JavaScript

Here's an example using JavaScript. Currently, most browsers do not support tail call optimisation, so the following snippet fails once the recursion gets deep enough:

const repeat = n => f => x =>
  n === 0 ? x : repeat (n - 1) (f) (f(x))

console.log(repeat(1e3) (x => x + 1) (0)) // 1000

console.log(repeat(1e5) (x => x + 1) (0)) // Error: Uncaught RangeError: Maximum call stack size exceeded

