Can You More Clearly Explain Lazy Evaluation in R Function Operators

When you do

what_is_love <- function(f) {
  function(...) {
    cat('f is', f, '\n')
  }
}

the inner function creates a closure over f, but the catch is that until you actually use a variable passed to a function, it remains a "promise" and is not actually evaluated. If you want to "capture" the current value of f, you need to force the evaluation of the promise; you can use the force() function for this.

what_is_love <- function(f) {
  force(f)
  function(...) {
    cat('f is', f, '\n')
  }
}
funs <- lapply(c('love', 'cherry'), what_is_love)

funs[[1]]()
# f is love
funs[[2]]()
# f is cherry

Without force(), f remains a promise inside both of the functions in your list. It is not evaluated until you call one of the functions, and at that point the promise is evaluated to the last known value of f, which is "cherry".

As @MartinMorgan pointed out, this behavior changed in R 3.2.0. From the release notes:

Higher order functions such as the apply functions and Reduce()
now force arguments to the functions they apply in order to
eliminate undesirable interactions between lazy evaluation and
variable capture in closures. This resolves PR#16093.
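
The change only affects the higher-order functions, so the same capture problem can still be reproduced with a plain variable. A minimal sketch, where make_getter and make_getter_forced are hypothetical helpers introduced only for illustration:

```r
# Hypothetical helpers, not part of the original answer.
make_getter <- function(f) {
  function() f            # f is still an unevaluated promise here
}

make_getter_forced <- function(f) {
  force(f)                # evaluate the promise right away
  function() f
}

word <- "love"
lazy   <- make_getter(word)
forced <- make_getter_forced(word)
word <- "cherry"          # change the variable before either closure is called

lazy()    # "cherry": the promise is evaluated only now
forced()  # "love": the value was captured when the closure was created
```

The lazy version looks up word at the time of the first call, which is why it sees the later value.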

Explain a lazy evaluation quirk

The goal of:

adders <- lapply(1:10, function(x)  add(x) )

is to create a list of adder functions: the first adds 1 to its input, the second adds 2, and so on. Each closure is created immediately, but because of lazy evaluation its argument x stays an unevaluated promise until the closure is first called. The problem is that after the first adder is created, x is advanced by the lapply loop, ending at a value of 10. When you finally call the first adder, the promise is evaluated, and the value behind x is no longer 1 but the value at the end of the lapply loop, i.e. 10.

Therefore, every adder evaluates its promise only after the lapply loop has completed, and they all end up with the same value, i.e. 10. The solution Hadley suggests is to force x to be evaluated when the closure is created, which avoids lazy evaluation and gives each function its own, correct x.
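
The fix described above can be sketched as follows. The definition of add is an assumption here, since it is not shown in the question; it follows the usual closure pattern of returning function(y) x + y:

```r
# Assumed definition of add(); force(x) evaluates the promise
# at the moment the closure is created.
add <- function(x) {
  force(x)
  function(y) x + y
}

adders <- lapply(1:10, add)

adders[[1]](10)   # 11
adders[[10]](10)  # 20
```

With force() in place the result does not depend on whether lapply itself forces its arguments.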

Lazy evaluation of `which` function arguments?

With if, there can be efficiency gains as a result of that behavior. It is documented to work that way, and I don't think it is due to lazy evaluation. Even if you force()-ed that expression, it would still only evaluate a series of &&'s until it reached a single FALSE. See this help page:

?Logic

@XuWang probably deserves the credit for emphasizing the difference between "&" and "&&". The "&" operator works on vectors and returns vectors. The "&&" operator acts on scalars (actually vectors of length 1) and returns a vector of length 1. When offered a vector of length > 1 on either side, it will work only on the first value of each and emit a warning (since R 4.3.0 this is an error instead). It is only the "&&" version that does what is being called "lazy" evaluation. You can see that the "&" operator is not acting in a "lazy" fashion with a simple test:

fn1 <- function(x) print(x)
fn2 <- function(x) print(x)
x1 <- sample(c(TRUE, FALSE), 10, replace=TRUE)

fn1(x1) & fn2(x1) # the first two printed lines show that both sides are evaluated, regardless of the first value
# [1] FALSE FALSE TRUE FALSE TRUE TRUE FALSE FALSE FALSE FALSE
# [1] FALSE FALSE TRUE FALSE TRUE TRUE FALSE FALSE FALSE FALSE
# [1] FALSE FALSE TRUE FALSE TRUE TRUE FALSE FALSE FALSE FALSE
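
For comparison, here is a small sketch of how "&&" behaves with scalar operands. The helper noisy and the variable calls are hypothetical names used only to record which operands were actually evaluated:

```r
calls <- character(0)

# Hypothetical helper: records its label, then returns the value.
noisy <- function(label, value) {
  calls <<- c(calls, label)
  value
}

calls <- character(0)
r_single <- noisy("left", FALSE) & noisy("right", TRUE)
calls_single <- calls    # "left" "right": both sides were evaluated

calls <- character(0)
r_double <- noisy("left", FALSE) && noisy("right", TRUE)
calls_double <- calls    # "left": the right side was skipped
```

Both expressions return FALSE; the difference is only in whether the right-hand side runs.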

Lazy evaluation and promise data structure

When an object such as y within h02 is created in a function, it is created in the local execution frame/environment of that function (a new frame is created each time the function is run). The created object is distinct from an object of the same name in any other environment.

Regarding h03: once a promise is forced, i.e. evaluated, its value is stored in the promise's value component and its evaled component is set to TRUE, so that on further accesses it does not have to be evaluated again.

Function arguments are promises, but other objects normally are not. Use the pryr package to inspect them.

library(pryr)

f <- function(x) {
  z <- 1

  cat("is_promise(z):", is_promise(z), "\n")

  cat("is_promise(x):", is_promise(x), "\n")
  cat("before forcing - promise_info(x):\n")
  print(promise_info(x))

  force(x)
  cat("after forcing - promise_info(x):\n")
  print(promise_info(x))

  delayedAssign("w", 3)
  cat("is_promise(w):", is_promise(w), "\n")

  invisible()
}
a <- 3
f(a)

giving:

is_promise(z): FALSE 
is_promise(x): TRUE
before forcing - promise_info(x):
$code
a

$env
<environment: R_GlobalEnv>

$evaled
[1] FALSE

$value
NULL

after forcing - promise_info(x):
$code
a

$env
NULL

$evaled
[1] TRUE

$value
[1] 3

is_promise(w): TRUE
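
The caching of a forced promise can also be observed without pryr. A small base R sketch, where twice and counter are hypothetical names: the argument expression has a side effect, and that side effect happens only once even though x is used twice.

```r
counter <- 0

twice <- function(x) {
  x + x   # x is used twice, but its promise is evaluated only once
}

result <- twice({
  counter <- counter + 1   # side effect runs when the promise is forced
  5
})

result   # 10
counter  # 1
```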

Lazy evaluation and hidden environments in R

Note that the environment of all internal functions is the local scope of MyFuncs:

MyFuncs <- (function(){

  hidden <- function(){return('ninja')}
  foo <- function(){paste(hidden(), 'foo')}
  bar <- function(){paste(hidden(), 'bar')}
  print(environment()) ## note I added this line
  return(list(foo = foo, bar = bar))

})()

Will print (in this case where I've run it):

<environment: 0x7fb74acd00d8>

Additionally:

> environment(MyFuncs$foo)

<environment: 0x7fb74acd00d8>

> environment(MyFuncs$bar)

<environment: 0x7fb74acd00d8>

> environment(get("hidden", environment(MyFuncs$foo)))

<environment: 0x7fb74acd00d8>

> get("hidden", environment(MyFuncs$foo))()

[1] "ninja"

hidden is not evaluated until called by MyFuncs$foo() in the first instance, but since everything is contained in that local function scope there's no reason it can't exist.

Edit: I didn't address the lazy evaluation issue explicitly, but as @MrFlick says, it usually applies only to function arguments unless you invoke delayedAssign explicitly. hidden is assigned right away; its body is just not run until it's called from foo or bar. The environment of the function MyFuncs is indeed "hidden" in the sense that it's not on the search path, but this can be changed.
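
For reference, this is roughly what explicit lazy assignment with delayedAssign looks like. A minimal sketch; lazy_value and evaluated are hypothetical names used only to make the timing visible:

```r
evaluated <- FALSE

# The expression is stored as a promise; nothing runs yet.
delayedAssign("lazy_value", {
  evaluated <- TRUE
  42
})

before <- evaluated      # FALSE: the promise has not been forced
total  <- lazy_value + 1 # 43: the first use forces the promise
after  <- evaluated      # TRUE
```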

We can create an object that represents this environment:

> env <- environment(MyFuncs$foo)
> foo()
Error: could not find function "foo"
> get("foo", env)()
[1] "ninja foo"

We can attach it to the search() path:

> attach(env, name="Myfuncs.foo")
> search()
[1] ".GlobalEnv" "Myfuncs.foo" [...]
> foo()
[1] "ninja foo"
> hidden()
[1] "ninja"

And detach it using the name we assigned:

> detach("Myfuncs.foo")
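
Attaching is not required just to look inside. A sketch of inspecting the closure environment directly with base R functions; MyFuncs is re-created here (without the print call) so the block is self-contained:

```r
MyFuncs <- (function() {
  hidden <- function() 'ninja'
  foo <- function() paste(hidden(), 'foo')
  bar <- function() paste(hidden(), 'bar')
  list(foo = foo, bar = bar)
})()

env <- environment(MyFuncs$foo)

ls(env)                                    # "bar" "foo" "hidden"
evalq(hidden(), env)                       # "ninja"
identical(env, environment(MyFuncs$bar))   # TRUE: both closures share one environment
```

This leaves the search path untouched, which avoids having to remember to detach afterwards.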

What is the Most Rly Way for Lazy Conditional Evaluation

Here I would use && instead of &. They differ in two ways (cf. ?"&&"):

The shorter form performs elementwise comparisons ...

and:

The longer form evaluates left to right examining only the first element of each vector. Evaluation proceeds only until the result is determined. The longer form is appropriate for programming control-flow and typically preferred in if clauses.

Example:

foo <- function()
  if (exists("x") && (x)) cat("is TRUE\n") else cat("not existing or FALSE\n")

x <- TRUE
foo()

x <- FALSE
foo()

rm(x)
foo()


Lazy evaluation example with NULL in Advanced R

This is just the way the && operator works. It's called short-circuiting and is separate from lazy evaluation.

Lazy evaluation refers to the way function arguments are evaluated. In particular, arguments are only evaluated when (and if) they are actually used in the function. For example, consider the following function:

f <- function(a, b) NULL

that does nothing and returns NULL. The arguments a and b are never evaluated because they are unused. They don't appear in the body of f, so you can call f with any expressions you want as arguments (as long as they are syntactically correct), because the expressions won't be evaluated. For example:

> f(1, 2)
NULL
> f(43$foo, unboundvariableblablabla)
NULL

Without lazy evaluation, the arguments would be evaluated first and then passed to the function, so the call above would fail, because if you try to evaluate 43$foo you get an error:

> 43$foo
Error in 43$foo : $ operator is invalid for atomic vectors
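
The same point can be made with an argument that would raise an error if it were ever evaluated. A short sketch, with first_only and uses_both as hypothetical function names:

```r
first_only <- function(a, b) a   # b is never used

ok <- first_only(1, stop("b was evaluated"))   # 1, no error is raised

uses_both <- function(a, b) a + b

# Here b is used, so the promise is forced and the error surfaces.
msg <- tryCatch(
  uses_both(1, stop("b was evaluated")),
  error = function(e) conditionMessage(e)
)
msg   # "b was evaluated"
```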

How to write an R function that evaluates an expression within a data-frame

The lattice package does this sort of thing in a different way. See, e.g., lattice:::xyplot.formula.

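df <- data.frame(a = 1:5, b = 1:5) # assumed example data, chosen to match the outputs below

fn <- function(dat, expr) {
  # capture expr unevaluated, then evaluate it with the columns of dat in scope
  eval(substitute(expr), dat)
}
fn(df, a) # 1 2 3 4 5
fn(df, 2 * a + b) # 3 6 9 12 15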
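
If the expression also refers to variables that live in the calling function rather than in the data frame, one common variation is to pass parent.frame() as the enclosure. The sketch below is an illustration under assumptions: fn2, scale_a, k and the example data are hypothetical names, not part of the original answer.

```r
df <- data.frame(a = 1:5, b = 1:5)   # assumed example data

fn2 <- function(dat, expr) {
  # Columns of dat are searched first, then the caller's environment.
  eval(substitute(expr), dat, parent.frame())
}

scale_a <- function() {
  k <- 10            # local to scale_a, not a column of df
  fn2(df, a * k)
}

scaled <- scale_a()
scaled   # 10 20 30 40 50
```

Without the third argument, names that are not columns would be looked up starting from the frame of the evaluating function itself, so a local such as k in the caller might not be found.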

