How to Use "<<-" (Scoping Assignment) in R

How do you use <<- (scoping assignment) in R?

<<- is most useful in conjunction with closures to maintain state. Here's a section from a recent paper of mine:

A closure is a function written by another function. Closures are
so-called because they enclose the environment of the parent
function, and can access all variables and parameters in that
function. This is useful because it allows us to have two levels of
parameters. One level of parameters (the parent) controls how the
function works. The other level (the child) does the work. The
following example shows how we can use this idea to generate a family of
power functions. The parent function (power) creates child functions
(square and cube) that actually do the hard work.

power <- function(exponent) {
  function(x) x ^ exponent
}

square <- power(2)
square(2) # -> [1] 4
square(4) # -> [1] 16

cube <- power(3)
cube(2) # -> [1] 8
cube(4) # -> [1] 64

The ability to manage variables at two levels also makes it possible to maintain the state across function invocations by allowing a function to modify variables in the environment of its parent. The key to managing variables at different levels is the double arrow assignment operator <<-. Unlike the usual single arrow assignment (<-) that always works on the current level, the double arrow operator can modify variables in parent levels.

This makes it possible to maintain a counter that records how many times a function has been called, as the following example shows. Each time new_counter is run, it creates an environment, initialises the counter i in this environment, and then creates a new function.

new_counter <- function() {
  i <- 0
  function() {
    # do something useful, then ...
    i <<- i + 1
    i
  }
}

The new function is a closure, and its environment is the enclosing environment. When the closures counter_one and counter_two are run, each one modifies the counter in its enclosing environment and then returns the current count.

counter_one <- new_counter()
counter_two <- new_counter()

counter_one() # -> [1] 1
counter_one() # -> [1] 2
counter_two() # -> [1] 1
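One way to check that each closure really owns its own counter is to look inside its enclosing environment with environment(). The following is only a sketch; new_counter is repeated so that the snippet runs by itself:

```r
new_counter <- function() {
  i <- 0
  function() {
    i <<- i + 1  # updates i in the enclosing environment, not a local copy
    i
  }
}

counter_one <- new_counter()
counter_two <- new_counter()

counter_one()
counter_one()
counter_two()

# Each closure carries its own enclosing environment with its own i
environment(counter_one)$i   # 2
environment(counter_two)$i   # 1
identical(environment(counter_one), environment(counter_two))  # FALSE
```

Because i lives in those private environments, neither counter touches the other, and nothing is written to the global environment.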

Operator in R argument

<<- and <- are both assignment operators, but they are subtly different.

<- only applies to the local environment where it is used, so if you use it to assign a variable inside a function, that variable will not be available outside that function.

If you use <<- inside a function with a name that is not defined in any enclosing environment, it will create that variable in the global environment. If a variable with that name already exists in a function which contains your function, the value will be assigned to that existing variable instead. Note that the search starts in the enclosing environment, so a variable that is local to your own function is not the one that gets modified.

It is almost always a bad idea to assign to the global environment from within a function. If you absolutely have to write variables from inside a function, it is better to use assign to write the variable to another persistent environment.

local_assign <- function() {a <- 1;}
global_assign <- function() {b <<- 1;}

local_assign()
global_assign()
a
# Error: object 'a' not found
b
# [1] 1
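The suggestion to use assign with a persistent environment could look roughly like the sketch below. The names state, store_value and result are made up for illustration; the only assumption is that a dedicated environment is an acceptable place to keep the value.

```r
# 'state' is a hypothetical environment dedicated to holding shared values
state <- new.env()

store_value <- function(value) {
  # write into 'state' explicitly instead of into the global environment
  assign("result", value, envir = state)
  invisible(NULL)
}

store_value(1)

get("result", envir = state)  # read the value back
state$result                  # equivalent access with $
```

The value survives after store_value() returns, but it is reachable only through state, so the global environment stays clean.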

Strictly speaking, does the scoping assignment <<- assign to the parent environment or the global environment?

Try it out:

env = new.env()
env2 = new.env(parent = env)

local(x <<- 42, env2)
ls(env)
# character(0)
ls()
# [1] "env" "env2" "x"

But:

env$x = 1
local(x <<- 2, env2)
env$x
# [1] 2

… so <<- does walk up the entire chain of parent environments until it finds an existing object of the given name, and replaces that. However, if it doesn’t find any such object, it creates a new object in .GlobalEnv.

(The documentation states much the same. But in a case such as this nothing beats experimenting to gain a better understanding.)
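One detail the experiment above does not exercise is where the search begins. Here is a minimal sketch (outer_fn and inner_fn are made-up names) suggesting that <<- starts looking in the enclosing environment and skips a local variable of the same name:

```r
outer_fn <- function() {
  x <- "outer"
  inner_fn <- function() {
    x <- "inner local"  # local binding inside inner_fn
    x <<- "changed"     # search starts one level up, so this updates outer_fn's x
    x                   # the local binding is untouched
  }
  local_value <- inner_fn()
  c(local = local_value, enclosing = x)
}

res <- outer_fn()
res
#         local     enclosing
# "inner local"     "changed"
```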

lexical scoping and environments in R

You are confusing the "calling environment" with the "enclosing environment." Check out these terms in Hadley's book "Advanced R."

http://adv-r.had.co.nz/Environments.html

The "calling environment" is the environment from which a function was called, and is returned by the unfortunately-named function parent.frame. However, the calling environment is not used for lexical scoping.

The "enclosing environment" is the environment in which a function was created and is used for lexical scoping. You have created both func1 and func2 in the global environment. Therefore, the global environment is the "enclosing environment" for both functions and will be used for lexical scoping regardless of the calling environment!!
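The distinction can be made visible with a small sketch. The names show_envs, caller and top_level are hypothetical, and top_level simply records whichever environment the snippet is run in (the global environment when typed at the console):

```r
top_level <- environment()  # where this snippet is being run

show_envs <- function() {
  calling   <- parent.frame()         # where the function was called from
  enclosing <- environment(show_envs) # where the function was created
  list(calling = calling, enclosing = enclosing)
}

caller <- function() {
  here <- environment()  # execution environment of caller()
  envs <- show_envs()
  c(
    calling_is_caller  = identical(envs$calling, here),
    enclosing_is_top   = identical(envs$enclosing, top_level),
    enclosing_is_caller = identical(envs$enclosing, here)
  )
}

res <- caller()
res
```

Only the first two comparisons should come out TRUE: the calling environment changes with the call site, while the enclosing environment stays where the function was defined.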

If you want func2 to use the execution environment of func1 for lexical scoping, you have (at least) two options. You can create func2 within func1:

func1 <- function(vec) {

  func2 <- function(foos) {
    for (foo in foos)
      print(eval(parse(text = foo)))
    return(foos)
  }

  text3_obj <- 'text3'
  vec <- c(vec, c('text3_obj'))
  return(func2(vec))
}

then your test works as expected:

> text1_obj <- 'text1'
> text2_obj <- 'text2'
> func1(c('text1_obj', 'text2_obj'))
[1] "text1"
[1] "text2"
[1] "text3"
[1] "text1_obj" "text2_obj" "text3_obj"

Alternatively, you can create func2 and reassign its "enclosing environment" from within func1:

func2 <- function(foos) {
  for (foo in foos)
    print(eval(parse(text = foo)))
  return(foos)
}

func1 <- function(vec) {
  text3_obj <- 'text3'
  vec <- c(vec, c('text3_obj'))
  environment(func2) <- environment()
  return(func2(vec))
}

This will also work as expected.
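A further variation, not one of the two options above and offered only as a sketch: leave both functions where they are and look the names up in the calling environment explicitly. The env argument is an addition for illustration, and get() is swapped in for eval(parse(...)) because only variable names are being resolved here.

```r
func2 <- function(foos, env = parent.frame()) {
  force(env)  # resolve the calling environment before doing anything else
  for (foo in foos)
    print(get(foo, envir = env))  # look up each name starting from the caller
  foos
}

func1 <- function(vec) {
  text3_obj <- "text3"
  vec <- c(vec, "text3_obj")
  func2(vec)
}

text1_obj <- "text1"
text2_obj <- "text2"
out <- func1(c("text1_obj", "text2_obj"))
```

Since get() continues into enclosing environments by default, text3_obj is found in func1's execution environment and the other two are found where func1 was defined.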

An interesting tidbit I found while writing my demonstration code... It appears that when you re-assign the environment of func2 from within func1, R creates a copy of func2 in the execution environment of func1. By the time you get back to the console, the enclosing environment of the original func2 remains unchanged. Witness:

a = function() {
  print(identical(environment(a), globalenv()))
}

b = function(x) {
  environment(a) <- environment()
  a()
}

Test a() and b():

> a()
[1] TRUE
> b()
[1] FALSE
> a()
[1] TRUE
>

This was not what I expected, but seems like really excellent behavior on the part of R. If this were not the case, the enclosing environment of a() would have been permanently changed to the execution environment of b(), and FALSE should have been returned the second time a() is called.

In fact, it turns out you can force the change to the original a() in the global environment using <<-:

a = function() {
  print(identical(environment(a), globalenv()))
}

b = function(x) {
  # set a variable in the execution environment of b() for use later...
  montePython = "I'm not dead yet!!"
  # change the enclosing environment of a() in the global environment
  # rather than making a local copy of a() in b()'s execution environment.
  environment(a) <<- environment()
  a()
}

Test a() and b():

> a()
[1] TRUE
> b()
[1] FALSE
> a()
[1] FALSE
>

Interestingly, this means that the (normally temporary) execution environment of b() persists in memory even after b() terminates, because a() still references the environment, so it can't be garbage collected. Witness:

> environment(a)$montePython
[1] "I'm not dead yet!!"

Difference between <- and <<-

The operator <<- is the parent scope assignment operator. It assigns to a variable in the nearest enclosing scope that already defines that variable (or in the global scope if none does), rather than in the scope in which it is evaluated. These assignments therefore "stick" in the scope outside of function calls. Consider the following code:

fun1 <- function() {
  x <- 10
  print(x)
}

> x <- 5 # x is defined in the outer (global) scope
> fun1()
[1] 10 # x was assigned to 10 in fun1()
> x
[1] 5 # but the global value of x is unchanged

In the function fun1(), a local variable x is assigned to the value 10, but in the global scope the value of x is not changed. Now consider rewriting the function to use the parent scope assignment operator:

fun2 <- function() {
  x <<- 10
  print(x)
}

> x <- 5
> fun2()
[1] 10 # x was assigned to 10 in fun2()
> x
[1] 10 # the global value of x changed to 10

Because the function fun2() uses the <<- operator, the assignment of x "sticks" after the function has finished evaluating. What R actually does is to go through all scopes outside fun2() and look for the first scope containing a variable called x. In this case, the only scope outside of fun2() is the global scope, so it makes the assignment there.

As a few have already commented, the <<- operator is frowned upon by many because it can break the encapsulation of your R scripts. If we view an R function as an isolated piece of functionality, then it should not be allowed to interfere with the state of the code which calls it. Abusing the <<- assignment operator runs the risk of doing just this.
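If the goal is just to update a value in the caller, returning it avoids the side effect altogether. A small sketch, where fun3 is a hypothetical counterpart to fun1 and fun2 above:

```r
fun3 <- function() {
  10  # return the new value instead of assigning it outside the function
}

x <- 5
x <- fun3()  # the caller decides where the result is stored
x
# [1] 10
```

The function stays self-contained, and the change to x is visible at the call site rather than hidden inside the function body.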

What is the difference between assign() and <<- in R?

Thomas Lumley answered this in a superb post on r-help the other day. <<- is about the enclosing environment, so you can do things like this (and again, I quote his post from April 22 in this thread):

make.accumulator <- function() {
  a <- 0
  function(x) {
    a <<- a + x
    a
  }
}

> f<-make.accumulator()
> f(1)
[1] 1
> f(1)
[1] 2
> f(11)
[1] 13
> f(11)
[1] 24

This is a legitimate use of <<- as "super-assignment" with lexical scope, and not simply a way to assign in the global environment. For the latter, Thomas has these choice words:

The Evil and Wrong use is to modify
variables in the global environment.

Very good advice.
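The same pattern extends to several closures that share one piece of state. The sketch below is in the spirit of the accumulator above; make_account, deposit and current are made-up names:

```r
make_account <- function(balance = 0) {
  deposit <- function(amount) {
    balance <<- balance + amount  # updates balance in make_account's environment
    invisible(balance)
  }
  current <- function() balance   # reads the same shared variable
  list(deposit = deposit, current = current)
}

acct <- make_account(100)
acct$deposit(50)
acct$current()
# [1] 150

other <- make_account()
other$current()
# [1] 0
```

Both functions in the list see the same balance because they were created in the same call to make_account, while a second account gets an independent one.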

Auto assignment

This is a mechanism for copying a value defined within a function into the global environment (or at least, somewhere within the stack of parent environments). From ?"<<-":

The operators ‘<<-’ and ‘->>’ are normally only used in functions,
and cause a search to be made through parent environments for an
existing definition of the variable being assigned. If such a
variable is found (and its binding is not locked) then its value
is redefined, otherwise assignment takes place in the global
environment.
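Two details of that passage can be tried out in a short sketch: the rightward form ->>, and what happens when the binding that is found is locked. The names make_setter and set_value are hypothetical, and lockBinding is used here only to provoke the locked case; in my understanding the attempt then signals an error, which the code catches without relying on its exact wording.

```r
make_setter <- function(lock = FALSE) {
  value <- 0
  if (lock) lockBinding("value", environment())  # lock 'value' in this environment
  function(new_value) {
    tryCatch(
      {
        new_value ->> value  # rightward form of value <<- new_value
        "assigned"
      },
      error = function(e) "failed"  # reached if the binding cannot be changed
    )
  }
}

set_value <- make_setter()
set_value(1)
environment(set_value)$value   # now 1

set_locked <- make_setter(lock = TRUE)
set_locked(1)
environment(set_locked)$value  # still 0
```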

I don't think it's particularly good practice (R is a mostly-functional language, and it's generally better to avoid function side effects), but it does do something. (@Roland points out in comments and @BrianO'Donnell in his answer [quoting Thomas Lumley] that using <<- is good practice if you're using it to modify a function closure, as in demo(scoping). In my experience it is more often misused to construct global variables than to work cleanly with function closures.)

Consider this example, starting in an empty/clean environment:

f <- function() {
  x <- 1   ## assignment
  x <<- x  ## global assignment
}

Before we call f():

x
## Error: object 'x' not found

Now call f() and try again:

f()
x
## [1] 1

