How do you use "<<-" (scoping assignment) in R?
<<- is most useful in conjunction with closures to maintain state. Here's a section from a recent paper of mine:
A closure is a function written by another function. Closures are
so-called because they enclose the environment of the parent
function, and can access all variables and parameters in that
function. This is useful because it allows us to have two levels of
parameters. One level of parameters (the parent) controls how the
function works. The other level (the child) does the work. The
following example shows how we can use this idea to generate a family of
power functions. The parent function (power) creates child functions
(square and cube) that actually do the hard work.
power <- function(exponent) {
  function(x) x ^ exponent
}
square <- power(2)
square(2) # -> [1] 4
square(4) # -> [1] 16
cube <- power(3)
cube(2) # -> [1] 8
cube(4) # -> [1] 64
The ability to manage variables at two levels also makes it possible to maintain state across function invocations by allowing a function to modify variables in the environment of its parent. The key to managing variables at different levels is the double arrow assignment operator <<-. Unlike the usual single arrow assignment (<-) that always works on the current level, the double arrow operator can modify variables in parent levels.
This makes it possible to maintain a counter that records how many times a function has been called, as the following example shows. Each time new_counter is run, it creates an environment, initialises the counter i in this environment, and then creates a new function.
new_counter <- function() {
  i <- 0
  function() {
    # do something useful, then ...
    i <<- i + 1
    i
  }
}
The new function is a closure, and its environment is the enclosing environment. When the closures counter_one and counter_two are run, each one modifies the counter in its enclosing environment and then returns the current count.
counter_one <- new_counter()
counter_two <- new_counter()
counter_one() # -> [1] 1
counter_one() # -> [1] 2
counter_two() # -> [1] 1
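One way to confirm that the two counters keep separate state is to look inside each closure's enclosing environment. The sketch below repeats the new_counter definition so that it runs on its own. It uses only base R (environment() and exists()), and the expected values assume a fresh session.

```r
new_counter <- function() {
  i <- 0
  function() {
    i <<- i + 1   # updates i in the enclosing environment, not a local copy
    i
  }
}

counter_one <- new_counter()
counter_two <- new_counter()

counter_one()
counter_one()
counter_two()

# Each closure carries its own enclosing environment with its own i
environment(counter_one)$i   # 2
environment(counter_two)$i   # 1

# The two enclosing environments are distinct objects
identical(environment(counter_one), environment(counter_two))   # FALSE
```

Because every call to new_counter creates a new execution environment, the i that <<- finds is always the one belonging to that particular closure.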
Operator in R argument
<<- and <- are both assignment operators, but they are subtly different.
<- only applies to the local environment where it is used, so if you use it to assign a variable inside a function, that variable will not be available outside that function.
If you use <<- inside a function to declare a new variable with a name you haven't used anywhere else, it will create that variable in the global environment. If you use it to assign to an existing variable within your function (or any function which contains your function), it will be assigned to the existing variable instead.
It is almost always a bad idea to assign to the global environment from within a function. If you absolutely have to write variables from inside a function, it is better to use assign to write the variable to another persistent environment.
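As a rough illustration of the assign approach, the sketch below writes into a dedicated environment instead of the global one. The names app_state, record_result and last_result are hypothetical and chosen only for this example; assign(), get() and new.env() are base R.

```r
# Hypothetical container for state that must outlive a single function call
app_state <- new.env(parent = emptyenv())

record_result <- function(value) {
  # Write into app_state explicitly instead of relying on <<-
  assign("last_result", value, envir = app_state)
  invisible(value)
}

record_result(42)

get("last_result", envir = app_state)   # 42
ls(app_state)                           # "last_result"
```

The target of the write is named in the code, so a reader does not have to trace the chain of enclosing environments to find out where the value ends up.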
local_assign <- function() {a <- 1;}
global_assign <- function() {b <<- 1;}
local_assign()
global_assign()
a
# Error: object 'a' not found
b
# [1] 1
Strictly speaking, does the scoping assignment <<- assign to the parent environment or global environment?
Try it out:
env = new.env()
env2 = new.env(parent = env)
local(x <<- 42, env2)
ls(env)
# character(0)
ls()
# [1] "env" "env2" "x"
But:
env$x = 1
local(x <<- 2, env2)
env$x
# [1] 2
… so <<- does walk up the entire chain of parent environments until it finds an existing object of the given name, and replaces that. However, if it doesn't find any such object, it creates a new object in .GlobalEnv.
(The documentation states much the same. But in a case such as this nothing beats experimenting to gain a better understanding.)
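The lookup rule described above can be approximated with a small helper that walks the parent chain by hand. This is only a sketch of the search order, not R's actual implementation; super_assign_target and some_unused_name are hypothetical names introduced for illustration.

```r
# Hypothetical helper: report the environment that `name <<- value` would
# modify when evaluated in `env`. The search starts in the parent of `env`.
super_assign_target <- function(name, env) {
  e <- parent.env(env)
  while (!identical(e, emptyenv())) {
    if (exists(name, envir = e, inherits = FALSE)) return(e)
    e <- parent.env(e)
  }
  globalenv()  # nothing found: <<- falls back to the global environment
}

env  <- new.env()
env2 <- new.env(parent = env)

# No binding anywhere in the chain, so the target is the global environment
identical(super_assign_target("some_unused_name", env2), globalenv())  # TRUE

# Once env defines x, it becomes the target
env$x <- 1
identical(super_assign_target("x", env2), env)  # TRUE
```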
lexical scoping and environments in R
You are confusing the "calling environment" with the "enclosing environment." Check out these terms in Hadley's book "Advanced R."
http://adv-r.had.co.nz/Environments.html
The "calling environment" is the environment from which a function was called, and is returned by the unfortunately-named function parent.frame. However, the calling environment is not used for lexical scoping.
The "enclosing environment" is the environment in which a function was created and is used for lexical scoping. You have created both func1 and func2 in the global environment. Therefore, the global environment is the "enclosing environment" for both functions and will be used for lexical scoping regardless of the calling environment.
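The distinction can be seen directly by capturing both environments from inside a function. In this sketch, show_envs, caller and local_var are hypothetical names; parent.env(), environment() and parent.frame() are base R.

```r
show_envs <- function() {
  enclosing <- parent.env(environment())  # where show_envs was created
  calling   <- parent.frame()             # where show_envs was called from
  list(enclosing = enclosing, calling = calling)
}

caller <- function() {
  local_var <- "only visible in caller"
  show_envs()
}

envs <- caller()

# local_var lives in the calling environment ...
exists("local_var", envir = envs$calling, inherits = FALSE)    # TRUE
# ... but not in the enclosing environment used for lexical scoping
exists("local_var", envir = envs$enclosing, inherits = FALSE)  # FALSE

identical(envs$enclosing, globalenv())  # TRUE when run at top level
```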
If you want func2 to use the execution environment of func1 for lexical scoping, you have (at least) two options. You can create func2 within func1:
func1 <- function(vec) {
  func2 <- function(foos) {
    for (foo in foos)
      print(eval(parse(text = foo)))
    return(foos)
  }
  text3_obj <- 'text3'
  vec <- c(vec, c('text3_obj'))
  return(func2(vec))
}
then your test works as expected:
> text1_obj <- 'text1'
> text2_obj <- 'text2'
> func1(c('text1_obj', 'text2_obj'))
[1] "text1"
[1] "text2"
[1] "text3"
[1] "text1_obj" "text2_obj" "text3_obj"
Alternatively, you can create func2 and reassign its "enclosing environment" from within func1.
func2 <- function(foos) {
  for (foo in foos)
    print(eval(parse(text = foo)))
  return(foos)
}
func1 <- function(vec) {
  text3_obj <- 'text3'
  vec <- c(vec, c('text3_obj'))
  environment(func2) <- environment()
  return(func2(vec))
}
This will also work as expected.
An interesting tidbit I found while writing my demonstration code... It appears that when you re-assign the environment of func2 from within func1, R creates a copy of func2 in the execution environment of func1. By the time you get back to the console, the enclosing environment of the original func2 remains unchanged. Witness:
a = function() {
  print(identical(environment(a), globalenv()))
}
b = function(x) {
  environment(a) <- environment()
  a()
}
Test a() and b():
> a()
[1] TRUE
> b()
[1] FALSE
> a()
[1] TRUE
>
This was not what I expected, but seems like really excellent behavior on the part of R. If this were not the case, the enclosing environment of a() would have been permanently changed to the execution environment of b(), and FALSE would have been returned the second time a() is called.
In fact, it turns out you can force the change to the original a() in the global environment using <<-:
a = function() {
  print(identical(environment(a), globalenv()))
}
b = function(x) {
  # set a variable in the execution environment of b() for use later...
  montePython = "I'm not dead yet!!"
  # change the enclosing environment of a() in the global environment
  # rather than making a local copy of a() in b()'s execution environment.
  environment(a) <<- environment()
  a()
}
Test a() and b():
> a()
[1] TRUE
> b()
[1] FALSE
> a()
[1] FALSE
>
Interestingly, this means that the (normally temporary) execution environment of b() persists in memory even after b() terminates, because a() still references the environment, so it can't be garbage collected. Witness:
> environment(a)$montePython
[1] "I'm not dead yet!!"
Difference between <- and <<-
The operator <<- is the parent scope assignment operator. It assigns to a variable in the nearest enclosing scope that already defines that variable, starting from the parent of the scope in which it is evaluated. If no enclosing scope defines the variable, it is created in the global scope. These assignments therefore "stick" in the scope outside of function calls. Consider the following code:
fun1 <- function() {
  x <- 10
  print(x)
}
> x <- 5 # x is defined in the outer (global) scope
> fun1()
[1] 10 # x was assigned to 10 in fun1()
> x
[1] 5 # but the global value of x is unchanged
In the function fun1(), a local variable x is assigned the value 10, but in the global scope the value of x is not changed. Now consider rewriting the function to use the parent scope assignment operator:
fun2 <- function() {
  x <<- 10
  print(x)
}
> x <- 5
> fun2()
[1] 10 # x was assigned to 10 in fun2()
> x
[1] 10 # the global value of x changed to 10
Because the function fun2() uses the <<- operator, the assignment of x "sticks" after the function has finished evaluating. What R actually does is to go through all scopes outside fun2() and look for the first scope containing a variable called x. In this case, the only scope outside of fun2() is the global scope, so it makes the assignment there.
As a few have already commented, the <<- operator is frowned upon by many because it can break the encapsulation of your R scripts. If we view an R function as an isolated piece of functionality, then it should not be allowed to interfere with the state of the code which calls it. Abusing the <<- assignment operator runs the risk of doing just this.
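One common way to preserve that encapsulation is to return the new value and let the caller decide whether to rebind its own variable. A minimal sketch, where fun3 is a hypothetical name continuing the examples above:

```r
fun3 <- function(x) {
  x <- 10   # local rebinding only; the caller's x is untouched
  x         # hand the new value back explicitly
}

x <- 5
result <- fun3(x)
x        # still 5
result   # 10

# The caller opts in to the update explicitly
x <- fun3(x)
x        # 10
```

The state change is now visible at the call site rather than hidden inside the function body.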
What is the difference between assign() and <<- in R?
Thomas Lumley answered this in a superb post on r-help the other day. <<- is about the enclosing environment, so you can do things like this (and again, I quote his post from April 22 in this thread):
make.accumulator <- function() {
  a <- 0
  function(x) {
    a <<- a + x
    a
  }
}
> f<-make.accumulator()
> f(1)
[1] 1
> f(1)
[1] 2
> f(11)
[1] 13
> f(11)
[1] 24
This is a legitimate use of <<- as "super-assignment" with lexical scope, and not simply to assign in the global environment. For that, Thomas has these choice words:
The Evil and Wrong use is to modify
variables in the global environment.
Very good advice.
Auto assignment
This is a mechanism for copying a value defined within a function into the global environment (or at least, somewhere within the stack of parent environments). From ?"<<-":
The operators ‘<<-’ and ‘->>’ are normally only used in functions,
and cause a search to be made through parent environments for an
existing definition of the variable being assigned. If such a
variable is found (and its binding is not locked) then its value
is redefined, otherwise assignment takes place in the global
environment.
I don't think it's particularly good practice (R is a mostly-functional language, and it's generally better to avoid function side effects), but it does do something. (@Roland points out in comments and @BrianO'Donnell in his answer [quoting Thomas Lumley] that using <<- is good practice if you're using it to modify a function closure, as in demo(scoping). In my experience it is more often misused to construct global variables than to work cleanly with function closures.)
Consider this example, starting in an empty/clean environment:
f <- function() {
  x <- 1   ## assignment
  x <<- x  ## global assignment
}
Before we call f():
x
## Error: object 'x' not found
Now call f() and try again:
f()
x
## [1] 1