R scoping: disallow global variables in function
My other answer is more about what approach you can take inside your function. Now I'll provide some insight on what to do once your function is defined.
To ensure that your function is not using global variables when it shouldn't be, use the codetools package.
library(codetools)
sUm <- 10
f <- function(x, y) {
  sum = x + y
  return(sUm)
}
checkUsage(f)
This will print the message:
<anonymous> local variable ‘sum’ assigned but may not be used (:1)
To see if any global variables were used in your function, you can compare the output of the findGlobals() function with the variables in the global environment.
> findGlobals(f)
[1] "{" "+" "=" "return" "sUm"
> intersect(findGlobals(f), ls(envir=.GlobalEnv))
[1] "sUm"
That tells you that the global variable sUm was used inside f() when it probably shouldn't have been.
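If you want this check to fail loudly rather than rely on reading the output, here is a minimal sketch. The helper name assert_no_globals and the two example functions are my own, not part of codetools; the sketch only relies on findGlobals() with merge = FALSE, which reports global functions and global variables separately.

```r
library(codetools)

# Hypothetical helper (not part of codetools): stop if `fn` refers to any
# variable that is neither an argument nor assigned inside the function.
assert_no_globals <- function(fn) {
  # merge = FALSE returns a list with $functions and $variables
  used <- findGlobals(fn, merge = FALSE)$variables
  if (length(used) > 0) {
    stop("global variables used: ", paste(used, collapse = ", "))
  }
  invisible(TRUE)
}

clean <- function(x, y) x + y
leaky <- function(x, y) (x + y) * scale_factor  # scale_factor is not local

assert_no_globals(clean)                      # passes silently
try(assert_no_globals(leaky), silent = TRUE)  # signals an error
```

Note that this is stricter than the intersect() approach: built-in constants such as pi would be flagged too, so you may want to whitelist those.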
Global variable only within scope of R function
I am not sure what you'd like to achieve with your helper functions, but as @Marius mentioned in the comments, the inner functions already have access to M through lexical scoping. Hence code like this works:
f1 <- function(M) {
  f2 <- function() {
    f3 <- function() {
      # f3 divides M by 2
      M/2
    }
    # f2 prints the result from f3 and also M multiplied by 2
    print(f3())
    print(M * 2)
  }
  # f1 returns the result from f2
  return(f2())
}
mat <- matrix(1:4, 2)
f1(mat)
#      [,1] [,2]
# [1,]  0.5  1.5
# [2,]  1.0  2.0
#      [,1] [,2]
# [1,]    2    6
# [2,]    4    8
mat
#      [,1] [,2]
# [1,]    1    3
# [2,]    2    4
There's no need to do X <<- M in f1 here, especially if you don't want a copy of M in memory.
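To see where the inner function actually finds M, here is a small sketch. The names outer_fn and inner_fn are made up for illustration; it only checks in which environment the binding lives.

```r
outer_fn <- function(M) {
  inner_fn <- function() {
    # M is not a local variable of inner_fn; it lives one level up,
    # in the frame of outer_fn, and is found by lexical scoping.
    c(
      found_locally   = exists("M", envir = environment(), inherits = FALSE),
      found_in_parent = exists("M", envir = parent.env(environment()),
                               inherits = FALSE)
    )
  }
  inner_fn()
}

outer_fn(matrix(1:4, 2))
#   found_locally found_in_parent
#           FALSE            TRUE
```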
Writing functions in R, keeping scoping in mind
If I know that I'm going to need a function parametrized by some values and called repeatedly, I avoid globals by using a closure:
make.fn2 <- function(a, b) {
  fn2 <- function(x) {
    return( x + a + b )
  }
  return( fn2 )
}
a <- 2; b <- 3
fn2.1 <- make.fn2(a, b)
fn2.1(3) # 8
fn2.1(4) # 9
a <- 4
fn2.2 <- make.fn2(a, b)
fn2.2(3) # 10
fn2.1(3) # 8
This neatly avoids referencing global variables, instead using the enclosing environment of the function for a and b. Modification of globals a and b doesn't lead to unintended side effects when fn2 instances are called.
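One caveat with closures like this: function arguments in R are evaluated lazily, so the value is only captured when it is first used. In the example above fn2.1 is called before a changes, so it works as expected. A sketch of what happens otherwise, and how force() avoids it (make_lazy, make_forced and base_value are names I made up for the illustration):

```r
make_lazy <- function(a) {
  function(x) x + a          # `a` is not evaluated yet
}

make_forced <- function(a) {
  force(a)                   # evaluate `a` now, when the closure is created
  function(x) x + a
}

base_value <- 1
lazy   <- make_lazy(base_value)
forced <- make_forced(base_value)

base_value <- 100            # changed before either closure is first called

lazy(0)    # 100: the argument was evaluated only now
forced(0)  # 1:   the argument was captured at creation time
```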
Global variable and scope - R
Basically, since you are using the "<-" assignment, the function is creating a copy of the 'global' variable for use within the scope of the function.
This can be seen by adding in a second function g() which alters the value of 'global' before it is printed out in f(), but this time using the "<<-" assignment. The first line in f() creates your locally scoped copy of 'global' for f(x), and then you update the global copy of 'global' using g(x).
global <<- list()
f <- function(x) {
  global[[x]] <- "blah"
  g(x)
  global
}
g <- function(x){
  global[[x]] <<- "newblah"
}
f(1) # prints 'blah', even though g(x) has already updated the global value
global # prints 'newblah'
If f(x) were still referencing the global copy of 'global' it would print "newblah" which was assigned in g(x). Instead it prints the value which was assigned in f(x) to the locally scoped copy of 'global'.
However, printing 'global' outside any function shows that g(x) did in fact update the value for the global copy of 'global'.
Now, if you move the definition of g() inside f(), then f()'s environment becomes the enclosing (parent) environment of g(). In this case, "<<-" assigns to the 'global' that is within the scope of f(). So the global copy of 'global' is still empty, but if you print out 'global' in the scope of f() you get the updated value.
global <<- list()
f <- function(x) {
  global[[x]] <- "blah"
  g <- function(x){
    global[[x]] <<- "newblah"
  }
  g(x)
  global
}
f(1) #prints 'newblah'
global #empty
Using global variable in function
Both <<- and assign will work:
myfunction <- function(var1, var2) {
  # Modification of global mydata
  mydata <<- ...
  # Alternatively:
  #assign('mydata', ..., globalenv())
  # Assign locally as well
  mydata <- mydata
  # Definition of another variable with the new mydata
  var3 <- ...
  # Recursive function
  mydata = myfunction(var2, var3)
}
That said, it’s almost always a bad idea to want to modify global data from a function, and there’s almost certainly a more elegant solution to this.
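Furthermore, note that <<- is actually not the same as assigning to a variable in globalenv(). Rather, it looks for an existing variable of that name in the enclosing (parent) scopes, starting with the one the function was defined in, and assigns to the first one it finds; only if none is found does it create the variable in the global environment. For functions defined in the global environment, the parent scope is the global environment. For functions defined elsewhere, it's not.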
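A small sketch of that difference, using a counter defined inside another function (make_counter and n_calls are names invented for this example):

```r
make_counter <- function() {
  n_calls <- 0
  function() {
    # <<- finds n_calls in the enclosing function's environment
    # and updates it there; nothing is written to globalenv().
    n_calls <<- n_calls + 1
    n_calls
  }
}

counter <- make_counter()
counter()  # 1
counter()  # 2

# The state lives in the closure's environment, not in the global one
environment(counter)$n_calls  # 2
```

Had the inner function used assign('n_calls', ..., globalenv()) instead, every counter would share (and clobber) one global variable.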
How to limit the scope of the variables used in a script?
Just use the local=TRUE argument to source and evaluate source somewhere other than your global environment. Here are a few ways to do that (assuming you don't want to be able to access the variables in the script). foo.R just contains print(x <- 1:10).
do.call(source, list(file="c:/foo.R", local=TRUE), envir=new.env())
# [1] 1 2 3 4 5 6 7 8 9 10
ls()
# character(0)
mysource <- function() source("c:/foo.R", local=TRUE)
mysource()
# [1] 1 2 3 4 5 6 7 8 9 10
ls()
# [1] "mysource"
sys.source is probably the most straightforward solution.
sys.source("c:/foo.R", envir=new.env())
You can also evaluate the file in an attached environment, in case you want to access the variables. See the examples in ?sys.source for how to do this.
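If you do want to read the script's variables without attaching anything, one option is to keep a reference to the environment you pass in. A self-contained sketch, using a temporary file as a stand-in for foo.R (the file contents and the name script_env are assumptions for the example):

```r
# Stand-in for foo.R, written to a temporary file
script <- tempfile(fileext = ".R")
writeLines("x <- 1:10", script)

# Evaluate the script in its own environment and keep a handle on it
script_env <- new.env()
sys.source(script, envir = script_env)

ls(script_env)   # "x"
script_env$x     # 1 2 3 4 5 6 7 8 9 10

unlink(script)   # clean up the temporary file
```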
Scoping assignment and local, bound and global variable in R
return_a is a function. set_a is a function. They are both function objects (with associated environments), but using the word "variable" to describe them seems prone to confusion. If you call f, you get a list of two functions. When you create a list, its elements are not necessarily named, so p$set_a("Carl") throws an error because there is no p[['set_a']].
> p <- f("Justin"); p$set_a("Carl")
Error: attempt to apply non-function
But p[[2]] now returns a function and you need to call it:
> p[[2]]
function(x)
a <<- x
<environment: 0x3664f6a28>
> p[[2]]("Carl")
That did change the value of the symbol a in the environment of p[[1]]:
> p[[1]]()
[1] "Carl"
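If you would rather call the functions by name, give the list elements names when you build the list. The original definition of f is not shown here, so the following is only a guess at its shape, with names added:

```r
# Assumed reconstruction of f, returning a *named* list of two closures
f <- function(a) {
  list(
    return_a = function() a,
    set_a    = function(x) a <<- x   # updates `a` in f's environment
  )
}

p <- f("Justin")
p$return_a()     # "Justin"
p$set_a("Carl")
p$return_a()     # "Carl"
```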
Global and local variables in R
Variables declared inside a function are local to that function. For instance:
foo <- function() {
  bar <- 1
}
foo()
bar
gives the following error: Error: object 'bar' not found.
If you want to make bar a global variable, you should do:
foo <- function() {
  bar <<- 1
}
foo()
bar
In this case bar is accessible from outside the function.
However, unlike C, C++ or many other languages, brackets do not determine the scope of variables. For instance, in the following code snippet:
if (x > 10) {
  y <- 0
} else {
  y <- 1
}
y remains accessible after the if-else statement.
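If you do want block-like scoping, local() evaluates its body in a fresh environment. A short sketch contrasting the two (scoped_demo and the variable names are made up for the illustration):

```r
scoped_demo <- function() {
  here <- environment()   # the frame of this function call

  if (TRUE) {
    inside_braces <- 1    # braces alone do not create a new scope
  }

  local({
    inside_local <- 2     # local() evaluates in a new environment
  })

  c(
    braces = exists("inside_braces", envir = here, inherits = FALSE),
    local  = exists("inside_local",  envir = here, inherits = FALSE)
  )
}

scoped_demo()
# braces  local
#   TRUE  FALSE
```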
As you rightly say, you can also create nested environments. You can have a look at these two links to understand how to use them:
- http://stat.ethz.ch/R-manual/R-devel/library/base/html/environment.html
- http://stat.ethz.ch/R-manual/R-devel/library/base/html/get.html
Here you have a small example:
test.env <- new.env()
assign('var', 100, envir=test.env)
# or simply
test.env$var <- 100
get('var') # our value is not found, since it is not defined in this environment (with this particular name you get the stats function var() instead)
get('var', envir=test.env) # now it can be found
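When you are not sure a name exists in an environment, exists() and get0() let you look it up without risking an error, and inherits = FALSE keeps the search from falling through to parent environments. A sketch with made-up names:

```r
settings_env <- new.env()
assign("threshold", 100, envir = settings_env)

# Look only in settings_env itself, not in its parents
exists("threshold", envir = settings_env, inherits = FALSE)     # TRUE
exists("no_such_name", envir = settings_env, inherits = FALSE)  # FALSE

# get0() returns a fallback value instead of signalling an error
get0("threshold", envir = settings_env, inherits = FALSE, ifnotfound = NA)     # 100
get0("no_such_name", envir = settings_env, inherits = FALSE, ifnotfound = NA)  # NA
```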
making sure a function does not use a global variable
There is a function findGlobals in the codetools package. Maybe this is helpful:
library(codetools)
x <- "global"
foo <- function() x
foo()
[1] "global"
findGlobals(foo)
[1] "x"
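findGlobals only reports the problem. If you want the function to actually fail when it reaches for a global, one possible trick is to change its enclosing environment to baseenv(), whose own parent is the empty environment. This is just a sketch (foo_isolated is a made-up name), and it also hides everything that is not in base, including functions from stats or utils:

```r
x <- "global"
foo <- function() x

# Copy of foo whose enclosing environment is the base environment,
# so lookups can no longer reach the global environment.
foo_isolated <- foo
environment(foo_isolated) <- baseenv()

result <- tryCatch(
  foo_isolated(),
  error = function(e) "lookup failed"   # x is not visible any more
)
result  # "lookup failed"
```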