R Scoping: Disallow Global Variables in Function

My other answer is more about what approach you can take inside your function. Now I'll provide some insight on what to do once your function is defined.

To ensure that your function is not using global variables when it shouldn't be, use the codetools package.

library(codetools)

sUm <- 10
f <- function(x, y) {
  sum = x + y
  return(sUm)
}

checkUsage(f)

This will print the message:

<anonymous> local variable ‘sum’ assigned but may not be used (:1)

To see if any global variables were used in your function, you can compare the output of the findGlobals() function with the variables in the global environment.

> findGlobals(f)
[1] "{" "+" "=" "return" "sUm"

> intersect(findGlobals(f), ls(envir=.GlobalEnv))
[1] "sUm"

That tells you that the global variable sUm was used inside f() when it probably shouldn't have been.
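If you run this comparison often, it can be wrapped in a small helper. This is only a sketch: globals_used() is a name I made up (it is not part of codetools), and demo_env stands in for the global environment so the example does not depend on what is in your workspace. Note that user-defined functions living in the same environment would be reported too.

```r
library(codetools)

# Hypothetical helper (not part of codetools): names that `fn` looks up
# and that are currently defined in `env`.
globals_used <- function(fn, env = globalenv()) {
  intersect(findGlobals(fn), ls(envir = env))
}

# Demo environment standing in for the global environment.
demo_env <- new.env()
assign("sUm", 10, envir = demo_env)

f <- function(x, y) {
  total <- x + y
  return(sUm)  # typo: refers to sUm instead of total
}

used <- globals_used(f, demo_env)
used  # "sUm"
```

An empty result means none of the names the function looks up are defined in that environment.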

Global variable only within scope of R function

I am not sure what you'd like to achieve with your helper functions, but as @Marius mentioned in the comment, the inner functions should already have access to M. Hence code like this would work:

f1 <- function(M) {
  f2 <- function() {
    f3 <- function() {
      # f3 divides M by 2
      M/2
    }
    # f2 prints the result from f3 and also multiplies M by 2
    print(f3())
    print(M * 2)
  }
  # f1 returns the result from f2
  return(f2())
}

mat <- matrix(1:4, 2)

f1(mat)
#      [,1] [,2]
# [1,]  0.5  1.5
# [2,]  1.0  2.0
#      [,1] [,2]
# [1,]    2    6
# [2,]    4    8

mat
#      [,1] [,2]
# [1,]    1    3
# [2,]    2    4

There's no need to do X <<- M in f1 here, especially if you don't want a copy of M in memory.
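To illustrate the point about reading versus modifying, here is a small sketch (outer_fn and inner_fn are illustrative names, not from the question). The inner function can read M from its enclosing function, and a plain <- on M inside it only creates a local copy.

```r
outer_fn <- function(M) {
  inner_fn <- function() {
    # Plain assignment creates a modified copy local to inner_fn;
    # the M belonging to outer_fn is left untouched.
    M[1, 1] <- 99
    M
  }
  list(inner = inner_fn(), outer = M)
}

res <- outer_fn(matrix(1:4, 2))
res$inner[1, 1]  # 99
res$outer[1, 1]  # 1
```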

Writing functions in R, keeping scoping in mind

If I know that I'm going to need a function parametrized by some values and called repeatedly, I avoid globals by using a closure:

make.fn2 <- function(a, b) {
  fn2 <- function(x) {
    return( x + a + b )
  }
  return( fn2 )
}

a <- 2; b <- 3
fn2.1 <- make.fn2(a, b)
fn2.1(3) # 8
fn2.1(4) # 9

a <- 4
fn2.2 <- make.fn2(a, b)
fn2.2(3) # 10
fn2.1(3) # 8

This neatly avoids referencing global variables, instead using the enclosing environment of the function for a and b. Modification of globals a and b doesn't lead to unintended side effects when fn2 instances are called.
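One caveat with this pattern is lazy evaluation: a and b are only evaluated the first time the returned function uses them. In the example above fn2.1(3) is called before a changes, so the value 2 is already fixed by then. A sketch of a variant that pins the values at creation time with force() (make_adder is a name I made up for illustration):

```r
make_adder <- function(a, b) {
  # Evaluate the arguments now, so later changes to the caller's
  # variables cannot leak into the closure.
  force(a)
  force(b)
  function(x) x + a + b
}

a <- 2; b <- 3
add1 <- make_adder(a, b)

a <- 4           # changed before add1 is ever called
result <- add1(3)
result           # 8
```

Without the force() calls, the first call to add1 would evaluate a after it was changed and give 10 here.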

Global variable and scope - R

Basically, since you are using the "<-" assignment, the function is creating a copy of the 'global' variable for use within the scope of the function.

This can be seen by adding in a second function g() which alters the value of 'global' before it is printed out in f(), but this time using the "<<-" assignment. The first line in f() creates your locally scoped copy of 'global' for f(x), and then you update the global copy of 'global' using g(x).

global <<- list()

f <- function(x) {
  global[[x]] <- "blah"
  g(x)
  global
}

g <- function(x){
  global[[x]] <<- "newblah"
}

f(1) #prints 'blah', despite the fact that g(x) has already updated the global value

global #prints 'newblah'

If f(x) were still referencing the global copy of 'global' it would print "newblah" which was assigned in g(x). Instead it prints the value which was assigned in f(x) to the locally scoped copy of 'global'.

However, printing 'global' outside any function shows that g(x) did in fact update the value for the global copy of 'global'.

Now, if you move g(x) inside f(x), then f(x) is now the parent of g(x). In this case, "<<-" assigns to the value of 'global' that is within the scope of f(x). So the global copy of 'global' is still empty, but if you print out 'global' in the scope of f() you get the updated value.

global <<- list()

f <- function(x) {
  global[[x]] <- "blah"

  g <- function(x){
    global[[x]] <<- "newblah"
  }

  g(x)
  global
}

f(1) #prints 'newblah'

global #empty
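The same mechanism, "<<-" assigning to a variable in the enclosing function rather than in the global environment, is what makes stateful closures work. A minimal sketch with made-up names (make_counter, count):

```r
make_counter <- function() {
  count <- 0
  function() {
    # "<<-" finds `count` in the enclosing function's environment
    # and updates it there, not in the global environment.
    count <<- count + 1
    count
  }
}

counter <- make_counter()
first  <- counter()
second <- counter()
first   # 1
second  # 2
```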

Using global variable in function

Both <<- and assign will work:

myfunction <- function(var1, var2) {
  # Modification of global mydata
  mydata <<- ...
  # Alternatively:
  #assign('mydata', ..., globalenv())

  # Assign locally as well
  mydata <- mydata

  # Definition of another variable with the new mydata
  var3 <- ...

  # Recursive function
  mydata = myfunction(var2, var3)
}

That said, it’s almost always a bad idea to want to modify global data from a function, and there’s almost certainly a more elegant solution to this.

Furthermore, note that <<- is actually not the same as assigning to a variable in globalenv(). Rather, it walks up the enclosing (parent) environments and assigns to the first existing variable of that name it finds, falling back to the global environment only if none exists. For functions defined in the global environment, the target is therefore the global environment. For functions defined elsewhere, it may not be.
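One common alternative to modifying global data is to pass the data in and return the updated version, letting the caller rebind it. A sketch with hypothetical names (add_entry, store), not taken from the question:

```r
add_entry <- function(store, key, value) {
  store[[key]] <- value  # modifies the function's local copy only
  store                  # return it so the caller decides what to do
}

store <- list()
store <- add_entry(store, "a", "blah")
store <- add_entry(store, "b", "newblah")
names(store)  # "a" "b"
```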

How to limit the scope of the variables used in a script?

Just use the local=TRUE argument to source and evaluate source somewhere other than your global environment. Here are a few ways to do that (assuming you don't want to be able to access the variables in the script). foo.R just contains print(x <- 1:10).

do.call(source, list(file="c:/foo.R", local=TRUE), envir=new.env())
# [1] 1 2 3 4 5 6 7 8 9 10
ls()
# character(0)

mysource <- function() source("c:/foo.R", local=TRUE)
mysource()
# [1] 1 2 3 4 5 6 7 8 9 10
ls()
# [1] "mysource"

sys.source is probably the most straightforward solution.

sys.source("c:/foo.R", envir=new.env())

You can also evaluate the file in an attached environment, in case you want to access the variables. See the examples in ?sys.source for how to do this.
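If you keep a reference to the environment instead of attaching it, you can read the script's variables from it afterwards. A self-contained sketch that writes a throwaway script to a temporary file (the file contents and the name script_env are just for illustration):

```r
# Create a small script on disk so the example is reproducible.
script <- tempfile(fileext = ".R")
writeLines("x <- 1:10", script)

# Evaluate the script in its own environment.
script_env <- new.env()
sys.source(script, envir = script_env)

ls(script_env)  # "x"
script_env$x    # the script's variable, reachable only via script_env

unlink(script)  # clean up the temporary file
```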

Scoping assignment and local, bound and global variable in R

return_a is a function. set_a is a function. They are both function objects (with associated environments), but using the word "variable" to describe them seems prone to confusion. If you call f, you get a list of two functions. When you create a list, its elements are not necessarily named, so p$set_a("Carl") throws an error because there is no p[['set_a']].

> p <- f("Justin"); p$set_a("Carl")
Error: attempt to apply non-function

But p[[2]] now returns a function and you need to call it:

>  p[[2]]
function(x)
a <<- x
<environment: 0x3664f6a28>

> p[[2]]("Carl")

That did change the value of the symbol a in the environment of p[[1]]:

> p[[1]]()
[1] "Carl"
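If you want the p$set_a("Carl") syntax to work, the list elements need names. The original f is not shown here, so the following is only my guess at its shape, with names added to the returned list:

```r
# Assumed reconstruction of f; the only intended change is that the
# returned list has named elements.
f <- function(a) {
  list(
    return_a = function() a,
    set_a    = function(x) a <<- x
  )
}

p <- f("Justin")
p$set_a("Carl")
p$return_a()  # "Carl"
```

Both functions share the environment created by the call to f, which is why set_a can change what return_a sees.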

Global and local variables in R

Variables declared inside a function are local to that function. For instance:

foo <- function() {
  bar <- 1
}
foo()
bar

gives the following error: Error: object 'bar' not found.

If you want to make bar a global variable, you should do:

foo <- function() {
  bar <<- 1
}
foo()
bar

In this case bar is accessible from outside the function.

However, unlike C, C++ or many other languages, brackets do not determine the scope of variables. For instance, in the following code snippet:

if (x > 10) {
  y <- 0
} else {
  y <- 1
}

y remains accessible after the if-else statement.
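If you do want a block whose temporaries disappear afterwards, base R's local() evaluates an expression in a fresh environment. A small sketch (the variable names are illustrative):

```r
x <- 12

y <- local({
  threshold <- 10  # exists only inside this local() block
  if (x > threshold) 0 else 1
})

y  # 0

# threshold was never created in the calling environment:
exists("threshold", inherits = FALSE)
```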

As you rightly say, you can also create nested environments. You can have a look at these two links to understand how to use them:

  1. http://stat.ethz.ch/R-manual/R-devel/library/base/html/environment.html
  2. http://stat.ethz.ch/R-manual/R-devel/library/base/html/get.html

Here you have a small example:

test.env <- new.env()

assign('var', 100, envir=test.env)
# or simply
test.env$var <- 100

get('var') # does not return 100: our var is not defined in this environment (the lookup finds the stats::var function instead)
get('var', envir=test.env) # now it can be found
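When looking names up in a specific environment, it often helps to switch off inheritance so that parent environments and attached packages are not searched. A sketch using exists() and get0() from base R (the name threshold is made up to avoid clashing with existing functions):

```r
test_env <- new.env()
assign("threshold", 100, envir = test_env)

# Look only in test_env itself, not in its parents.
found <- exists("threshold", envir = test_env, inherits = FALSE)
found  # TRUE

# get0() returns a fallback value instead of raising an error.
value   <- get0("threshold", envir = test_env, inherits = FALSE)
missing <- get0("no_such_name", envir = test_env, inherits = FALSE,
                ifnotfound = NA)
value    # 100
missing  # NA
```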

Making sure a function does not use a global variable

There is a function findGlobals in the codetools package. Maybe this is helpful:

library(codetools)
x <- "global"
foo <- function() x

foo()
[1] "global"

findGlobals(foo)
[1] "x"
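For functions that call other functions, the merged output mixes function names with variable names. findGlobals has a merge argument that separates the two, which makes the variables easier to spot. A short sketch (the body of foo is just an illustration):

```r
library(codetools)

foo <- function(n) mean(rnorm(n)) + x

globals <- findGlobals(foo, merge = FALSE)
globals$functions  # names used as functions, e.g. mean and rnorm
globals$variables  # "x"
```

Keep in mind that built-in constants such as pi are reported as variables too, so the result may need some filtering.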

