Global variable and scope - R
Basically, since you are using the "<-" assignment, the function is creating a copy of the 'global' variable for use within the scope of the function.
This can be seen by adding in a second function g() which alters the value of 'global' before it is printed out in f(), but this time using the "<<-" assignment. The first line in f() creates your locally scoped copy of 'global' for f(x), and then you update the global copy of 'global' using g(x).
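global <- list()
f <- function(x) {
  global[[x]] <- "blah"   # creates a local copy of 'global' inside f()
  g(x)                    # updates the global copy via <<-
  global                  # returns the local copy
}
g <- function(x) {
  global[[x]] <<- "newblah"
}
f(1)   # prints 'blah', despite the fact that g(x) has already updated the global value
global # prints 'newblah'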
If f(x) were still referencing the global copy of 'global' it would print "newblah" which was assigned in g(x). Instead it prints the value which was assigned in f(x) to the locally scoped copy of 'global'.
However, printing 'global' outside any function shows that g(x) did in fact update the value for the global copy of 'global'.
Now, if you move g(x) inside f(x), then f(x) is now the parent of g(x). In this case, "<<-" assigns to the value of 'global' that is within the scope of f(x). So the global copy of 'global' is still empty, but if you print out 'global' in the scope of f() you get the updated value.
global <- list()
f <- function(x) {
  global[[x]] <- "blah"
  g <- function(x) {
    global[[x]] <<- "newblah"
  }
  g(x)
  global
}
f(1)   # prints 'newblah'
global # empty
Global and local variables in R
Variables declared inside a function are local to that function. For instance:
foo <- function() {
  bar <- 1
}
foo()
bar
gives the following error: Error: object 'bar' not found.
If you want to make bar a global variable, you should do:
foo <- function() {
  bar <<- 1
}
foo()
bar
In this case bar is accessible from outside the function.
However, unlike C, C++ or many other languages, brackets do not determine the scope of variables. For instance, in the following code snippet:
if (x > 10) {
  y <- 0
} else {
  y <- 1
}
y remains accessible after the if-else statement.
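As a rough sketch of the contrast (the value of x and the name z are made up for illustration): a variable assigned inside if-else lands in the current environment, whereas a variable assigned inside local() or a function body does not leak out.

```r
x <- 3  # illustrative value, not from the original snippet

# if/else does not open a new scope: y is created in the current environment
if (x > 10) {
  y <- 0
} else {
  y <- 1
}
y_visible <- exists("y")

# local() evaluates its body in a fresh environment, so z stays inside it
local({
  z <- 42
})
z_visible <- exists("z", inherits = FALSE)
```

Here y_visible should be TRUE and z_visible FALSE, provided no z was defined in the workspace beforehand.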
As you rightly say, you can also create nested environments. Have a look at these two links to understand how to use them:
- http://stat.ethz.ch/R-manual/R-devel/library/base/html/environment.html
- http://stat.ethz.ch/R-manual/R-devel/library/base/html/get.html
Here you have a small example:
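test.env <- new.env()
# the variable is called 'my_var' because the name 'var' would resolve to the stats::var function
assign('my_var', 100, envir=test.env)
# or simply
test.env$my_var <- 100
get('my_var') # error: my_var cannot be found since it is not defined in this environment
get('my_var', envir=test.env) # now it can be found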
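To make the "nested" part concrete, here is a small sketch with one environment nested inside another. The names outer_env, inner_env and threshold are hypothetical, chosen only for this illustration.

```r
# An outer environment holding a variable, and an inner one nested inside it
outer_env <- new.env()
inner_env <- new.env(parent = outer_env)

assign("threshold", 100, envir = outer_env)

# Lookup from the inner environment walks up to its parent by default
found_via_parent <- get("threshold", envir = inner_env)

# With inherits = FALSE only the inner environment itself is searched
in_inner_only <- exists("threshold", envir = inner_env, inherits = FALSE)

# Defining the same name in the inner environment shadows the outer value
inner_env$threshold <- 5
shadowed    <- get("threshold", envir = inner_env)
outer_value <- get("threshold", envir = outer_env)
```

The outer binding is left untouched by the assignment in inner_env, which is the same mechanism that makes a function's local variables hide global ones.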
R scoping: disallow global variables in function
My other answer is more about what approach you can take inside your function. Now I'll provide some insight on what to do once your function is defined.
To ensure that your function is not using global variables when it shouldn't be, use the codetools package.
library(codetools)
sUm <- 10
f <- function(x, y) {
  sum = x + y
  return(sUm)
}
checkUsage(f)
This will print the message:
<anonymous> local variable ‘sum’ assigned but may not be used (:1)
To see if any global variables were used in your function, you can compare the output of the findGlobals() function with the variables in the global environment.
> findGlobals(f)
[1] "{" "+" "=" "return" "sUm"
> intersect(findGlobals(f), ls(envir=.GlobalEnv))
[1] "sUm"
That tells you that the global variable sUm was used inside f() when it probably shouldn't have been.
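If the function names in that output are just noise, one possible variation is to call findGlobals() with merge = FALSE, which returns functions and variables separately. This is only a sketch reusing the f and sUm from above; it does not replace the comparison against ls().

```r
library(codetools)

sUm <- 10
f <- function(x, y) {
  sum = x + y
  return(sUm)
}

# merge = FALSE splits the result into $functions and $variables
globals <- findGlobals(f, merge = FALSE)

# Only the non-function globals, which is usually what you want to audit
globals$variables
```

The $variables element should then list sUm without the operators and return that appear in the merged output.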
can lapply not modify variables in a higher scope
I discussed this issue in this related question: "Is R’s apply family more than syntactic sugar". You will notice that if you look at the function signature for for and apply, they have one critical difference: a for loop evaluates an expression, while an apply loop evaluates a function.
If you want to alter things outside the scope of an apply function, then you need to use <<- or assign. Or more to the point, use something like a for loop instead. But you really need to be careful when working with things outside of a function because it can result in unexpected behavior.
In my opinion, one of the primary reasons to use an apply function is explicitly because it doesn't alter things outside of it. This is a core concept in functional programming, wherein functions avoid having side effects. This is also a reason why the apply family of functions can be used in parallel processing (and similar functions exist in the various parallel packages such as snow).
Lastly, the right way to run your code example is to also pass the parameters in to your function and assign the output back, like so:
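mat <- matrix(0, nrow=10, ncol=1)
# unlist() turns the list returned by lapply into a numeric vector, so the result is a numeric matrix
mat <- matrix(unlist(lapply(1:10, function(i, mat) { mat[i,] <- rnorm(1, mean=i) }, mat=mat)))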
It is always best to be explicit about a parameter when possible (hence the mat=mat) rather than inferring it.
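The difference between the two kinds of loop can be seen in a small sketch (the variable counter is made up for this illustration):

```r
counter <- 0

# The assignment inside the anonymous function creates a local 'counter';
# the outer one is left untouched
invisible(lapply(1:3, function(i) {
  counter <- counter + i
}))
after_lapply <- counter

# A for loop evaluates its body in the calling environment, so it does update
for (i in 1:3) {
  counter <- counter + i
}
after_for <- counter

# Side-effect-free alternative: return values and combine them afterwards
total <- sum(unlist(lapply(1:3, function(i) i)))
```

After running this, after_lapply is still 0 while after_for and total are both 6, which is why returning values from lapply and assigning the result is usually the cleaner route.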
Output selected variables to global environment R function
It is not recommended to write to the global environment from inside a function. If you want to create multiple objects in the global environment, return a named list from the function and use list2env.
# scores_na() and MscoreMax are assumed to be defined elsewhere
mediansFunction <- function(x){
  labmedians <- sapply(x[-1], median)
  median_of_median <- median(labmedians)
  grand_median <- median(as.matrix(x[-1]))
  labMscore <- as.vector(round(abs(scores_na(labmedians, "mad")), digits = 2)) # calculate mscore by lab
  labMscoreIndex <- which(labMscore > MscoreMax) # get the positions in the vector that exceed MscoreMax
  x[-1][labMscoreIndex] <- NA # discard values above the threshold by setting them to NA
  dplyr::lst(data = x, labmedians, grand_median, labMscore)
}
result <- mediansFunction(df)
list2env(result, .GlobalEnv)
Now you have variables data, labmedians, grand_median and labMscore in the global environment.
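A stripped-down version of the same pattern, using only base R, might look like the following. The function summarise_values() and its element names are hypothetical, and a fresh environment stands in for .GlobalEnv so that nothing in your workspace gets overwritten.

```r
# Hypothetical summary function: returns everything in one named list
summarise_values <- function(v) {
  list(
    v_mean  = mean(v),
    v_range = range(v)
  )
}

result <- summarise_values(c(2, 4, 9))

# Copy the list elements into a target environment;
# pass .GlobalEnv instead of 'target' to create them as global variables
target <- new.env()
list2env(result, envir = target)

created <- ls(target)
```

Each name in the list becomes a variable of the same name in the target environment, so the function itself never has to touch anything outside its own scope.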
Using global variable in function
Both <<- and assign will work:
myfunction <- function(var1, var2) {
  # Modification of global mydata
  mydata <<- ...
  # Alternatively:
  #assign('mydata', ..., globalenv())
  # Assign locally as well
  mydata <- mydata
  # Definition of another variable with the new mydata
  var3 <- ...
  # Recursive function
  mydata = myfunction(var2, var3)
}
That said, it’s almost always a bad idea to want to modify global data from a function, and there’s almost certainly a more elegant solution to this.
Furthermore, note that <<- is actually not the same as assigning to a variable in globalenv(). Rather, it searches the enclosing environments, starting with the parent scope, and assigns to the first existing variable of that name; only if none is found does it create the variable in the global environment. For functions defined in the global environment, the parent scope is the global environment. For functions defined elsewhere, it's not.
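A minimal sketch of that last point, using a hypothetical make_counter() function factory: the inner function is defined inside another function, so <<- updates the variable in that enclosing function's environment and nothing is written to the global environment.

```r
make_counter <- function() {
  count <- 0
  function() {
    # 'count' exists in the enclosing function's environment, so <<- updates
    # that binding rather than creating a global variable
    count <<- count + 1
    count
  }
}

counter <- make_counter()
first  <- counter()
second <- counter()

# Check whether a variable named 'count' ended up in the global environment
leaked <- exists("count", envir = globalenv(), inherits = FALSE)
```

The counter keeps its state between calls (first is 1, second is 2), and leaked stays FALSE as long as the workspace had no count variable to begin with.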
function environment within lapply loop
I believe it is because function foo() is evaluated in the environment in which it is defined. In your example foo() is defined in the global environment and therefore i is not in scope. If you define foo() within the anonymous function then i appears to be evaluated correctly.
env.g <- environment()
invisible(lapply(1, FUN = function(i){
  message('global env: exists(i) ', exists('i', envir = env.g))
  message('lapply env: exists(i) ', exists('i'))
  message(' ')
  j <- i + 1
  foo <- function(j){
    message('foo env: exists(j) ', exists('j'))
    message('foo env: exists(i) ', exists('i'))
    i
  }
  foo(j)
}
))
#global env: exists(i) FALSE
#lapply env: exists(i) TRUE
#foo env: exists(j) TRUE
#foo env: exists(i) TRUE
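If you would rather keep the helper defined at top level, an alternative is to pass the value in as an argument instead of relying on scope. The sketch below uses hypothetical names (uses_free_variable, uses_argument, loop_index) and assumes no loop_index variable exists in your workspace.

```r
# Defined at top level: free variables are resolved in the environment where
# the function was defined, not in the caller's environment
uses_free_variable <- function() loop_index * 2

# Safer: take the value as an explicit argument
uses_argument <- function(loop_index) loop_index * 2

res <- lapply(1:2, function(loop_index) {
  # The first helper cannot see the anonymous function's 'loop_index'
  failed <- inherits(try(uses_free_variable(), silent = TRUE), "try-error")
  list(free_variable_failed = failed,
       via_argument = uses_argument(loop_index))
})
```

The first helper fails on every iteration because it looks for loop_index where it was defined, while the second one works because the value is handed to it directly.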