Global Variables in Packages in R

In general, global variables are evil. The underlying reason is that you want to minimize the interconnections in your package. These interconnections often give functions side effects and hidden inputs, i.e. the outcome depends not only on the input arguments but also on the value of some global variable. Especially when the number of functions grows, this can be hard to get right and hell to debug.

For global variables in R see this SO post.

Edit in response to your comment:
An alternative could be to just pass around the needed information to the functions that need it. You could create a new object which contains this info:

token_information = list(token1 = "087091287129387",
token2 = "UA2329723")

and require all functions that need this information to have it as an argument:

do_stuff = function(arg1, arg2, token) {
  # ... use token$token1 and token$token2 here ...
}
do_stuff(arg1, arg2, token = token_information)

In this way it is clear from the code that token information is needed in the function, and you can debug the function on its own. Furthermore, the function has no side effects, as its behavior is fully determined by its input arguments. A typical user script would look something like:

token_info = create_token(token1, token2)
do_stuff(arg1, arg2, token_info)
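
A self-contained sketch of this pattern could look like the following. The bodies of create_token() and do_stuff() are hypothetical and were invented only to make the example runnable; the answer above does not define them.

```r
# Hypothetical constructor: bundles the two tokens into a single object.
create_token <- function(token1, token2) {
  list(token1 = token1, token2 = token2)
}

# Hypothetical worker: everything it needs arrives through its arguments,
# so it can be called and debugged in isolation.
do_stuff <- function(arg1, arg2, token) {
  stopifnot(is.list(token), !is.null(token$token1), !is.null(token$token2))
  paste(arg1, arg2, token$token2, sep = "-")
}

token_info <- create_token("087091287129387", "UA2329723")
out <- do_stuff("a", "b", token_info)
print(out)
```

Because the token is an ordinary argument, a test can pass in a dummy token without touching any shared state.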

I hope this makes things more clear.

How to define hidden global variables inside R packages?

Thank you for sharing your packages @Dirk Eddelbuettel

The solution for my question is the following:

.pkgglobalenv <- new.env(parent = emptyenv())

exs.time.start <- function() {
  assign("exs.time", proc.time()[3], envir = .pkgglobalenv)
  return(invisible(NULL))
}

exs.time.stop <- function(restartTimer = TRUE) {
  if (!exists("exs.time", envir = .pkgglobalenv)) {
    stop("ERROR: exs.time was not found! Start timer with exs.time.start")
  }
  returnValue <- proc.time()[3] - .pkgglobalenv$exs.time
  if (restartTimer) {
    assign("exs.time", proc.time()[3], envir = .pkgglobalenv)
  }
  message(paste0("INFO: Elapsed time ", returnValue, " seconds!"))
  return(invisible(returnValue))
}

  • I've created an environment with new.env(), inside my R file, before my function definitions.
  • I've used assign() to access the environment and change the value of my global variable!

The variable is hidden and everything works fine! Thanks guys!
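
The same idea can be generalized into a pair of small accessors around a private environment. This is only a sketch: the names .state, set_state() and get_state() are hypothetical and do not come from the answer above.

```r
# Private environment. In a package it would live in the namespace and
# would simply not be exported.
.state <- new.env(parent = emptyenv())

# Hypothetical setter: stores a value under a name in the private environment.
set_state <- function(name, value) {
  assign(name, value, envir = .state)
  invisible(value)
}

# Hypothetical getter: returns the stored value, or `default` if it is unset.
get_state <- function(name, default = NULL) {
  if (exists(name, envir = .state, inherits = FALSE)) {
    get(name, envir = .state, inherits = FALSE)
  } else {
    default
  }
}

set_state("counter", 1L)
set_state("counter", get_state("counter") + 1L)
print(get_state("counter"))
print(get_state("missing", default = NA))
```

Routing all access through two functions keeps the shared state in one place, which makes it easier to reset between tests.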

Define Global Variables when creating packages

There are standard ways to include data in a package - if you want some particular R object to be available to the user of the package, this is what you should do. Data is not limited to data frames and matrices - any R object(s) can be included.

If, on the other hand, your intention was to modify the global environment every time a function is called, then you're doing it wrong. In R's functional programming paradigm, functions return objects that can be assigned into the global environment by the user. Objects don't just "appear" in the global environment, with the programmer hoping that the user both (a) knows to look for them and (b) didn't have any objects of the same name that they wanted to keep (because they just got overwritten). It is possible to write code like this (using <<- as in your question, or explicitly calling assign as in @abhiieor's answer), but it will probably not be accepted to CRAN as it violates CRAN policy.
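
To make the contrast concrete, here is a minimal sketch with two hypothetical functions, fill_squares() and compute_squares(). Neither name comes from the question; they only illustrate the two styles.

```r
# Discouraged: as a side effect, this creates or overwrites `squares`
# in an enclosing environment (the global one, if none defines it).
fill_squares <- function(x) {
  squares <<- x^2
  invisible(NULL)
}

# Preferred: return the value and let the caller decide where to keep it.
compute_squares <- function(x) {
  x^2
}

my_squares <- compute_squares(1:3)
print(my_squares)
```

With the second style the user picks the variable name, so nothing in their workspace is overwritten by surprise.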

Global variable in a package - which approach is more recommended?

Some packages use hidden variables (variables that begin with a .), like .Random.seed and .Last.value do in base R. In your package you could do

e <- new.env()
assign(".sessionId", "xyz123", envir = e)
ls(e)
# character(0)
ls(e, all.names = TRUE)
# [1] ".sessionId"

But in your package you don't need to assign e. You can use a .onLoad() hook to assign the variable upon loading the package.

.onLoad <- function(libname, pkgname) {
  assign(".sessionId", "xyz123", envir = parent.env(environment()))
}
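
One caveat worth sketching: once a package has finished loading, R locks the namespace and its bindings, so a variable assigned directly into the namespace can no longer be rebound, while the contents of an environment stored there can still change. The snippet below imitates this outside a package. Here ns is only a stand-in for a namespace, the lockEnvironment() call approximates what happens after loading, and the names ns and .state are assumptions made for illustration.

```r
ns <- new.env()  # stand-in for a package namespace (illustration only)

# What an .onLoad() hook might set up:
assign(".sessionId", "xyz123", envir = ns)
assign(".state", new.env(parent = emptyenv()), envir = ns)

# Approximates the locking R applies once loading has finished.
lockEnvironment(ns, bindings = TRUE)

# Rebinding a locked variable now fails ...
res <- try(assign(".sessionId", "abc", envir = ns), silent = TRUE)
print(inherits(res, "try-error"))

# ... but values inside the stored environment can still be updated.
assign("sessionId", "abc", envir = ns$.state)
print(get("sessionId", envir = ns$.state))
```

So if the value has to change after loading, keeping it inside a dedicated environment is the more flexible choice.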

See this question and its answers for some good explanation on package variables.

Unit testing functions with global variables in R

I think you misunderstand what utils::globalVariables("COUNTS") does. It just declares that COUNTS is a global variable, so when the code analysis sees

addx <- function(x) {
  COUNTS + x
}

it won't complain about the use of an undefined variable. However, it is up to you to actually create the variable, for example by an explicit

COUNTS <- 0

somewhere in your source. I think if you do that, you won't even need the utils::globalVariables("COUNTS") call, because the code analysis will see the global definition.

Where you would need it is when you're doing some nonstandard evaluation, so that it's not obvious where a variable comes from. Then you declare it as a global, and the code analysis won't worry about it. For example, you might get a warning about

subset(df, Col1 < 0)

because it appears to use a global variable named Col1, but of course that's fine, because the subset() function evaluates in a non-standard way, letting you include column names without writing df$Col1.
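
As a rough illustration, the two hypothetical functions below select the same rows. The first uses non-standard evaluation and is the kind of code for which a declaration such as utils::globalVariables("Col1") is usually added in a package; the second uses standard evaluation and gives the code analysis nothing to flag. The function names and the sample data are made up for this sketch.

```r
df <- data.frame(Col1 = c(-2, 3, NA, -1), Col2 = letters[1:4])

# Non-standard evaluation: Col1 is looked up inside `data`, but static
# code analysis sees what looks like an undefined global variable.
negative_nse <- function(data) {
  subset(data, Col1 < 0)
}

# Standard evaluation: no bare column name, so nothing needs declaring.
# which() drops NA comparisons, matching what subset() does.
negative_se <- function(data) {
  data[which(data$Col1 < 0), , drop = FALSE]
}

a <- negative_nse(df)
b <- negative_se(df)
print(a)
```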

Global variables in R

As Christian's answer with assign() shows, there is a way to assign in the global environment. A simpler, shorter (but not better ... stick with assign) way is to use the <<- operator, i.e.

    a <<- "new" 

inside the function.
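
One reason to prefer assign() is that <<- does not always target the global environment: it walks up the enclosing environments and modifies the first matching variable it finds, falling back to the global environment only when there is none. The sketch below shows this with a hypothetical make_counter() closure and, for comparison, an assign() call with an explicit target. The names make_counter and store are assumptions for illustration.

```r
make_counter <- function() {
  count <- 0
  function() {
    # `<<-` finds `count` in the enclosing function environment and updates
    # it there; the global environment is not touched in this case.
    count <<- count + 1
    count
  }
}

counter <- make_counter()
first <- counter()
second <- counter()
print(second)

# assign() states the target environment explicitly.
store <- new.env(parent = emptyenv())
assign("a", "new", envir = store)
print(get("a", envir = store))
```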


