What Is Your Preferred Style for Naming Variables in R

What is your preferred style for naming variables in R?

Good previous answers so just a little to add here:

  • underscores are really annoying for ESS users (ESS has traditionally mapped the underscore key to the assignment operator <-); given that ESS is pretty widely used, you won't see many underscores in code authored by ESS users (and that set includes a bunch of R Core as well as CRAN authors, exceptions like Hadley notwithstanding);

  • dots are evil too because they can get mixed up in simple method dispatch (S3 looks up methods by names of the form generic.class); I believe I once read comments to this effect on one of the R lists: dots are a historical artifact and no longer encouraged;

  • so we have a clear winner still standing in the last round: camelCase. I am also not sure I really agree with the assertion of 'lacking precedent in the R community'.

And yes: pragmatism and consistency trump dogma. So use whatever works and is used by your colleagues and co-authors. After all, we still have white-space and braces to argue about :)
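
To make the point about dots and method dispatch concrete, here is a minimal sketch. The names describe, describe.data and the class "data" are made up for illustration; the behaviour itself is ordinary S3 dispatch, which looks for a function named generic.class.

```r
# A hypothetical S3 generic with a default method.
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "default method"

# Intended as an ordinary helper that "describes data", but its name has
# the form generic.class, so S3 treats it as the method for class "data".
describe.data <- function(x, ...) "helper used as a method"

obj <- structure(list(), class = "data")
result <- describe(obj)
print(result)
```

With camelCase (describeData) or underscores (describe_data) the helper could not be mistaken for a method.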

Are there any official naming conventions for R?

The R Developer Page contains "more or less finalized ideas and plans for the R statistical system" from R-core. It does not contain any information about naming conventions. A brief look at the core R code will confirm this.

In R, where is a variable whose name starts with a dot stored?

> ls(all.names = TRUE, envir = .GlobalEnv)
[1] ".Random.seed" ".var" "a"

See the help page for ls() by typing ?ls in the console.

Below is the description of the all.names argument from that help page. The environment to search is controlled by the envir argument of ls().

all.names: a logical value. If TRUE, all object names are returned. If FALSE, names which begin with a . are omitted.

So a variable whose name starts with a dot is stored in the same environment as any other variable; it is only hidden from the default listing. When called at the top level, ls(all.names = TRUE) lists the objects in the global environment, hidden ones included.

Also, by passing an environment to the name argument, one can list all visible and hidden objects of that environment.

search()
ls(name = .GlobalEnv, all.names = TRUE)
ls(name = "package:base", all.names = TRUE)
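
As a small self-contained illustration of the above, the sketch below uses a scratch environment (e, a name chosen here for the example) instead of the global environment, so it does not depend on what is in your workspace.

```r
# Use a scratch environment so the example does not touch the workspace.
e <- new.env()
assign(".var", 1, envir = e)   # name starts with a dot: hidden by default
assign("a", 2, envir = e)

visible <- ls(envir = e)                       # omits ".var"
everything <- ls(envir = e, all.names = TRUE)  # includes ".var"

print(visible)
print(everything)
print(exists(".var", envir = e))  # the hidden object is still stored in e
```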

What is the most efficient way to select a set of variable names from an R data.frame?

I'm personally a fan of the myvars <- c(...) and then using mydf[,myvars] from there on in.

However this still requires you to enter the initial variable names (even though just once), and as far as I read your question, it is this initial 'picking variable names' that is what you're asking about.
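
A minimal sketch of that pattern, using the built-in iris data as a stand-in for mydf. The grep() line is an addition of mine, not part of the approach described above; it shows one possible way to build the name vector from a pattern instead of typing each name.

```r
mydf <- iris  # stand-in data frame for illustration

# Type the names once, then reuse the vector.
myvars <- c("Sepal.Length", "Species")
sub_df <- mydf[, myvars]

# With a single name, drop = FALSE keeps the result a data frame.
one_col <- mydf[, "Species", drop = FALSE]

# Assumption: names that share a prefix can be collected by pattern.
petal_vars <- grep("^Petal", names(mydf), value = TRUE)

print(names(sub_df))
print(petal_vars)
```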

Re a simple no-frills GUI device -- I've recently been introduced to the menu function, which is exactly a simple no-frills GUI device for selecting one object out of a list of choices. Try menu(names(mydf), graphics=TRUE) to see what I mean (it returns the column number). It even gives a nice text interface if for some reason your system can't do the graphics (try with graphics=FALSE to see what I mean).

However this is of limited use to you, as you can only select one column name.
To select multiple, you can use select.list (mentioned in ?menu as the alternative to make multiple selections):

# example with iris data (I don't have 'psych' package):
vars <- select.list(names(iris), multiple = TRUE,
                    title = 'select your variable names',
                    graphics = TRUE)

This also takes a graphics=TRUE option (single click on all the items you want to select).
It returns the names of the variables.
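
Since select.list() needs an interactive session, a script that may also run in batch mode has to guard the call. Here is a rough sketch of one way to do that; pick_columns and its fallback argument are hypothetical names invented for this example.

```r
# Hypothetical helper: ask the user for columns when interactive,
# otherwise fall back to a fixed set of names.
pick_columns <- function(df, fallback = names(df)) {
  if (!interactive()) {
    return(df[, fallback, drop = FALSE])
  }
  vars <- select.list(names(df), multiple = TRUE,
                      title = "select your variable names",
                      graphics = FALSE)
  # drop = FALSE keeps a data frame even if only one column is chosen.
  df[, vars, drop = FALSE]
}

picked <- pick_columns(iris, fallback = c("Sepal.Length", "Species"))
print(names(picked))
```

Run with Rscript, this uses the fallback names; in an interactive console it shows the text menu instead.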

Do you follow the naming convention of the original programmer?

Yes, I do. It makes the code easier to follow for the people who inherit it after you. I do try to clean up the code a little to make it more readable if it's really difficult to understand.

int i vs int index etc. Which one is better?

i, however, is pretty standard in terms of the first loop, followed by j for an inner loop and k for an inner-inner loop, and so on.

As with almost all naming rules, as long as it is standard within a project, and works well for all members thereof, then it's fine.

Elegant R function: mixed case separated by periods to underscore separated lower case and/or camel case

Try this. These at least work on the examples given:

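toUnderscore <- function(x) {
  # Split before a capital that starts a word, e.g. "ICUDays" -> "ICU_Days"
  x2 <- gsub("([A-Za-z])([A-Z])([a-z])", "\\1_\\2\\3", x)
  # Replace literal dots with underscores
  x3 <- gsub(".", "_", x2, fixed = TRUE)
  # Split any remaining lower-to-upper boundary
  x4 <- gsub("([a-z])([A-Z])", "\\1_\\2", x3)
  tolower(x4)
}

underscore2camel <- function(x) {
  # Uppercase the character after each underscore and drop the underscore
  gsub("_(.)", "\\U\\1", x, perl = TRUE)
}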

# test (as.Given is the input vector defined in the question)

u <- toUnderscore(as.Given)
u
## [1] "icu_days" "sex_code" "max_of_mld" "age_group"

underscore2camel(u)
## [1] "icuDays" "sexCode" "maxOfMld" "ageGroup"
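
The input vector as.Given is not shown above. As a self-contained check, the sketch below repeats the two functions and runs them on a made-up input, given, chosen so that it yields the output listed. That input is an assumption for illustration, not necessarily the original data.

```r
toUnderscore <- function(x) {
  x2 <- gsub("([A-Za-z])([A-Z])([a-z])", "\\1_\\2\\3", x)
  x3 <- gsub(".", "_", x2, fixed = TRUE)
  x4 <- gsub("([a-z])([A-Z])", "\\1_\\2", x3)
  tolower(x4)
}

underscore2camel <- function(x) {
  gsub("_(.)", "\\U\\1", x, perl = TRUE)
}

# Hypothetical input mixing capital runs, camel case, underscores and dots.
given <- c("ICUDays", "SexCode", "MAX_of_MLD", "Age.Group")

u <- toUnderscore(given)
camel <- underscore2camel(u)
print(u)
print(camel)
```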

