Function commenting conventions in R
Updated December 2019, as the R universe has changed since 2011, when this answer was originally written.
My recommended resource is now http://r-pkgs.had.co.nz/
Original answer (links are mostly out of date)
The canonical way to document your functions and make them accessible to others is to make a package. In order for your package to pass the build checks, you have to supply sufficiently detailed help files for each of your functions / datasets.
Check out http://cran.r-project.org/doc/manuals/R-exts.html#Creating-R-packages
This blog post from Rob J Hyndman was very useful and one of the easiest for me to follow: http://robjhyndman.com/researchtips/building-r-packages-for-windows/
I've started using roxygen to assist in making & compiling packages as of late: http://roxygen.org/
Lots of good resources and people to help when you have questions!
how to make comments appear from custom functions
I see what you mean. If you write a customised function
foo = function(x,y) { ... }
and then type foo( and hit tab, the code completion pop-up menu will give you the options x = and y =. However, when you type an existing R function such as round(, not only does tab give you the options, but there's an explanation beneath each variable, telling you its role in the function.
The only way I could think of doing this for your own functions is to package your functions in your own customised package, and to make sure the "help" documentation includes your functions' parameters. This is getting beyond the realm of a Stack Overflow question, but I'll point you to a couple of blogs where I learned the basics of R packages.
The Not So Standard Deviation blog explains how to write a simple package with help documentation, which is precisely what you need to see your customised functions appear with explanations inside RStudio's autocomplete. In a nutshell, you'll need to install roxygen2 and devtools and, for each customised function, thoroughly comment the function like this:
(disclaimer: the goofy cat example is the blogger's, not mine)
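The blogger's example is not reproduced here. As a rough illustration of what a roxygen2 comment block generally looks like, here is a minimal sketch; the function add_numbers and its arguments are hypothetical and only there to show the layout:

```r
#' Add two numbers
#'
#' Hypothetical example function, used only to illustrate the comment layout.
#'
#' @param x A numeric vector.
#' @param y A numeric vector, recycled against x if needed.
#' @return The element-wise sum of x and y.
#' @examples
#' add_numbers(1, 2)
#' @export
add_numbers <- function(x, y) {
  x + y
}

# The roxygen lines are ordinary comments, so the function runs as usual
total <- add_numbers(1, 2)
```

Inside a package, running devtools::document() converts these comments into help files; the @param lines are where the per-argument explanations come from.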
Here's a more detailed tutorial on creating R packages, and here's another blog on getting organised with R packages. Good luck!
Documenting functions in an R script
You could do what you'd like with the help of the docstring package https://cran.r-project.org/package=docstring
It allows you to add roxygen style documentation within a function and view that documentation using the typical help file viewer all without needing to convert your code into a full package.
The vignette provides a good introduction to how to use the package https://cran.r-project.org/web/packages/docstring/vignettes/docstring_intro.html
Note: I am the author of the package so this is a bit of self promotion but it seems to be incredibly relevant to the question asked.
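A minimal sketch of how this might look, assuming the docstring package is installed; the function square is a hypothetical example, and the help viewer is only opened in an interactive session:

```r
square <- function(x) {
  #' Square a number
  #'
  #' Hypothetical example function.
  #'
  #' @param x A numeric vector to be squared.
  x^2
}

# Viewing the help page needs the docstring package and an interactive session
if (interactive() && requireNamespace("docstring", quietly = TRUE)) {
  docstring::docstring(square)
}

# The roxygen-style lines inside the body are plain comments at run time
squared <- square(4)
```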
Best practices to comment R pipeline %>%
Not really an answer, but too long for a comment--
I personally just put my comments in between commands in the pipe. For example:
object %>%
  command1 %>%
  #* Comment
  command2 %>%
  command3 %>%
  #* Perhaps a
  #* Really long
  #* Comment
  command4
The key, for me, is indenting your comment to the same level as the code it discusses so that I can visualize that it is part of a single block.
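The same layout in a runnable form might look like the sketch below. It assumes base R's native pipe |> (R 4.1 or later) so that no package is needed; comments can be placed between %>% steps in the same way. The data and steps are made up for illustration.

```r
# Assumption: native pipe |> (R >= 4.1); the pattern is the same with %>%
result <- c(4, 9, 16, 25) |>
  #* Take square roots: 2, 3, 4, 5
  sqrt() |>
  #* Put the largest values first
  rev() |>
  #* Keep only the
  #* two largest values
  head(2) |>
  sum()
```

R ignores the comment lines when parsing, so the chain continues from the pipe operator at the end of one line to the next command.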
What does the dot mean in R – personal preference, naming convention or more?
A dot in a function name can mean any of the following:
- nothing at all
- a separator between method and class in S3 methods
- to hide the function name
Possible meanings
1. Nothing at all
The dot in data.frame doesn't separate data from frame, other than visually.
2. Separation of methods and classes in S3 methods
plot is one example of an S3 generic function. Thus plot.lm is the underlying method that is used when calling plot(lm(...)). It is also used for plot(glm(...)), because glm objects inherit from class lm and stats defines no separate plot.glm.
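To see the generic.class naming at work, here is a small sketch with a made-up generic; describe, the class circle and the object shape are hypothetical names, not part of base R:

```r
# Hypothetical generic: dispatches on the class of its first argument
describe <- function(x, ...) UseMethod("describe")

# Method for class "circle": the dot separates the generic from the class
describe.circle <- function(x, ...) {
  paste("circle with radius", x$radius)
}

# Fallback used when no class-specific method exists
describe.default <- function(x, ...) "unknown object"

shape <- structure(list(radius = 2), class = "circle")
circle_text <- describe(shape)  # dispatches to describe.circle
other_text <- describe(42)      # dispatches to describe.default
```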
3. To hide internal functions
When writing packages, it is sometimes useful to use leading dots in function names because these functions are somewhat hidden from general view. Functions that are meant to be purely internal to a package sometimes use this.
In this context, "somewhat hidden" simply means that the variable (or function) won't normally show up when you list objects with ls(). To force ls to show these variables, use ls(all.names=TRUE). Using a dot as the first character of a name does not change the variable's scope or value, only whether ls() lists it by default. For example:
> x <- 3
> .x <- 4
> ls()
[1] "x"
> ls(all.names=TRUE)
[1] ".x" "x"
> x
[1] 3
> .x
[1] 4
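The same behaviour can be checked without touching your workspace. This sketch uses a separate environment (env is just an illustrative name) so the result does not depend on what is already defined:

```r
# A separate environment keeps the example independent of the workspace
env <- new.env()
assign("x", 3, envir = env)
assign(".x", 4, envir = env)

visible_names <- ls(env)                  # dot-prefixed names are omitted
all_names <- ls(env, all.names = TRUE)    # dot-prefixed names are included
hidden_value <- get(".x", envir = env)    # still reachable by name
```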
4. Other possible reasons
In Hadley's plyr package, he uses the convention of leading dots in function argument names (for example .data and .fun in ddply). This is a mechanism to try to ensure that when resolving variable names, the values resolve to the user's variables rather than internal function variables.
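One way to read this convention is sketched below. It is not plyr's code; apply_each and scale_by are hypothetical functions. Because the wrapper's own arguments start with a dot, a user argument named data passes through ... untouched:

```r
# Hypothetical wrapper in the plyr style: its own arguments start with a dot
apply_each <- function(.data, .fun, ...) {
  lapply(.data, .fun, ...)
}

# User function that happens to have an argument called `data`
scale_by <- function(value, data) value * data

# `data = 10` goes through `...` to scale_by instead of matching `.data`
scaled <- apply_each(list(1, 2, 3), scale_by, data = 10)
```

Had the wrapper's first argument been called data, the call above would have bound 10 to it rather than forwarding it to the user's function.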
Complications
This mishmash of different uses can lead to very confusing situations, because these different uses can all get mixed up in the same function name.
For example, to convert a data.frame to a list you use as.list(..)
as.list(iris)
In this case as.list is an S3 generic function, and you are passing a data.frame to it. Thus the S3 method that gets called is as.list.data.frame:
> as.list.data.frame
function (x, ...)
{
    x <- unclass(x)
    attr(x, "row.names") <- NULL
    x
}
<environment: namespace:base>
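If you would rather not guess where the generic ends and the class begins, you can ask R for the method directly. A short sketch using utils::getS3method (the variable names are illustrative):

```r
# Look up the method that as.list() dispatches to for a data.frame
method <- utils::getS3method("as.list", "data.frame")

# The method's name contains three dots, but only the one between
# "as.list" and "data.frame" separates the generic from the class
iris_list <- as.list(iris)
same_result <- identical(iris_list, method(iris))
```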
And for something truly spectacular, load the data.table package and look at the function as.data.table.data.frame:
> library(data.table)
> methods(as.data.table)
[1] as.data.table.data.frame* as.data.table.data.table* as.data.table.matrix*
Non-visible functions are asterisked
> data.table:::as.data.table.data.frame
function (x, keep.rownames = FALSE)
{
    if (keep.rownames)
        return(data.table(rn = rownames(x), x, keep.rownames = FALSE))
    attr(x, "row.names") = .set_row_names(nrow(x))
    class(x) = c("data.table", "data.frame")
    x
}
<environment: namespace:data.table>
Are there any official naming conventions for R?
The R Developer Page contains "more or less finalized ideas and plans for the R statistical system" from R-core. It does not contain any information about naming conventions. A brief look at the core R code will confirm this.
What is your preferred style for naming variables in R?
Good previous answers so just a little to add here:
- underscores are really annoying for ESS users; given that ESS is pretty widely used you won't see many underscores in code authored by ESS users (and that set includes a bunch of R Core as well as CRAN authors, exceptions like Hadley notwithstanding);
- dots are evil too because they can get mixed up in simple method dispatch; I believe I once read comments to this effect on one of the R lists: dots are a historical artifact and no longer encouraged;
- so we have a clear winner still standing in the last round: camelCase. I am also not sure if I really agree with the assertion of 'lacking precedent in the R community'.
And yes: pragmatism and consistency trump dogma. So whatever works and is used by colleagues and co-authors. After all, we still have white-space and braces to argue about :)