How to properly use functions from other packages in an R package
The basic question you need to answer is: "Do you want the function to be available to all users of the package without further effort?" If yes, use Imports plus the appropriate namespace declarations; if no, use Suggests and print an informative error message if require("psych") returns FALSE.
I don't understand your import-related complaint that "but on a computer that does not has psych installed it gives an error when loading my package". This is also true if the package is listed in Depends!
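For the Suggests route, a minimal sketch of such a check might look like the following. It uses requireNamespace() rather than require() so that nothing gets attached to the user's search path; the helper name need_pkg is hypothetical and not part of any package.

```r
# Hypothetical helper: stop with an informative message when a suggested
# package is missing, otherwise return TRUE invisibly.
need_pkg <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      "Package '", pkg, "' is needed for this function to work. ",
      "Please install it with install.packages(\"", pkg, "\").",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Inside a package function you would run the check first and then use
# the qualified name, for example:
#   need_pkg("psych")
#   psych::describe(x)
```

The message is only a suggestion; the point is that the user learns which package to install instead of seeing an obscure "could not find function" error.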
Using functions from other packages - when to use package::function?
If you're actually creating an R package (as opposed to a script to source, R Project, or other method), you should NEVER use library() or require(). This is not an alternative to using package::function(). You are essentially choosing between package::function() and function(); as highlighted by @Bernhard, explicitly naming the package ensures consistency if two or more packages have conflicting names.
Rather than require(package), you need to worry about properly defining your DESCRIPTION and NAMESPACE files. There are many posts about that on SO and elsewhere, so I won't go into details; see here for example.
Using package::function() can help with the above if you are using roxygen2 to generate your package documentation (it will automatically generate a proper NAMESPACE file).
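As far as I know, roxygen2 builds the NAMESPACE file from tags such as @export and @importFrom rather than from the :: calls themselves; a fully qualified call only needs the package listed under Imports in DESCRIPTION. A small sketch mixing both, where the function name summarise_vec is made up for illustration:

```r
#' Summarise a numeric vector (hypothetical example function)
#'
#' @param x A numeric vector.
#' @return A named numeric vector with the median and standard deviation.
#' @importFrom stats median
#' @export
summarise_vec <- function(x) {
  # median() is imported via the tag above; sd() is called with its
  # package prefix and therefore needs no NAMESPACE entry.
  c(median = median(x), sd = stats::sd(x))
}

summarise_vec(c(1, 2, 3))
```

When the package is documented, the two tags should end up as export(summarise_vec) and importFrom(stats, median) lines in NAMESPACE.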
Is it a good practice to call functions in a package via ::
It all depends on context.
:: is primarily necessary if there are namespace collisions, that is, functions from different packages with the same name. When I load the dplyr package, it provides a function filter, which collides with (and masks) the filter function loaded by default in the stats package. So if I want to use the stats version of the function after loading dplyr, I'll need to call it with stats::filter.
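To see the effect without depending on dplyr being installed, the sketch below uses a locally defined filter() as a stand-in for the masking that attaching dplyr would cause. The stand-in function is purely illustrative.

```r
x <- c(1, 2, 3, 4, 5)

# Stand-in for the masking that library(dplyr) would cause: a filter()
# defined later hides the stats version when called by its bare name.
filter <- function(...) "not the stats version"

masked <- filter(x, rep(1 / 3, 3))             # finds the stand-in first
moving_avg <- stats::filter(x, rep(1 / 3, 3))  # always the stats version

# Remove the stand-in so it does not linger in the session.
rm(filter)
```

Here moving_avg holds a three-point moving average computed by stats::filter, while the bare call never reached the stats function at all.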
This also gives motivation for not loading lots of packages. If you really only want one function from a package, it can be better to use :: than load the whole package, especially if you know the package will mask other functions you want to use.
Not in code, but in text, I do find :: very useful. It's much more concise to type stats::filter than "the filter function from the stats package".
From a performance perspective, there is a (very) small price for using ::. Long-time R-Core development team member Martin Maechler wrote (on the r-devel mailing list, Sept 2017):
Many people seem to forget that every use of :: is an R function call and using it is inefficient compared to just using the already imported name.
The performance penalty is very small, on the order of a few microseconds, so it's only a concern when you need highly optimized code. Running a line of code that uses :: one million times will take a second or two longer than code that doesn't use ::.
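If you want to check the overhead on your own machine, a rough sketch with base R's system.time() could look like this. The iteration count is arbitrary, and the timings will vary between machines and R versions, so no expected numbers are given.

```r
n <- 1e4
x <- c(1, 2, 3)

# Qualified call: `::` is evaluated on every iteration.
time_qualified <- system.time(for (i in seq_len(n)) stats::sd(x))

# Look the function up once, then reuse the local binding.
sd_local <- stats::sd
time_local <- system.time(for (i in seq_len(n)) sd_local(x))

timings <- c(
  qualified = time_qualified[["elapsed"]],
  local = time_local[["elapsed"]]
)
timings
```

For anything more careful than a rough impression, a dedicated benchmarking package would give more reliable measurements than a single system.time() run.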
As far as portability goes, it's nice to explicitly load packages at the top of a script because it makes it easy to glance at the first few lines and see what packages are needed, installing them if necessary before getting too deep into anything else, rather than getting halfway through a long process that now can't be completed without starting over.
Aside: a similar argument can be made to prefer library() over require(). library() will cause an error and stop if the package isn't there, whereas require() will warn but continue. If your code has a contingency plan in case the package isn't there, then by all means use if (require(package)) ..., but if your code will fail without a package you should use library(package) at the top so it fails early and clearly.
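The difference is easy to observe with a package name that does not exist. In this sketch the name notARealPackage123 is a placeholder that is assumed not to be installed.

```r
pkg <- "notARealPackage123"  # hypothetical name, assumed not to be installed

# require() returns FALSE (with a warning) and lets the script carry on.
has_pkg <- suppressWarnings(
  require(pkg, character.only = TRUE, quietly = TRUE)
)

# library() signals an error, which stops a script unless it is caught.
lib_result <- tryCatch(
  {
    library(pkg, character.only = TRUE)
    "attached"
  },
  error = function(e) "error"
)
```

In a real script you would not wrap library() in tryCatch(); the wrapper is only here so the example can run to the end.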
Within your own package
The general solution is to make your own package that imports the other packages you need to use in the DESCRIPTION file. Those packages will be automatically installed when your package is installed, so you can use pkg::fun internally. Or, by also importing them in the NAMESPACE file, you can import an entire package or selectively importFrom specific functions and not need ::. Opinions differ on this. Martin Maechler (same r-devel source as above) says:
Personally I've got the impression that :: is much "overused" nowadays, notably in packages where I'd strongly advocate using importFrom() in NAMESPACE, so all this happens at package load time, and then not using :: in the package sources itself.
On the other hand, RStudio Chief Scientist Hadley Wickham says in his R Packages book:
It's common for packages to be listed in Imports in DESCRIPTION, but not in NAMESPACE. In fact, this is what I recommend: list the package in DESCRIPTION so that it’s installed, then always refer to it explicitly with pkg::fun(). Unless there is a strong reason not to, it's better to be explicit.
With two esteemed R experts giving opposite recommendations, I think it's fair to say that you should pick whichever style suits you best and meets your needs for clarity, efficiency, and maintainability.
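Side by side, the two styles might look roughly like this inside a package's R code. Both function names are hypothetical, and the DESCRIPTION and NAMESPACE entries are shown only as comments because they live in separate files.

```r
# Style 1 (importFrom): NAMESPACE contains `importFrom(stats, median)`,
# so the package code calls the imported name directly.
centre_imported <- function(x) {
  median(x)
}

# Style 2 (explicit ::): DESCRIPTION lists `Imports: stats` and NAMESPACE
# has no import line, so every call carries its package prefix.
centre_qualified <- function(x) {
  stats::median(x)
}

centre_imported(c(1, 5, 9))
centre_qualified(c(1, 5, 9))
```

Run as a plain script, both work because stats is attached by default; inside a package, the first style would rely on the NAMESPACE import instead.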
If you frequently find yourself using just one function from another package, you can copy the code and add it to your own package. For example, I have a package for personal use that borrows %nin% from the Hmisc package because I think it's a great function, but I don't often use anything else from Hmisc. With roxygen2, it's easy to add @author and @references to properly attribute the code for a borrowed function. Also make sure the package licenses are compatible when doing this.
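A sketch of what such a borrowed function could look like with attribution tags follows. The body is a simple negation of %in% written for illustration and is not necessarily identical to the Hmisc source; the attribution text is a placeholder to be filled in from the original package's documentation.

```r
#' Negated value matching (sketch modelled on the %nin% operator in Hmisc)
#'
#' @param x Values to look up.
#' @param table Values to match against.
#' @return A logical vector, TRUE where an element of `x` is not in `table`.
#' @author Placeholder: credit the original authors as listed by Hmisc.
#' @references Placeholder: cite the Hmisc package and its license.
#' @export
`%nin%` <- function(x, table) {
  !(x %in% table)
}

c(1, 5) %nin% c(1, 2, 3)
```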
R: 2 functions with the same name in 2 different packages
You have probably already noticed that the order of loading the packages makes a difference, i.e. the package that gets loaded last will mask the functions in packages loaded earlier.
To specify the package that you want to use, the syntax is:
chron::is.weekend()
tseries::is.weekend()
In other words, use packagename::functionname()
In addition, if you know that you will always want to use the function in chron, you can define your own function as follows:
is.weekend <- chron::is.weekend #EDIT
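The same aliasing pattern can be tried without chron or tseries installed. The sketch below uses stats::filter purely as a stand-in, and stats_filter is a made-up name.

```r
# Bind one implementation to an unambiguous local name.
stats_filter <- stats::filter

# The binding keeps pointing at the stats version regardless of what is
# attached later; its enclosing environment is the stats namespace.
alias_home <- environmentName(environment(stats_filter))
alias_home
```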
Calling functions from within R packages in Python using importr
You're only importing the base module, and need to import it entirely. You'd think Python would do that automatically, but apparently it doesn't. See this SO answer.
from mRMRr import *
from datasets import *
Edit: Ah, yeah that applies to explicit python modules. I think the syntax of calling on functions of sub-packages is possibly different. Try this.
import rpy2.robjects.packages as packages
datasets = packages.importr('datasets')
mtcars = packages.data(datasets).fetch('mtcars')['mtcars']
Adding and Editing Functions of R-package
Solution: The following steps worked for me (on a Mac) to simultaneously solve both aspects:
- I downloaded package ABC as a tar file from the CRAN repository (file: "ABC_1.1-2.tar"). After decompressing the file via double-click, it shows the typical structure of packages (metadata, vignettes, namespace, etc.) as described in the link provided by alistaire (see here - very helpful, many thanks).
- All the relevant files with the different algorithms (e.g. files "algo-A.R", "algo-B.R") are contained in the "R" folder, and inside the file "algo-A.R" I found the function ex_fct. I opened this file in RStudio, adjusted the ex_fct function as desired, added nw_fct to the same file (because the ex_fct and nw_fct functions are related), and saved it under the same name, i.e. as "algo-A.R". As a result, I now have an updated package folder that contains my updated version of the "algo-A.R" file.
- Next, I used the build function of the devtools package to create a single bundled ".tar" file (say file "ABC_new.tar") from this updated package folder. Specifically, one may simply use: build(pkg = "path1/ABC_1.1-2", path = "~path2/ABC_new.tar", manual = FALSE, binary = FALSE), where path1 navigates to the place of the updated package folder and path2 says where the bundled package shall be stored. Notice: As I did this on a new machine, this step did not work immediately but first required installing e.g. TeXLive, Java applications, as well as several packages required by the ABC package (simply follow R's error messages).
- Finally, I was able to (permanently) install the updated package archive file in RStudio via: install.packages("~path2/ABC_new.tar", repos = NULL, type = "source")
Should you wish to undo these changes and have the original package again, you may simply remove the package and re-install the original one from CRAN.
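The build and install steps above can be wrapped in a small helper. This is only a sketch: the function name rebuild_and_install and the paths in the example call are placeholders, and it assumes devtools is installed and that out_dir is an existing directory in which build() places the bundle.

```r
# Hypothetical helper: build a source bundle from an edited package folder
# and install it from the resulting file.
rebuild_and_install <- function(pkg_dir, out_dir) {
  stopifnot(dir.exists(pkg_dir), dir.exists(out_dir))
  if (!requireNamespace("devtools", quietly = TRUE)) {
    stop("Package 'devtools' is needed to build the source bundle.",
         call. = FALSE)
  }
  # build() returns the path of the bundle it wrote into out_dir.
  bundle <- devtools::build(pkg = pkg_dir, path = out_dir, binary = FALSE)
  utils::install.packages(bundle, repos = NULL, type = "source")
  invisible(bundle)
}

# Example call (paths are placeholders, not run here):
#   rebuild_and_install("path1/ABC_1.1-2", "path2")
```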
Modify package function
I finally found a solution that should work in all situations!
environment(customGenomePlot) <- asNamespace('snapCGH')
assignInNamespace("genomePlot", customGenomePlot, ns = "snapCGH")
The call to environment() ensures that the function will be able to call other hidden functions from the package.
The call to assignInNamespace() ensures that other functions from the package will call your updated version of the function.
It is possible that in certain situations you need only one of these, but in general you need both. I struggled to find this general solution and found many others that do not work in some cases, like this (needs the opposite order), or this (misses the second part), or this (throws the error "cannot add bindings to a locked environment").
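To see what the environment() assignment changes, here is a small sketch that uses the stats namespace as a stand-in, since snapCGH may not be installed. The function enclosure_name is made up for illustration, and the sketch deliberately does not call assignInNamespace() so that no installed package is modified.

```r
# A function that reports the name of its enclosing environment.
enclosure_name <- function() {
  environmentName(parent.env(environment()))
}

defined_in <- enclosure_name()  # wherever this script was evaluated

# Re-home the function inside the stats namespace: free variables in its
# body are now looked up there first, including unexported objects.
environment(enclosure_name) <- asNamespace("stats")
rehomed_in <- enclosure_name()
rehomed_in
```

After the assignment the function resolves names as if it had been defined inside the package, which is why a replacement function can still reach the package's hidden helpers.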
Include library calls in functions?
As one of the commenters suggests, you should avoid loading packages within a function since
- The function now has a global effect - as a general rule, this is something to avoid.
- There is a very small performance hit.
The first point is the big one. As with most optimisation, only worry about the second point if it's an issue.
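The global effect can be observed directly. The sketch below uses the tools package, which ships with R but is normally not attached; both function names are hypothetical.

```r
uses_library <- function(x) {
  library(tools)         # side effect: tools stays attached after the call
  toTitleCase(x)
}

uses_namespace <- function(x) {
  tools::toTitleCase(x)  # no change to the search path
}

title_b <- uses_namespace("hello world")
title_a <- uses_library("hello world")

# After uses_library() has run, tools is on the search path for the rest
# of the session, even though the caller never asked for that.
attached_after <- "package:tools" %in% search()
```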
Now that we've established the principle, what are the possible solutions?
- In small projects, I have a file called packages.R that includes all the library calls I need. This is sourced at the top of my analysis script. BTW, all my functions are in a file called func.R. This workflow was stolen/adapted from a previous SO question.
- If you're only importing a single function, you could use the :: trick, e.g. package::funcA(...). That way you avoid attaching the package to the search path.
- For larger projects, I create an R package that handles all necessary imports. The benefit of creating a package is detailed in this answer on structuring large R projects.
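The small-project layout can be sketched end to end as follows. The two files are written to a temporary folder only so that the example is self-contained; in a real project they would simply sit next to the analysis script. The helper col_medians is invented for the example.

```r
# Create a throwaway project folder with the two helper files.
project_dir <- tempfile("project_")
dir.create(project_dir)

# packages.R: the single place for every library() call.
writeLines(
  c("library(stats)", "library(utils)"),
  file.path(project_dir, "packages.R")
)

# func.R: all project functions (here just one hypothetical helper).
writeLines(
  "col_medians <- function(df) vapply(df, median, numeric(1))",
  file.path(project_dir, "func.R")
)

# Top of the analysis script: load packages first, then the functions.
source(file.path(project_dir, "packages.R"))
source(file.path(project_dir, "func.R"))

example_df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
medians <- col_medians(example_df)
medians
```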