What Is the Benefit of Import in a Namespace in R

What is the benefit of import in a namespace in R?

If a function foo is imported from package Bar then it is found regardless of what the user does to their search path, e.g., by attaching a package Baz that also has a function foo. Without a namespace, the package code would suddenly find itself using Baz::foo. There are also efficiency issues (foo is found immediately, rather than after searching through all symbols on the search path), but these are likely to be trivial for most applications. In the same way, importFrom is an improvement over import because of fewer collisions (or use of unintended functions) and more efficient look-up.
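The same protection can be seen with base R alone. The sketch below relies on the fact that stats::sd calls var internally; the masking function defined here is purely illustrative.

```r
# The user defines their own var() in the global environment,
# masking stats::var for code typed at the prompt.
var <- function(...) stop("user-defined var() was called")

# sd() lives in namespace:stats, so its internal call to var() is resolved
# inside that namespace and never sees the user's version.
sd_result <- sd(c(1, 2, 3))

# Code evaluated at top level, by contrast, does pick up the masking function.
top_level <- tryCatch(var(c(1, 2, 3)), error = function(e) conditionMessage(e))

rm(var)  # clean up the masking definition
```

Here sd_result is computed normally, while top_level holds the error message from the masking function.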

With S4 (and S3) things can get quite complicated. A non-generic function like graphics::plot can be promoted to a generic (with setGeneric) in two different packages, and each generic can have its own set of methods attached. A package author will want to be precise about which plot generic, and hence which methods dispatch table, their classes and methods see.

Calling a function with pkg::foo always resolves to the intended function. It requires that pkg be declared in the DESCRIPTION file. Listing it in Imports: is sufficient (although it can seem like misleading advertising to list a package there and then not import from it), whereas listing it in Depends: would pollute the user's search path. It also involves two symbol look-ups and a function call (::), and so is less efficient than an imported symbol. The lazy and lack-of-attention-to-detail part of me also sees use of :: as tedious and error prone.
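That :: is itself an ordinary function call can be checked directly. A small sketch using only base R; getExportedValue is the documented function-call equivalent of the :: operator:

```r
# `::` is a regular function object, so every pkg::name is a call
# that has to be evaluated before the target function can be used.
double_colon_is_function <- is.function(`::`)

# Both expressions return the same exported object from namespace:stats.
via_operator <- stats::sd
via_function <- getExportedValue("stats", "sd")
same_object <- identical(via_operator, via_function)
```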

The package codetoolsBioC (via svn with username and password readonly) can generate a NAMESPACE file from an existing package (or at least it could before recent changes to R-devel introduced a NAMESPACE on packages without one; I haven't tried codetoolsBioC on such a package).

import NAMESPACE and DESCRIPTION question?

I would suggest reading the Namespaces chapter of Hadley's R Packages book. But in short, the answer is No.

Are imported functions in the NAMESPACE file attached to the R session when the main package is attached?

No, they are not. Imported functions are available for use in the package internals, but not attached to the user's search path.

Another source of information is, of course, Writing R Extensions, which describes the Imports field as:

The ‘Imports’ field lists packages whose namespaces are imported from (as specified in the NAMESPACE file) but which do not need to be attached.


As a demonstration, the current version of ggplot2, v 3.2.1, has import(scales) in its NAMESPACE file. In a fresh R session, we can load ggplot2 and observe that the scales package is not attached:

library(ggplot2)
percent(1)
# Error in percent(1) : could not find function "percent"
scales::percent(1)
# [1] "100%"

ggplot2 uses functions from scales internally, and can do so without using the package::function notation. This is what the import(scales) accomplishes. However, unlike with Depends, scales is not attached for the user.

Better explanation of when to use Imports/Depends

"Imports" is safer than "Depends" (and also makes a package using it a 'better citizen' with respect to other packages that do use "Depends").

A "Depends" directive attempts to ensure that a function from another package is available by attaching the other package to the main search path (i.e. the list of environments returned by search()). This strategy can, however, be thwarted if another package, loaded later, places an identically named function earlier on the search path. Chambers (in SoDA) uses the example of the function "gam", which is found in both the gam and mgcv packages. If two other packages were loaded, one of them depending on gam and one depending on mgcv, the function found by calls to gam() would depend on the order in which those two packages were attached. Not good.
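The order dependence can be simulated without installing either package. In this sketch two plain lists stand in for the attached packages; the names pkgA and pkgB and their toy gam functions are hypothetical.

```r
# Two stand-ins for packages that both provide a function called gam().
attach(list(gam = function() "from pkgA"), name = "pkgA", warn.conflicts = FALSE)
attach(list(gam = function() "from pkgB"), name = "pkgB", warn.conflicts = FALSE)

# Each attach() inserts at position 2 of the search path, so the entry
# attached last is searched first and wins.
result <- gam()

# Clean up the search path.
detach("pkgB", character.only = TRUE)
detach("pkgA", character.only = TRUE)
```

Swapping the two attach() calls changes which gam() is found, which is exactly the fragility described above.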

An "Imports" directive should be used for any supporting package whose functions are to be placed in <imports:packageName> (searched immediately after <namespace:packageName>), instead of on the regular search path. If either one of the packages in the example above used the "Imports" mechanism (which also requires import or importFrom directives in the NAMESPACE file), matters would be improved in two ways. (1) The package would itself gain control over which gam function is used. (2) By keeping the main search path clear of the imported objects, it would not even potentially break the other package's dependency on the other gam function.
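The look-up order for code inside a package can be inspected by walking the enclosing environments of a namespace. A short sketch using the stats namespace, which ships with R:

```r
# Walk the first four environments that code in namespace:stats searches.
env <- asNamespace("stats")
chain <- character()
for (i in 1:4) {
  chain <- c(chain, environmentName(env))
  env <- parent.env(env)
}
chain
# [1] "stats"         "imports:stats" "base"          "R_GlobalEnv"
```

Only after the namespace, its imports, and the base namespace have been searched does the look-up reach the global environment and the rest of the user's search path.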

This is why using namespaces is such a good practice, why it is now enforced by CRAN, and (in particular) why using "Imports" is safer than using "Depends".


Edited to add an important caveat:

There is one unfortunately common exception to the advice above: if your package relies on a package A which itself "Depends" on another package B, your package will likely need to attach A with a "Depends" directive.

This is because the functions in package A were written with the expectation that package B and its functions would be attached to the search() path.

A "Depends" directive will load and attach package A, at which point package A's own "Depends" directive will, in a chain reaction, cause package B to be loaded and attached as well. Functions in package A will then be able to find the functions in package B on which they rely.

An "Imports" directive will load but not attach package A and will neither load nor attach package B. ("Imports", after all, expects that package writers are using the namespace mechanism, and that package A will be using "Imports" to point to any functions in B that it needs access to.) Calls by your functions to any functions in package A which rely on functions in package B will consequently fail.

The only two solutions are to either:

  1. Have your package attach package A using a "Depends" directive.
  2. Better in the long run, contact the maintainer of package A and ask them to do a more careful job of constructing their namespace (in the words of Martin Morgan in this related answer).

How can a non-imported method in a not-attached package be found by calls to functions not having it in their namespace?

I'm not sure I understand your question correctly, but the main point is that group is a character vector while data$group is a factor.

After gmodels is attached, calling reorder on a factor dispatches to gdata:::reorder.factor, so reorder(factor(group)) calls it.

In transform, the expressions are evaluated with the columns of the first argument in scope, so in T2 <- transform(data, group = reorder(group,-num)), group is the factor column.

UPDATED

library loads (but does not attach) the namespaces of the packages that the attached package imports.

> loadedNamespaces()
[1] "RCurl" "base" "datasets" "devtools" "grDevices" "graphics" "methods"
[8] "stats" "tools" "utils"
> library(gmodels) # here, namespace:gdata is loaded
> loadedNamespaces()
[1] "MASS" "RCurl" "base" "datasets" "devtools" "gdata" "gmodels"
[8] "grDevices" "graphics" "gtools" "methods" "stats" "tools" "utils"

Just in case, the reorder generic exists in namespace:stats:

> r <- ls(asNamespace("stats"))
> r[grep("reorder", r)]
[1] "reorder" "reorder.default" "reorder.dendrogram"

And for more details

A call to reorder searches for S3 methods in two places (see ?UseMethod):

first in the environment in which the generic function is called, and then in the registration data base for the environment in which the generic is defined (typically a namespace).

In addition, loadNamespace registers a package's S3 methods in that registration database.

So, in your case: library(gmodels) -> loadNamespace("gdata") -> registerS3methods() for gdata's methods.

After this, you can find it by:

> methods(reorder)
[1] reorder.default* reorder.dendrogram* reorder.factor*

Non-visible functions are asterisked

However, as reorder.factor is not on your search path, you cannot access it directly:

> reorder.factor
Error: object 'reorder.factor' not found

That is probably the whole story.
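The registration mechanism can be reproduced with base R alone. This is a minimal sketch, not what gdata does internally: the class name demo_cls and the method body are made up, and registerS3method is the documented base function for registering a method by hand.

```r
# Register a method for the base generic summary() without ever creating
# a visible object called summary.demo_cls.
registerS3method(
  "summary", "demo_cls",
  function(object, ...) "registered method called"
)

x <- structure(list(), class = "demo_cls")

# Dispatch consults the registration database of the generic's namespace,
# so the method is found even though it is on no search path.
result <- summary(x)

# Looking the method up by name fails, just like reorder.factor above.
visible <- exists("summary.demo_cls")
```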

R: Added import(data.table) to NAMESPACE automatically using devtools

You probably shouldn't use import(*) at all, unless you really need every exported object from a package. Instead, use importFrom(pkg, obj1, obj2, ...) to import only those objects you need.

From the Writing R Extensions manual, §1.5.1:

Using importFrom selectively rather than import is good practice and recommended notably when importing from packages with more than a dozen exports.

Nonetheless, if you do need to import everything, use #' @import data.table.
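For comparison, a selective import with roxygen2 might look like the fragment below. Which symbols to list is an assumption made for illustration; list only the data.table exports your package actually calls.

```r
# Package-level roxygen block (for example in R/mypkg-package.R; the file
# name is hypothetical). devtools::document() turns the tag below into
# importFrom(data.table, ...) lines in NAMESPACE.
#' @importFrom data.table data.table as.data.table setkey
NULL
```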

Namespaces without packages

I’ve implemented a comprehensive solution and published it as a package, ‘box’.

Internally, ‘box’ modules use an approach similar to packages; that is, the code is loaded inside a dedicated namespace environment and selected symbols are then exported into a module environment, which is returned to the user and optionally attached. The main difference from packages is that modules are more lightweight and easier to write (each R file is its own module), and can be nested.
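The general idea (evaluate code in a private environment, then expose only selected symbols) can be sketched in a few lines of base R. This is not how ‘box’ is implemented; make_module and its arguments are hypothetical names used only for illustration.

```r
# Evaluate `code` in a private "namespace" environment and copy only the
# symbols named in `exports` into a separate module environment.
make_module <- function(code, exports) {
  ns <- new.env(parent = globalenv())   # private namespace for the module
  eval(code, envir = ns)
  mod <- new.env(parent = emptyenv())   # what the user gets to see
  for (nm in exports) {
    assign(nm, get(nm, envir = ns), envir = mod)
  }
  mod
}

mod <- make_module(
  quote({
    helper <- function(x) x + 1          # internal, not exported
    f <- function(x) helper(x) * 2       # exported; still sees helper via ns
  }),
  exports = "f"
)

value <- mod$f(1)                                          # (1 + 1) * 2
helper_hidden <- !exists("helper", envir = mod, inherits = FALSE)
```

The exported function keeps the private environment as its enclosure, so it can still call helper even though the user cannot.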

Usage of the package is described in detail on its website.

The main use(s) of pkg::name

It avoids namespace collisions but it still has to load the pkg.

Example => I did this:

pryr::mem_used()
dplyr::filter(mtcars, cyl==4)
pryr::mem_used()

in one R instance and:

pryr::mem_used()
library(dplyr)
filter(mtcars, cyl==4)
pryr::mem_used()

in another.

mem before/after for the 1st was: 27.7 MB / 30.6 MB
mem before/after for the 2nd was: 27.7 MB / 30.7 MB

I didn't do multiple tests or check whether the difference was rounding or something else, but there were no real savings there IMO.
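That pkg::name loads the namespace without attaching the package can also be checked directly. A small sketch using the tools package, which ships with R and is not attached by default:

```r
search_before <- search()

# Calling through :: loads namespace:tools if it is not loaded already.
ext <- tools::file_ext("report.tar.gz")

# The namespace is now loaded ...
ns_loaded <- "tools" %in% loadedNamespaces()

# ... but the search path is unchanged: nothing was attached.
search_unchanged <- identical(search_before, search())
```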

Imports and Depends

Couple of points, and I will admit that I also find this confusing at times. But I revisited it recently, and here is my take:

  1. "Depends" is how we used to do things; it is closest to "just loading all three": when your third package depends on the other two, all three will get loaded.

  2. With Namespaces, we can also import. That brings in just the stated symbols, which can be data or functions. I use this sometimes; it loads the namespace of the package you import from but does not attach it, and makes just the stated symbols available to your package. As such, it is "lighter" than Depends.

  3. If you do Depends, there is no need for Imports.

  4. That is correct: If you use declarations in NAMESPACE to import symbols from another package, that other package needs to be listed in Imports: in the DESCRIPTION file.
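As a sketch of how the two files fit together for point 4, assume a hypothetical package mypkg that needs gam from mgcv. The DESCRIPTION file names the package:

```
Package: mypkg
Imports:
    mgcv
```

and the NAMESPACE file names the symbols (fit_model is a made-up exported function):

```r
importFrom(mgcv, gam)
export(fit_model)
```

With this setup, code inside mypkg can call gam() directly, and mgcv is not attached to the user's search path.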

import as in R

Use the namespace package to generate another namespace aliased to the one you are interested in.

library(namespace)
registerNamespace('ggp', loadNamespace('ggplot2'))
data(iris)
ggp::ggplot(iris, ggp::aes(x = Petal.Length, y = Sepal.Length)) + ggp::geom_point()

Note this has the disadvantage of making the package versioning/installation requirements more opaque for scripts.


