How can I tell which packages I am not using in my R script?
Update 2020-04-13
I've now updated the referenced function to use the abstract syntax tree (AST) instead of using regular expressions as before. This is a much more robust way of approaching the problem (it's still not completely ironclad). This is available from version 0.2.0 of funchir, now on CRAN.
I've just got around to writing a quick-and-dirty function to handle this which I call stale_package_check, and I've added it to my package (funchir).
e.g., if we save the following script as test.R:
library(data.table)
library(iotools)
DT = data.table(a = 1:3)
and then (from the directory with that script) run funchir::stale_package_check('test.R'), we'll get:
Functions matched from package data.table: data.table
**No exported functions matched from iotools**
How can I see what are packages being used for in an R script, and which packages are currently not used?
I had been looking for a clear answer to this and finally, building on the useful function pointed out by @eh21 here, I put together this small approach. It does the job in 3 lines of code and can be replicated with no effort by anyone (by which I mean non-experienced programmers like me).
The idea is to place these lines after the packages have been loaded and before the actual project code (which does not need to be run to get the desired information), as below:
# Load packages ----
packageload <- c("ggplot2", "readxl")
lapply(packageload, library, character.only = TRUE)
# Find which packages the used functions belong to ----
used.functions <- NCmisc::list.functions.in.file(filename = "thisfile.R", alphabetic = FALSE) |> print()
# Find which loaded packages are not used ----
used.packages <- used.functions |> names() |> grep(pattern = "package:", value = TRUE) |> gsub(pattern = "package:", replacement = "") |> print()
unused.packages <- packageload[!(packageload %in% used.packages)] |> print()
# Actual project code (no need to be run) ----
ggplot(diamonds, aes(x = cut)) +
geom_bar()
The relevant outputs are:
> used.packages
[1] "base" "ggplot2"
> used.functions
$`character(0)`
[1] "list.functions.in.file"
$`package:base`
[1] "c" "lapply" "print" "names" "grep" "gsub"
$`package:ggplot2`
[1] "ggplot" "aes" "geom_bar"
> unused.packages
[1] "readxl"
Notes:
- This requires install.packages("NCmisc"); however, I didn't load that package (and used :: instead) for consistency, as it shouldn't appear among the used.packages.
- If using RStudio and wanting to apply this to multiple scripts, using rstudioapi::getSourceEditorContext()$path instead of "thisfile.R" in NCmisc::list.functions.in.file will be handy.
- The approach above works for the case in which lapply() is used on a named object to load packages. If packages are instead loaded without resorting to a named object (e.g. with a series of library() or require()), the # Load packages ---- section of the code above can be modified as follows:
# Load packages ----
packageload <- search()
library(ggplot2)
library(readxl)
packageload <- search()[!(search() %in% packageload)] |> grep(pattern = "package:", value = TRUE) |> gsub(pattern = "package:", replacement = "")
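For what it's worth, the same before/after bookkeeping can be written a bit more compactly with setdiff(). This is only a minimal sketch in base R: the names before, added and newly_attached are my own, and tools merely stands in for whichever packages the script actually loads (it assumes tools is not already attached).

```r
# Snapshot the search path before loading anything
before <- search()

# 'tools' ships with R; it stands in for the script's real packages
library(tools)

# Entries that appeared on the search path since the snapshot
added <- setdiff(search(), before)

# Keep only package entries and strip the "package:" prefix
newly_attached <- sub("^package:", "", grep("^package:", added, value = TRUE))
print(newly_attached)
```

The result can then be used in place of packageload when computing unused.packages.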
Determine which packages are used
An answer based on ideas in the question comments. The key functions are getParseData() and packageName().
# create an R file that uses a few functions
fileConn<-file("test.R")
writeLines(c("df <- data.frame(v1=c(1, 1, 1), v2=c(1, 2, 3))",
"\n",
"m <- mean(df$v2)",
"\n",
"describe(df) #psych package"),
fileConn)
close(fileConn)
# getParseData approach
pkg <- getParseData(parse("test.R", keep.source = TRUE))  # keep.source is needed when run non-interactively
pkg <- pkg[pkg$token=="SYMBOL_FUNCTION_CALL",]
pkg <- pkg[!duplicated(pkg$text),]
pkgname <- pkg$text
pkgname
# [1] "data.frame" "c" "mean" "describe"
# load all probable packages first
pkgList <- list(pkgname)
for (i in seq_along(pkgname)) {
try(print(packageName(environment(get(pkgList[[1]][i])))))
}
#[1] "base"
#Error in packageName(environment(get(pkgList[[1]][i]))) :
# 'env' must be an environment
#[1] "base"
#[1] "psych"
I'll mark this as correct for now, but happy to consider other solutions.
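The error in the output above comes from c, which is a primitive function and so has no environment for packageName() to inspect. One possible way around that is utils::find(), which looks a name up on the search path instead. This is a rough sketch rather than a finished tool: functions_by_package is a name I made up, and it only sees packages that are already attached.

```r
# Map each function called in a script to the search-path entries defining it.
# Hypothetical helper; assumes the packages of interest are already attached.
functions_by_package <- function(path) {
  parsed <- utils::getParseData(parse(path, keep.source = TRUE))
  calls <- unique(parsed$text[parsed$token == "SYMBOL_FUNCTION_CALL"])
  # find() returns entries such as "package:base", or character(0)
  # when nothing on the search path defines the name
  where <- lapply(calls, function(f) utils::find(f, mode = "function"))
  names(where) <- calls
  where
}

# Small throwaway script to try it on
script <- tempfile(fileext = ".R")
writeLines(c("df <- data.frame(v1 = c(1, 1, 1))", "m <- mean(df$v1)"), script)
result <- functions_by_package(script)
print(result)
```

Functions that come back as character(0) belong to a package that is not attached (or do not exist at all), so they are worth checking by hand.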
How to find out which package version is loaded in R?
You can use sessionInfo() to accomplish that.
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] graphics grDevices utils datasets stats grid methods base
other attached packages:
[1] ggplot2_0.9.0 reshape2_1.2.1 plyr_1.7.1
loaded via a namespace (and not attached):
[1] colorspace_1.1-1 dichromat_1.2-4 digest_0.5.2 MASS_7.3-18 memoise_0.1 munsell_0.3
[7] proto_0.3-9.2 RColorBrewer_1.0-5 scales_0.2.0 stringr_0.6
>
However, as per the comments and the answer below, there are better options:
> packageVersion("snow")
[1] ‘0.3.9’
Or:
"Rmpi" %in% loadedNamespaces()
Elegant way to check for missing packages and install them?
Yes. If you have your list of packages, compare it to the output from installed.packages()[,"Package"] and install the missing packages. Something like this:
list.of.packages <- c("ggplot2", "Rcpp")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
Otherwise:
If you put your code in a package and declare those packages as dependencies, then they will automatically be installed when you install your package.
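A variant of the installed.packages() comparison that avoids scanning the whole library is to ask requireNamespace() about each package in turn. This is just a sketch: missing_packages is a made-up helper name, and note that requireNamespace() loads (but does not attach) every package it finds.

```r
# Return the subset of pkgs that cannot be loaded from the current library
missing_packages <- function(pkgs) {
  found <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!found]
}

wanted <- c("stats", "notARealPackage123")
to_install <- missing_packages(wanted)
print(to_install)

# Installation is left commented out so the sketch has no side effects
# if (length(to_install)) install.packages(to_install)
```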
How to stop an R package from executing an .R script?
If you look at the Writing R Extensions manual for packages, it offers three basic steps: R CMD build to create a tarball, R CMD INSTALL to install it (not your goal here) and R CMD check to check it during development. They all offer numerous switches to tweak the behaviour. Use those; e.g. I often run R CMD check --no-manual --no-vignettes to skip the pdf / latex part.
And R CMD check has the very --no-examples flag you are looking for. I am not an active user of devtools but I would suspect it also offers you a pass-through of those options. And, worst case, if it doesn't, just use the standard tools. (In RStudio you will find a toggle, and you can set options to the R CMD ... calls as you would on the command-line.)
(In the narrow sense of stopping examples, I keep forgetting what is current but you can try all of \dontrun{}, \donttest{}, ... as well as explicit conditioning on an environment variable you set. All of that will be visible in the code and may not be what you want to show in your documentation though.)
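For the environment-variable route, here is a minimal sketch of what such a guard might look like in example code. The variable name RUN_SLOW_EXAMPLES and the helper run_slow_examples are purely hypothetical; pick whatever suits your package.

```r
# TRUE only when the (hypothetical) variable is set to exactly "true"
run_slow_examples <- function() {
  identical(Sys.getenv("RUN_SLOW_EXAMPLES"), "true")
}

if (run_slow_examples()) {
  # expensive example code would go here
  message("Running slow examples")
} else {
  message("Skipping slow examples")
}
```

As noted above, the guard itself stays visible in the rendered documentation.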