Is there a way to 'compress' an lm() object for later prediction?
You can fit your models with biglm; a biglm model object is smaller than an lm model object. You can also call predict.biglm with make.function = TRUE to create a function that takes the newdata design matrix and returns the predicted values.
Another option is to save the objects with saveRDS. The resulting files appear to be slightly smaller because saveRDS stores a single object and so has less overhead than save, which can store several objects.
library(biglm)
# Full lm fit (keeps the model frame by default)
m <- lm(log(Volume) ~ log(Girth) + log(Height), trees)
# Same fit without the model frame, design matrix, or response
mm <- lm(log(Volume) ~ log(Girth) + log(Height), trees, model = FALSE, x = FALSE, y = FALSE)
# biglm fit
bm <- biglm(log(Volume) ~ log(Girth) + log(Height), trees)
# Prediction function that takes a design matrix
pred <- predict(bm, make.function = TRUE)
# Save each object with save() ...
save(m, file = 'm.rdata')
save(mm, file = 'mm.rdata')
save(bm, file = 'bm.rdata')
save(pred, file = 'pred.rdata')
# ... and with saveRDS()
saveRDS(m, file = 'm.rds')
saveRDS(mm, file = 'mm.rds')
saveRDS(bm, file = 'bm.rds')
saveRDS(pred, file = 'pred.rds')
# Compare the file sizes
file.info(paste(rep(c('m', 'mm', 'bm', 'pred'), each = 2), c('.rdata', '.rds'), sep = ''))
#            size isdir mode               mtime               ctime               atime exe
# m.rdata    2806 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:24:23 2013-03-07 11:29:30  no
# m.rds      2798 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:29:30 2013-03-07 11:29:30  no
# mm.rdata   2113 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:24:28 2013-03-07 11:29:30  no
# mm.rds     2102 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:29:30 2013-03-07 11:29:30  no
# bm.rdata    592 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:24:34 2013-03-07 11:29:30  no
# bm.rds      583 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:29:30 2013-03-07 11:29:30  no
# pred.rdata 1007 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:24:40 2013-03-07 11:29:30  no
# pred.rds    995 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:27:30 2013-03-07 11:29:30  no
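If only point predictions are needed, a further option is to keep just the coefficients and the right-hand side of the formula and rebuild the design matrix at prediction time. The following is a minimal base R sketch, not part of biglm; `slim` and `predict_slim` are illustrative names, and it assumes a full-rank fit with no factor predictors (factors would also need the stored `xlevels` and contrasts).

```r
# Fit the same model as above (trees ships with R's datasets package).
fit <- lm(log(Volume) ~ log(Girth) + log(Height), data = trees)

# Keep only what a point prediction needs: the coefficients and the
# right-hand side of the formula.
rhs <- delete.response(terms(fit))
attr(rhs, ".Environment") <- globalenv()  # avoid capturing a large environment
slim <- list(coef = coef(fit), terms = rhs)

# Hypothetical helper: rebuild the design matrix from newdata and
# multiply it by the stored coefficients.
predict_slim <- function(slim, newdata) {
  X <- model.matrix(slim$terms, data = newdata)
  drop(X %*% slim$coef)
}

# Round-trip through saveRDS()/readRDS(), then compare with predict.lm().
path <- tempfile(fileext = ".rds")
saveRDS(slim, path)
restored <- readRDS(path)

all.equal(unname(predict_slim(restored, trees)),
          unname(predict(fit, trees)))
file.size(path)  # size in bytes of the saved file
```

Because `readRDS` returns the object rather than restoring it under its original name, the reloaded predictor can be bound to whatever name suits the calling code.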
How to predict with nlme::predict.lme without calling the whole 'object'
lme objects, like objects of any class, are designed to contain everything that any function written for them might need. If you only want the bare bones, pull out just the components you need and reassign the class so that the correct S3 method is dispatched. To see which components are required, look at the source of nlme:::predict.lme. Here is an example with the Orthodont dataset.
library(nlme)
data(Orthodont)
# Just fit a model
fm1 <- lme(distance ~ age, data = Orthodont)
# pull out the minimal components needed for prediction
min_fm1 <- list(modelStruct = fm1$modelStruct,
                dims = fm1$dims,
                contrasts = fm1$contrasts,
                coefficients = fm1$coefficients,
                groups = fm1$groups,
                call = fm1$call,
                terms = fm1$terms)
# assign class otherwise the default predict method would be called
class(min_fm1) <- "lme"
# Dropping components such as fm1$data trims the object down quite a bit
object.size(fm1)
# 63880 bytes
object.size(min_fm1)
# 22992 bytes
# make sure the output is identical
identical(predict(min_fm1, Orthodont, level = 0, na.action = na.omit),
          predict(fm1, Orthodont, level = 0, na.action = na.omit))
# [1] TRUE
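Reading the method's source by eye can be supplemented with a small helper that walks a function body and collects every name used as `object$name`. This is only a heuristic sketch: `components_used` and `toy_predict` are hypothetical names, and the helper misses components reached through accessor functions or `[[`, so the result is a starting list rather than a guarantee.

```r
# Hypothetical helper: list the names used as `<arg>$name` inside a function.
components_used <- function(fun, arg = "object") {
  found <- character()
  walk <- function(e) {
    if (!is.call(e)) return(invisible(NULL))
    # A call of the form arg$name has `$` as its head and arg as first operand.
    if (identical(e[[1]], as.name("$")) && identical(e[[2]], as.name(arg))) {
      found <<- c(found, as.character(e[[3]]))
    }
    # Recurse into every part of the call.
    for (i in seq_along(e)) walk(e[[i]])
    invisible(NULL)
  }
  walk(body(fun))
  unique(found)
}

# Toy predictor standing in for a real predict method.
toy_predict <- function(object, newdata) {
  X <- cbind(1, newdata[, 1])
  drop(X %*% object$coefficients) + object$offset
}

components_used(toy_predict)
```

With nlme loaded, the same helper could be pointed at `nlme:::predict.lme`; any component it reports should still be checked against the source before trimming a fitted model.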
Extract prediction function only from lm() call
First, we borrow a function from another question that reduces the size of the lm object.
clean_model <- function(cm) {
  # just in case we forgot to set
  # y=FALSE and model=FALSE
  cm$y <- NULL
  cm$model <- NULL
  cm$residuals <- NULL
  cm$fitted.values <- NULL
  cm$effects <- NULL
  cm$qr$qr <- NULL
  cm$linear.predictors <- NULL
  cm$weights <- NULL
  cm$prior.weights <- NULL
  cm$data <- NULL
  # also try and avoid some large environments
  attr(cm$terms, ".Environment") <- NULL
  attr(cm$formula, ".Environment") <- NULL
  cm
}
Then write a simple wrapper that reduces the model and returns the prediction function:
prediction_function <- function(model) {
  stopifnot(inherits(model, 'lm'))
  model <- clean_model(model)
  function(...) predict(model, ...)
}
Example:
set.seed(1234)
df <- data.frame(x = 1:9, y = 2 * 1:9 + 3 + rnorm(9, sd = 0.5))
fit <- lm(y ~ x, df)
f <- prediction_function(fit)
f(data.frame(x = 5:6))
#        1        2
# 12.83658 14.83351
Check sizes:
object.size(fit)
# 16648 bytes
object.size(prediction_function)
# 8608 bytes
For this small example we save half the space.
Let's use some larger data:
data(diamonds, package = 'ggplot2')
fit2 <- lm(carat ~ price, diamonds)
predict(fit2, data.frame(price = 200))
f2 <- prediction_function(fit2)
f2(data.frame(price = 200))
print(object.size(fit2), units = 'Mb')
object.size(f2)
Now we go from 13 Mb to 5376 bytes.
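One caveat when reading these numbers: according to its documentation, object.size does not include the environment of a function, so for a closure it does not count the model captured inside. A rough cross-check, sketched here under the assumption that the closure will eventually be saved with saveRDS, is to measure the length of the serialized object instead. `serialized_size`, `make_predictor`, `f_full`, and `f_slim` are illustrative names, and the stripping step is a shortened stand-in for clean_model.

```r
# Bytes needed to serialize an object, including a closure's environment.
serialized_size <- function(x) length(serialize(x, connection = NULL))

# Hypothetical factory: returns a closure that predicts from `model`.
make_predictor <- function(model) {
  force(model)
  function(...) predict(model, ...)
}

set.seed(1)
df <- data.frame(x = rnorm(1e4))
df$y <- 2 * df$x + rnorm(1e4)
fit <- lm(y ~ x, data = df)

# Strip the large per-observation components (a subset of clean_model).
slim <- fit
slim[c("model", "residuals", "fitted.values", "effects")] <- NULL
slim$qr$qr <- NULL

f_full <- make_predictor(fit)
f_slim <- make_predictor(slim)

newdata <- data.frame(x = c(-1, 0, 1))
all.equal(f_slim(newdata), f_full(newdata))  # predictions should agree

serialized_size(f_full)  # includes the full model held by the closure
serialized_size(f_slim)  # includes only the stripped model
```

If the two serialized sizes differ by roughly the size of the dropped components, the stripping is doing what it is meant to do for the saved file, whatever object.size reports for the closure itself.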