Extract standard errors from lm object
The output of from the summary
function is just an R list. So you can use all the standard list operations. For example:
#some data (taken from Roland's example)
x = c(1,2,3,4)
y = c(2.1,3.9,6.3,7.8)
#fitting a linear model
fit = lm(y~x)
m = summary(fit)
The m
object or list has a number of attributes. You can access them using the bracket or named approach:
m$sigma
m[[6]]
A handy function to know about is, str
. This function provides a summary of the objects attributes, i.e.
str(m)
Obtain standard errors of regression coefficients for an mlm object returned by `lm()`
If you put your data in long format it's very easy to get a bunch of regression results using lmList
from the nlme or lme4 packages. The output is a list of regression results and the summary can give you a matrix of coefficients, just like you wanted.
library(lme4)
m <- lmList( y ~ x | group, data = dat)
summary(m)$coefficients
Those coefficients are in a simple 3 dimensional array so the standard errors are at [,2,2]
.
Extract all standard errors of coefficients from list of logistic regressions
Using an example from ?glm
## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
## copy twice to a list to illustrate
lmod <- list(mod1 = glm.D93, mod2 = glm.D93)
Then we could compute them as summary()
would, or extract them after calling summary()
. The former is far more efficient as you only compute what you want. The latter doesn't rely on knowing how the standard errors are derived.
Compute the standard errors directly
The standard errors can be computed from the variance-covariance matrix of the model. The diagonal of this matrix contains the variances of the coefficients, and the standard errors are simply the square root of these variances. The vcov()
extractor function gets the variance-covariance matrix for us and we square root the diagonals with sqrt(diag())
:
> lapply(lmod, function(x) sqrt(diag(vcov(x))))
$mod1
(Intercept) outcome2 outcome3 treatment2 treatment3
0.1708987 0.2021708 0.1927423 0.2000000 0.2000000
$mod2
(Intercept) outcome2 outcome3 treatment2 treatment3
0.1708987 0.2021708 0.1927423 0.2000000 0.2000000
Extract them from a call to summary()
Or we can let summary()
compute the standard errors (and a lot more), then use lapply()
or sapply()
to apply an anonymous function that extracts coef(summary(x))
and takes the second column (in which the standard errors are stored).
lapply(lmod, function(x) coef(summary(x))[,2])
Which gives
> lapply(lmod, function(x) coef(summary(x))[,2])
$mod1
(Intercept) outcome2 outcome3 treatment2 treatment3
0.1708987 0.2021708 0.1927423 0.2000000 0.2000000
$mod2
(Intercept) outcome2 outcome3 treatment2 treatment3
0.1708987 0.2021708 0.1927423 0.2000000 0.2000000
whereas sapply()
would give:
> sapply(lmod, function(x) coef(summary(x))[,2])
mod1 mod2
(Intercept) 0.1708987 0.1708987
outcome2 0.2021708 0.2021708
outcome3 0.1927423 0.1927423
treatment2 0.2000000 0.2000000
treatment3 0.2000000 0.2000000
Depending on what you wanted to do , you could extract both the coefficients and the standard errors with a single call:
> lapply(lmod, function(x) coef(summary(x))[,1:2])
$mod1
Estimate Std. Error
(Intercept) 3.044522e+00 0.1708987
outcome2 -4.542553e-01 0.2021708
outcome3 -2.929871e-01 0.1927423
treatment2 1.337909e-15 0.2000000
treatment3 1.421085e-15 0.2000000
$mod2
Estimate Std. Error
(Intercept) 3.044522e+00 0.1708987
outcome2 -4.542553e-01 0.2021708
outcome3 -2.929871e-01 0.1927423
treatment2 1.337909e-15 0.2000000
treatment3 1.421085e-15 0.2000000
But you might prefer them separately?
How do I extract estimates and standard errors as a measure of linear increment from an lm model in R?
Basically, your lm
model is of the formula y = Intercept + x*coefficient
.
So, you can calculate the estimate
based on the output of the summary(lm(...
So, if you take the following example:
set.seed(123)
vector1 = rnorm(100, mean = 4)
vector2 = rnorm(100, mean = 1)
dat = data.frame(vector1,vector2)
model_dat = lm(vector2 ~ vector1, data = dat)
Model = summary(model_dat)
And now, you can calculate the estimate:
dat$estimate = dat$vector1 * Model$coefficients[2,1] + Model$coefficients[1,1]
And then for the standard error, you can use predict.lm
with the function se.fit = TRUE
:
dat$SE = predict.lm(model_dat, se.fit = TRUE, level = 0.95)$se.fit
So, you get the following dataset:
> head(dat)
vector1 vector2 estimate SE
1 3.439524 0.28959344 0.9266060 0.11942447
2 3.769823 1.25688371 0.9092747 0.10294104
3 5.558708 0.75330812 0.8154090 0.18452625
4 4.070508 0.65245740 0.8934973 0.09709476
5 4.129288 0.04838143 0.8904130 0.09716038
6 5.715065 0.95497228 0.8072047 0.19893259
You can compare the result of this by first, checking the plotting obtained using stat_smooth
:
library(ggplot2)
ggplot(dat, aes(x = vector1, y = vector2)) + geom_point() + stat_smooth(method = "lm", se = TRUE)
And you get this plot:
And if now, you use estimate
and SE
columns from your dat
:
ggplot(dat, aes(x = vector1, y = vector2)) + geom_point() +
geom_line(aes(x = vector1, y = estimate), color = "red")+
geom_line(aes(x = vector1, y = estimate+SE)) +
geom_line(aes(x = vector1, y = estimate-SE))
You get almost the same plot:
Hope that it answers your question
ldply extract standard deviation from coefficients (models)
The issue is with ldply(models, coef(summary(models)), inside the ldply, you are looping through models and the function needs to work on each element of the list. You need to write a function that does summary first, followed by coef, see
library(dlply)
linmod <- function(df) {
lm(rbi ~ year, data = mutate(df, year = year - min(year)))
}
models <- dlply(baseball, .(id), linmod)
mod_coef<-ldply(models, coef)
results <- ldply(models,function(i)coef(summary(i)))
> head(results)
id Estimate Std. Error t value Pr(>|t|)
1 aaronha01 118.923913043 9.44994928 12.58460860 3.013683e-11
2 aaronha01 -1.732213439 0.73567755 -2.35458242 2.835224e-02
3 abernte02 0.554009613 0.44022430 1.25847122 2.274573e-01
4 abernte02 -0.002403238 0.03829954 -0.06274848 9.507953e-01
5 adairje01 18.831034483 11.77420551 1.59934651 1.337547e-01
6 adairje01 0.879310345 1.61731151 0.54368645 5.958584e-01
# to get standard error
> head(results[,"Std. Error"])
[1] 9.44994928 0.73567755 0.44022430 0.03829954 11.77420551 1.61731151
Related Topics
Daily Time Series with Ts.. How to Specify Start and End
How to Display the Median Value in a Boxplot in Ggplot
One-Class Classification with Svm in R
Fastest Way to Read in 100,000 .Dat.Gz Files
Can Ggplot Theme Formatting Be Saved as an Object
Plots with Good Resolution for Printing and Screen Display
Create Sections Through a Loop with Knitr
How Can R Loop Over Data Frames
R Library for Discrete Markov Chain Simulation
Join Data.Table on Exact Date or If Not the Case on the Nearest Less Than Date
Caching the Mean of a Vector in R
What's the Difference in Using a Semicolon or Explicit New Line in R Code
Meaning of Objects Being Masked by the Global Environment
Using a Static (Prebuilt) PDF Vignette in R Package
Glpk: No Such File or Directory Error When Trying to Install R Package