How to Extract Coefficients' Standard Error from an "Aov" Model

Use the lm method for summary:

coef(summary.lm(model))

This gives a coefficient table, i.e. a matrix with four columns (estimate, standard error, t-value, p-value) for all identifiable coefficients. You can then extract the 2nd column for the standard errors.

aov returns an object of primary class "aov" and secondary class "lm", hence both summary.aov and summary.lm apply but give different things. When you simply call summary(model), the former is used, as a result of S3 method dispatch.
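
For example, a minimal sketch using the built-in npk data set (the model name is illustrative):

model <- aov(yield ~ N + P, data = npk)

class(model)
## [1] "aov" "lm"

## the "lm" method returns the coefficient matrix; column 2 holds the standard errors
coef(summary.lm(model))[, "Std. Error"]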

ldply extract standard deviation from coefficients (models)

The issue is with ldply(models, coef(summary(models))): inside ldply you are looping over models, and the second argument must be a function that works on each element of the list. You need to write a function that applies summary first, followed by coef. See:

library(plyr)  ## dlply() and ldply() live in the plyr package

linmod <- function(df) {
  lm(rbi ~ year, data = mutate(df, year = year - min(year)))
}

models <- dlply(baseball, .(id), linmod)
mod_coef <- ldply(models, coef)
results <- ldply(models, function(i) coef(summary(i)))

> head(results)
id Estimate Std. Error t value Pr(>|t|)
1 aaronha01 118.923913043 9.44994928 12.58460860 3.013683e-11
2 aaronha01 -1.732213439 0.73567755 -2.35458242 2.835224e-02
3 abernte02 0.554009613 0.44022430 1.25847122 2.274573e-01
4 abernte02 -0.002403238 0.03829954 -0.06274848 9.507953e-01
5 adairje01 18.831034483 11.77420551 1.59934651 1.337547e-01
6 adairje01 0.879310345 1.61731151 0.54368645 5.958584e-01

# to get standard error
> head(results[,"Std. Error"])
[1] 9.44994928 0.73567755 0.44022430 0.03829954 11.77420551 1.61731151

Extract p-value from aov

Here:

summary(test)[[1]][["Pr(>F)"]][1]
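
For instance, on a one-way fit (a minimal sketch using the built-in npk data; test stands for your own aov object):

test <- aov(yield ~ N, data = npk)

## summary(test) is a list with one ANOVA table per error stratum;
## [[1]] is that table, "Pr(>F)" its p-value column, [1] the first row (N)
summary(test)[[1]][["Pr(>F)"]][1]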

Steps after running a 3 way ANCOVA with 1 continuous covariate

A reproducible example based on your description:

dat <- data.frame(y = rnorm(150), x = rnorm(150),
                  f1 = sample(gl(3, 50, labels = letters[1:3])),
                  f2 = sample(gl(3, 50, labels = letters[1:3])),
                  f3 = sample(gl(3, 50, labels = letters[1:3])))

model <- lm(y ~ x + f1 * f2 * f3, data = dat)

## ?
summary(model)

## ?
summary.aov(model)


I can't remember the reason to run both lines of code.

You get different summary statistics from the two summary functions.

summary(model) is in fact summary.lm(model), because model was fitted using lm. This gives you a t-statistic for each individual fitted coefficient.

summary.aov(model) is effectively doing anova(model), giving you an ANOVA table with F-statistics. This is most useful here because you have interactions between factor variables.
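
You can check this equivalence directly (only the printed layout differs):

anova(model)        ## ANOVA table via the "lm" side
summary.aov(model)  ## the same F tests via the "aov" method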


Note that you may also fit your model using aov.

MODEL <- aov(y ~ x + f1 * f2 * f3, data = dat)

Now summary(MODEL) is in fact summary.aov(MODEL), because MODEL is fitted by aov.

## ANOVA table, the same as summary.aov(model)
summary(MODEL)

If you want to get t-statistics, you need to use

## the same as summary(model)
summary.lm(MODEL)

Confusing? Well, bear in mind that summary() is a generic function with different methods. Obviously, the "lm" and "aov" methods produce different things.

Further reading:

  • How to extract coefficients' standard error from an "aov" model

  • When should I use aov() and when anova()?


Reply

I was getting so worked up trying to figure out why the output was different and the significance of the predictors was different between the two functions, then I realized the same thing after taking a break! Thank you so much.

There was also little explanation in my notes about lm vs aov, so I appreciate it.

lm and aov perform the same computations internally but prioritize different summary output when you simply call summary(). Calling summary.lm() or summary.aov() explicitly makes your intent unambiguous.
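
A quick sketch of that point, reusing dat from above: the two fitting functions return identical coefficients; only the default summary differs.

fit_lm  <- lm(y ~ x + f1 * f2 * f3, data = dat)
fit_aov <- aov(y ~ x + f1 * f2 * f3, data = dat)
all.equal(coef(fit_lm), coef(fit_aov))  ## TRUE: the same least-squares fit underneath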

Is there a best way to move forward now knowing that the interaction between f2 and f3 is significant?

Usually we want to see if we can simplify our model structure. At the moment f1 * f2 * f3 includes all possible interactions, namely f1, f2, f3, f1:f2, f1:f3, f2:f3 and f1:f2:f3. Perhaps some of them can be dropped, as sketched below. But since my example data are just random values, the reported significance is pointless. If you need particular advice for your meaningful data, you can post a question on https://stats.stackexchange.com/.
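
Sketching one common route with the objects from above: drop1() respects marginality and tests only terms whose removal keeps the model hierarchy intact.

## does the three-way interaction earn its keep?
drop1(model, test = "F")

## if f1:f2:f3 is not significant, refit without it and compare
model2 <- update(model, . ~ . - f1:f2:f3)
anova(model2, model)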

pull out p-values and r-squared from a linear regression

r-squared: You can return the r-squared value directly from the summary object: summary(fit)$r.squared. See names(summary(fit)) for a list of all the items you can extract directly.
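
For example (fit here is a hypothetical model on the built-in mtcars data):

fit <- lm(mpg ~ wt + hp, data = mtcars)
summary(fit)$r.squared      ## R-squared
summary(fit)$adj.r.squared  ## adjusted R-squared is available the same way
names(summary(fit))         ## everything you can extract directly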

Model p-value: If you want to obtain the p-value of the overall regression model,
this blog post outlines a function to return the p-value:

lmp <- function(modelobject) {
  if (!inherits(modelobject, "lm")) stop("Not an object of class 'lm'")
  f <- summary(modelobject)$fstatistic
  p <- pf(f[1], f[2], f[3], lower.tail = FALSE)
  attributes(p) <- NULL  ## drop the names so a bare number is returned
  return(p)
}

> lmp(fit)
[1] 1.622665e-05

In the case of a simple regression with one predictor, the model p-value and the p-value for the coefficient will be the same.

Coefficient p-values: If you have more than one predictor, then the above will return the model p-value, and the p-value for coefficients can be extracted using:

summary(fit)$coefficients[,4]  

Alternatively, you can grab the p-value of coefficients from the anova(fit) object in a similar fashion to the summary object above.
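
For instance, on the same hypothetical fit (note that anova(fit) gives sequential per-term tests, not per-coefficient ones):

summary(fit)$coefficients[, 4]  ## one p-value per coefficient
anova(fit)[["Pr(>F)"]]          ## per-term p-values; the last entry (Residuals) is NA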

Two-way ANOVA with means and standard deviation

Yes, you can use model.tables(). Try ?model.tables to get acquainted with it. What you want is probably (I’ll use an example):

aov_out <- aov(yield ~ N * P, npk)

model.tables(aov_out, "means")             ## the means
model.tables(aov_out, se = TRUE)           ## standard errors for the effects
model.tables(aov_out, "means", se = TRUE)  ## means plus the standard error for the differences of means

Note, however, that the last command only works as long as you don’t have any random effects in your model. Hence, for a model like:

aov_out_ran <- aov(yield ~ N*P + Error(block), npk) 

the last command will not work as it is not yet implemented. But you will see that in a warning message anyway.

You can also just compute the means easily by hand. Refit your model with contrast coding:

aov_out_contr <- aov(yield ~ N * P, npk, contrasts = list(N = "contr.sum", P = "contr.sum"))

Use coef() to get the coefficients of the model:

coef_aov_out <- coef(aov_out_contr)

Because we used contrast coding, the (Intercept) at position coef_aov_out[1] is the grand mean, and the further coefficients in coef_aov_out are the main-effect and interaction effects that need to be added to or subtracted from the grand mean in order to get the group-specific means. You can compute them by hand like this:

# Compute mean of N0:
N0 <- coef_aov_out[1]+coef_aov_out[2]

# Compute mean of N1:
N1 <- coef_aov_out[1]-coef_aov_out[2]

# Compute mean of P0:
P0 <- coef_aov_out[1]+coef_aov_out[3]

# Compute mean of P1:
P1 <- coef_aov_out[1]-coef_aov_out[3]

# Compute mean of N0xP0:
N0P0 <- coef_aov_out[1]+coef_aov_out[2]+coef_aov_out[3]+coef_aov_out[4]

# Compute mean of N0xP1:
N0P1 <- coef_aov_out[1]+coef_aov_out[2]-coef_aov_out[3]-coef_aov_out[4]

# Compute mean of N1xP0:
N1P0 <- coef_aov_out[1]-coef_aov_out[2]+coef_aov_out[3]-coef_aov_out[4]

# Compute mean of N1xP1:
N1P1 <- coef_aov_out[1]-coef_aov_out[2]-coef_aov_out[3]+coef_aov_out[4]

You can compare the results with model.tables(aov_out_contr, "means"). It’s a good exercise to understand what’s going on.
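
A small sketch of that comparison, reusing the objects from above:

## hand-computed cell means ...
c(N0P0, N0P1, N1P0, N1P1)

## ... against the cell means computed directly from the data
with(npk, tapply(yield, list(N, P), mean))

## ... and against the table reported by R
model.tables(aov_out_contr, "means")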

extract values from time series forecast

If an object has multiple classes, it can be processed as any one of them (although a generic function normally dispatches on the first class; see How to extract coefficients' standard error from an "aov" model for an example).

A "mts" class is for a multivariate time series, it is a mixture of "matrix" class and "ts" class. You can access it as an ordinary matrix.

forecast[1, , drop = FALSE]  ## row 1, kept as a one-row matrix

fit <- forecast[1, 1]  ## point forecast
upr <- forecast[1, 2]  ## upper prediction bound
lwr <- forecast[1, 3]  ## lower prediction bound
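
As a concrete sketch, predict.HoltWinters() returns exactly this kind of "mts" object when prediction intervals are requested (columns fit, upr, lwr):

hw <- HoltWinters(co2)  ## co2 is a built-in monthly time series
forecast <- predict(hw, n.ahead = 12, prediction.interval = TRUE)
class(forecast)  ## c("mts", "ts", ...): dispatches on "mts"/"ts" first, but indexes like a matrix
forecast[1, ]    ## fit, upr and lwr for the first forecast step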

