Extract random effect variances from lme4 mer model object
lmer returns an S4 object, so this should work:
remat <- summary(study)@REmat
print(remat, quote=FALSE)
Which prints:
Groups Name Variance Std.Dev.
Subject (Intercept) 1378.18 37.124
Residual 960.46 30.991
...In general, you can look at the source of the print and summary methods for "mer" objects:
class(study) # mer
selectMethod("print", "mer")
selectMethod("summary", "mer")
Why won't VarCorr display variance for lmerModLmerTest or glmerMod objects?
comp is an argument to the print() method, not to VarCorr.
print(VarCorr(gm1), comp=c("Variance", "Std.Dev."))
Groups Name Variance Std.Dev.
herd (Intercept) 44.404 6.6636
Residual 14.513 3.8096
You might also be interested in
as.data.frame(VarCorr(gm1))[,c("vcov", "sdcor")]
vcov sdcor
1 44.40371 6.663611
2 14.51309 3.809605
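The full data frame returned by as.data.frame(VarCorr(...)) also carries grp, var1, and var2 columns identifying the grouping factor and term for each component, which is handy for filtering programmatically; for example:
vc_df <- as.data.frame(VarCorr(gm1))
## keep only the non-residual variance components
subset(vc_df, grp != "Residual", select = c(grp, var1, vcov, sdcor))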
Extract the confidence intervals of lmer random effects; plotted with dotplot(ranef())
rr <- ranef(fm1) ## condVar = TRUE has been the default for a while
With as.data.frame: this gives the conditional mode and SD, from which you can calculate the intervals (technically, these are not "confidence intervals" because the values of the BLUPs/conditional modes are not parameters ...)
dd <- as.data.frame(rr)
transform(dd, lwr = condval - 1.96*condsd, upr = condval + 1.96*condsd)
Or with broom.mixed::tidy:
broom.mixed::tidy(fm1, effects = "ran_vals", conf.int = TRUE)
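Either data frame can then be used to draw the dotplot(ranef())-style caterpillar plot yourself; a minimal ggplot2 sketch, using the dd object computed above (as.data.frame() gives columns grpvar, term, grp, condval, condsd):
library(ggplot2)
dd <- transform(dd, lwr = condval - 1.96*condsd, upr = condval + 1.96*condsd)
ggplot(dd, aes(x = condval, y = grp)) +
  geom_pointrange(aes(xmin = lwr, xmax = upr)) +
  facet_wrap(~ term, scales = "free_x") +
  labs(x = "conditional mode", y = "group level")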
broom.mixed::tidy() uses as.data.frame.ranef.mer() (the method called by as.data.frame) internally: this function takes the rather complicated data structure described in ?lme4::ranef and extracts the conditional modes and standard deviations in a more user-friendly format:
If ‘condVar’ is ‘TRUE’ the ‘"postVar"’
attribute is an array of dimension j by j by k (or a list of such
arrays). The kth face of this array is a positive definite
symmetric j by j matrix. If there is only one grouping factor in
the model the variance-covariance matrix for the entire random
effects vector, conditional on the estimates of the model
parameters and on the data, will be block diagonal; this j by j
matrix is the kth diagonal block. With multiple grouping factors
the faces of the ‘"postVar"’ attributes are still the diagonal
blocks of this conditional variance-covariance matrix but the
matrix itself is no longer block diagonal.
In this particular case, here's what you need to do to replicate the condsd column of as.data.frame():
## get the 'postVar' attribute of the first (and only) RE term
aa <- attr(rr$Subject, "postVar")
## for each slice of the array, extract the diagonal;
## transpose and drop dimensions;
## take the square root
sqrt(c(t(apply(aa, 3, diag))))
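As a quick check, this should reproduce the condsd column (assuming dd was built with as.data.frame(rr) as above, whose rows are ordered term by term within the grouping factor):
all.equal(sqrt(c(t(apply(aa, 3, diag)))), dd$condsd)  # TRUE if the orderings line up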
Extract posterior estimate and credible intervals for random effect for lme4 model in R
To extract the estimate and 95% CI for your random effects, you use the following code:
library(arm)       # for sim()
library(MCMCglmm)  # for posterior.mode(); also loads coda for as.mcmc()/HPDinterval()
sm.surv <- sim(m.surv)
# between-Chick variance
bChick <- sm.surv@ranef$Chick[,,1]
bvar <- as.vector(apply(bChick, 1, var)) # posterior distribution of the between-individual variance
bvar <- as.mcmc(bvar)
posterior.mode(bvar) # mode of the distribution
HPDinterval(bvar)
This will then give you:
> posterior.mode(bvar)
var1
501.24353
> HPDinterval(bvar)
lower upper
var1 412.36042 630.201
attr(,"Probability")
[1] 0.95
This means that the posterior mode of the between-individual variance is about 501, and the 95% highest posterior density interval runs from about 412 to 630.
Extract Fixed Effect and Random Effect in Dataframe
Use str to see the structure of an object.
str(fixEffect)
# named vector, can probably be coerced to data.frame
View(as.data.frame(fixEffect))
# works just fine
str(randEffect)
# list of data frames (well, list of one data frame in this case)
View(randEffect$Subject)
If you had, say, slopes that also varied by Subject, they would go in the same Subject data frame as the Subject-level intercepts. However, if intercepts also varied by some other variable, group, with a different number of levels than Subject, they obviously couldn't go in the same data frame. This is why a list of data frames is used, so that the same structure generalizes to more complex models.
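If you would rather have everything in one flat data frame regardless of how many grouping factors the model has, the as.data.frame method for ranef objects (or broom.mixed::tidy, which wraps it) will do that; a minimal sketch, assuming ranef() was called with condVar = TRUE (the default in recent lme4):
## one row per (grouping factor, term, level) combination
ranef_df <- as.data.frame(randEffect)
head(ranef_df)  # columns: grpvar, term, grp, condval, condsd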
Fixing variance values in lme4
This is possible, if a little hacky. Here's a reproducible example:
Fit the original model:
library(lme4)
set.seed(101)
ss <- sleepstudy[sample(nrow(sleepstudy),size=round(0.9*nrow(sleepstudy))),]
m1 <- lmer(Reaction~Days+(1|Subject)+(0+Days|Subject),ss)
fixef(m1)
## (Intercept) Days
## 251.55172 10.37874
Recover the deviance (in this case REML-criterion) function:
dd <- as.function(m1)
I'm going to set the standard deviations to zero so that I have something to compare with, i.e. the coefficients of the regular linear model. (The parameter vector for dd contains the column-wise, lower-triangular, concatenated Cholesky factors for the scaled random-effects terms in the model. Luckily, if all you have are scalar/intercept-only random effects (e.g. (1|x)), then these correspond to the random-effect standard deviations, scaled by the model standard deviation.)
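As an aside (not part of the original recipe), you can inspect the fitted parameter vector on that scale directly:
getME(m1, "theta")                    # fitted theta (scaled Cholesky) parameters
sqrt(unlist(VarCorr(m1))) / sigma(m1) # same values for these scalar random-effects terms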
(ff <- dd(c(0,0))) ## new REML: 1704.708
environment(dd)$pp$beta(1) ## new parameters
## [1] 251.11920 10.56979
Matches:
coef(lm(Reaction~Days,ss))
## (Intercept) Days
## 251.11920 10.56979
If you want to construct a new merMod object you can do it as follows ...
opt <- list(par=c(0,0),fval=ff,conv=0)
lmod <- lFormula(Reaction~Days+(1|Subject)+(0+Days|Subject),ss)
m1X <- mkMerMod(environment(dd), opt, lmod$reTrms, fr = lmod$fr,
mc = quote(hacked_lmer()))
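A quick sanity check on the hacked object (it behaves like a regular merMod, so the usual extractors should work):
fixef(m1X)    # should match coef(lm(Reaction ~ Days, ss))
VarCorr(m1X)  # both random-effect standard deviations forced to zero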
Now suppose we want to set the variances to particular non-zero values (say 700 and 30). This will be a little bit tricky because of the scaling by the residual standard deviation ...
newvar <- c(700,30)
ff2 <- dd(sqrt(newvar)/sigma(m1))
opt2 <- list(par=sqrt(newvar)/sigma(m1),fval=ff2,conv=0)
m2X <- mkMerMod(environment(dd), opt2, lmod$reTrms, fr = lmod$fr,
mc = quote(hacked_lmer()))
VarCorr(m2X)
unlist(VarCorr(m2X))
## Subject Subject.1
## 710.89304 30.46684
So this doesn't get us quite where we wanted (because the residual variance changes ...)
buildMM <- function(theta) {
dd <- as.function(m1)
ff <- dd(theta)
opt <- list(par=c(0,0),fval=ff,conv=0)
mm <- mkMerMod(environment(dd), opt, lmod$reTrms, fr = lmod$fr,
mc = quote(hacked_lmer()))
return(mm)
}
objfun <- function(x,target=c(700,30)) {
mm <- buildMM(sqrt(x))
return(sum((unlist(VarCorr(mm))-target)^2))
}
s0 <- c(700,30)/sigma(m1)^2
opt <- optim(fn=objfun,par=s0)
mm_final <- buildMM(sqrt(opt$par))
summary(mm_final)
## Random effects:
## Groups Name Variance Std.Dev.
## Subject (Intercept) 700 26.458
## Subject.1 Days 30 5.477
## Residual 700 26.458
## Number of obs: 162, groups: Subject, 18
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 251.580 7.330 34.32
## Days 10.378 1.479 7.02
By the way, it's not generally recommended to use random effects when the grouping variables have a very small number of levels (e.g. < 5 or 6): see here ...
R: Modeling random intercepts from lme4 or brms objects
I don't know about lme4 or brms, but it can be done directly in Stan. Here's a way to replicate your model, but with everything jointly estimated; see this example (in Python, not R) and this paper for more.
The model
We model the observed values y_i with by-person random intercepts alpha_j (one for each person j) and a scale parameter sigma:
y_i ~ Normal(alpha_person[i], sigma)
We model the by-person random intercepts as a function of the person's sex and the coefficient beta, plus a second scale parameter sigma_alpha:
alpha_j ~ Normal(mu_alpha + beta * sex_j, sigma_alpha)
The Stan code
Here's the Stan code for the model described above. It uses a non-centered parameterization: the "raw" values of alpha_j get a standard normal prior, and we convert them to the "real" values of alpha_j using mu_alpha, sigma_alpha, and beta * sex_j.
data {
int<lower=0> N_obs; // number of observations
int<lower=2> N_person; // number of persons
int<lower=1,upper=N_person> person[N_obs]; // person associated with each observation
vector<lower=0,upper=1>[N_person] sex; // sex of each person
vector[N_obs] y; // observed outcomes
}
parameters {
real<lower=0> sigma; // observation-level variation
vector[N_person] alpha_person_raw; // random intercepts for persons
real mu_alpha; // mean of random intercepts for persons
real<lower=0> sigma_alpha; // scale of random intercepts for persons
real beta; // coefficients for sex
}
transformed parameters {
vector[N_person] alpha_person = (alpha_person_raw * sigma_alpha) +
mu_alpha +
(sex * beta);
}
model {
sigma ~ exponential(1);
alpha_person_raw ~ std_normal();
mu_alpha ~ std_normal();
sigma_alpha ~ exponential(1);
beta ~ std_normal();
y ~ normal(alpha_person[person], sigma);
}
Fitting the Stan model
I recreated the dataset using the same rules as in the original example, but in a slightly different format that's easier for Stan to work with.
library(tidyverse)
# N (number of persons) and time (observations per person) are as defined in the original question
person.df = data.frame(person = 1:N, sex = rbinom(N, size = 1, prob = 0.5))
obs.df = data.frame(person = rep(1:N, time),
x = rnorm(N * time)) %>%
mutate(y = 2 + (x * runif(N * time)))
library(rstan)
stan.data = list(
N_obs = nrow(obs.df),
N_person = nrow(person.df),
person = obs.df$person,
sex = person.df$sex,
y = obs.df$y
)
stan.model = stan("two_level_model.stan", data = stan.data, chains = 3)
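It's worth inspecting the fit before comparing; a minimal sketch using rstan's print method (the parameter names are those defined in the Stan program above):
print(stan.model, pars = c("mu_alpha", "beta", "sigma_alpha", "sigma"),
      probs = c(0.025, 0.5, 0.975))  # posterior summaries plus n_eff and Rhat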
Comparing the two methods
First, I re-fit the two-stage model using the new datasets.
first.level.m = lmer(y ~ -1 + (1 | person), obs.df)
second.level.m = lm(intercept ~ 1 + sex,
person.df %>%
mutate(intercept = ranef(first.level.m)$person[["(Intercept)"]]))
Now, let's compare the estimated parameter values. The two approaches agree that the interval for the effect of sex includes 0 and that the average of the random intercepts is approximately 2. (Naturally, these estimates are influenced by the fact that x isn't yet included as a predictor in the models.) They estimate approximately the same error term at the observation level, but the joint model estimates a substantially smaller error term for the model of the random intercepts. I'm not sure how the model-fitting procedure of lme4 relates to the exponential prior I put on sigma_alpha in the Stan model.
library(tidybayes)
bind_rows(
spread_draws(stan.model, mu_alpha, beta, sigma_alpha, sigma) %>%
ungroup() %>%
dplyr::select(.draw, mu_alpha, beta, sigma_alpha, sigma) %>%
pivot_longer(cols = -.draw, names_to = "parameter") %>%
group_by(parameter) %>%
summarise(lower.95 = quantile(value, 0.025),
lower.50 = quantile(value, 0.25),
est = median(value),
upper.50 = quantile(value, 0.75),
upper.95 = quantile(value, 0.975)) %>%
ungroup() %>%
mutate(parameter = case_when(parameter == "mu_alpha" ~ "mu[alpha]",
parameter == "beta" ~ "beta",
parameter == "sigma_alpha" ~ "sigma[alpha]",
parameter == "sigma" ~ "sigma"),
model = "joint"),
summary(second.level.m)$coefficients %>%
data.frame() %>%
rownames_to_column("parameter") %>%
mutate(lower.95 = Estimate + (Std..Error * qt(0.025, second.level.m$df.residual)),
lower.50 = Estimate + (Std..Error * qt(0.25, second.level.m$df.residual)),
est = Estimate,
upper.50 = Estimate + (Std..Error * qt(0.75, second.level.m$df.residual)),
upper.95 = Estimate + (Std..Error * qt(0.975, second.level.m$df.residual)),
parameter = case_when(parameter == "(Intercept)" ~ "mu[alpha]",
parameter == "sex" ~ "beta"),
model = "two-stage") %>%
dplyr::select(model, parameter, est, matches("lower|upper")),
data.frame(est = summary(second.level.m)$sigma,
parameter = "sigma[alpha]",
model = "two-stage"),
data.frame(est = summary(first.level.m)$sigma,
parameter = "sigma",
model = "two-stage")
) %>%
mutate(parameter = fct_relevel(parameter, "beta", "mu[alpha]", "sigma[alpha]",
"sigma")) %>%
ggplot(aes(x = parameter, color = model)) +
geom_linerange(aes(ymin = lower.95, ymax = upper.95), size = 1,
position = position_dodge(width = 0.3)) +
geom_linerange(aes(ymin = lower.50, ymax = upper.50), size = 2,
position = position_dodge(width = 0.3)) +
geom_point(aes(y = est), size = 3, position = position_dodge(width = 0.3)) +
scale_x_discrete(labels = scales::parse_format()) +
labs(x = "Parameter", y = "Estimated parameter value", color = "Model") +
coord_flip() +
theme_bw()
Extract treatment means from an lmer object and calculate error bars
An alternative that uses functions in package languageR. I call your data set df.
library(lme4)
library(languageR)
library(ggplot2)
# fit model
# n.b. I don't claim that this is a sensible model
# It is just used to demonstrate the plot
mod <- lmer(DV ~ TMT1 * TMT2 + (1|Block), data = df)
# create MCMC matrix
mcmc <- pvals.fnc(mod, nsim = 1000, withMCMC = TRUE)
# pvals.fnc also calculates MCMC-based p-values and HPD confidence intervals,
# and plots the posterior distributions of the parameters
# plot using plotLMER.fnc
# in addition, set withList = TRUE to create a list of data frames with plot data
# which can be used for a (possibly prettier) plot in ggplot
ll <- plotLMER.fnc(mod, withList = TRUE, pred = "TMT1",
intr = list(
"TMT2",
c("C", "D"),
"end",
list(c("red", "blue"), rep(1, 2))),
addlines = TRUE,
mcmcMat = mcmc$mcmc)
# here follows additional steps to plot using ggplot
# convert list to data frame
df <- do.call(rbind, ll$TMT1)
# rename
names(df)[names(df) == "Levels"] <- "TMT1"
# add TMT2
df$TMT2 <- rep(c("C", "D"), each = 2)
# plot using ggplot
dodge <- position_dodge(width = 0.1)
ggplot(data = df, aes(x = TMT1, y = Y, col = TMT2, group = TMT2)) +
geom_point(position = dodge, size = 3) +
geom_errorbar(aes(ymax = upper, ymin = lower, width = 0.1), position = dodge) +
geom_line(position = dodge) +
ylab("DV") +
theme_classic()
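Note that pvals.fnc() depends on lme4's old mcmcsamp() sampler, which is no longer available in current lme4 releases, so the languageR route only works with an old lme4. As an alternative sketch (not part of the original answer), the emmeans package can compute the treatment-combination means and confidence intervals directly from the lmer fit:
library(emmeans)
# estimated marginal means for each TMT1 x TMT2 combination, with confidence limits
emm <- as.data.frame(emmeans(mod, ~ TMT1 * TMT2))  # columns include emmean, lower.CL, upper.CL
ggplot(emm, aes(x = TMT1, y = emmean, col = TMT2, group = TMT2)) +
  geom_point(position = dodge, size = 3) +
  geom_errorbar(aes(ymax = upper.CL, ymin = lower.CL), width = 0.1, position = dodge) +
  geom_line(position = dodge) +
  ylab("DV") +
  theme_classic()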