Extract random effect variances from lme4 mer model object
lmer returns an S4 object, so this should work:
remat <- summary(study)@REmat
print(remat, quote=FALSE)
Which prints:
Groups Name Variance Std.Dev.
Subject (Intercept) 1378.18 37.124
Residual 960.46 30.991
...In general, you can look at the source of the print and summary methods for "mer" objects:
class(study) # mer
selectMethod("print", "mer")
selectMethod("summary", "mer")
Why won't VarCorr display variance for lmerModLmerTest or glmerMod objects?
comp is an argument to the print() method, not to VarCorr.
print(VarCorr(gm1), comp=c("Variance", "Std.Dev."))
Groups Name Variance Std.Dev.
herd (Intercept) 44.404 6.6636
Residual 14.513 3.8096
You might also be interested in
as.data.frame(VarCorr(gm1))[,c("vcov", "sdcor")]
vcov sdcor
1 44.40371 6.663611
2 14.51309 3.809605
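The full data frame returned by as.data.frame(VarCorr(...)) also carries grp, var1, and var2 columns identifying the grouping factor and term for each component, which is handy for filtering programmatically; for example:
vc_df <- as.data.frame(VarCorr(gm1))
## keep only the non-residual variance components
subset(vc_df, grp != "Residual", select = c(grp, var1, vcov, sdcor))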
Extract the confidence intervals of lmer random effects; plotted with dotplot(ranef())
rr <- ranef(fm1) ## condVar = TRUE has been the default for a while
With as.data.frame: this gives the conditional mode and SD, from which you can calculate the intervals (technically, these are not "confidence intervals" because the values of the BLUPs/conditional modes are not parameters ...)
dd <- as.data.frame(rr)
transform(dd, lwr = condval - 1.96*condsd, upr = condval + 1.96*condsd)
Or with broom.mixed::tidy:
broom.mixed::tidy(fm1, effects = "ran_vals", conf.int = TRUE)
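Either data frame can then be used to draw the dotplot(ranef())-style caterpillar plot yourself; a minimal ggplot2 sketch, using the dd object computed above (as.data.frame() gives columns grpvar, term, grp, condval, condsd):
library(ggplot2)
dd <- transform(dd, lwr = condval - 1.96*condsd, upr = condval + 1.96*condsd)
ggplot(dd, aes(x = condval, y = grp)) +
  geom_pointrange(aes(xmin = lwr, xmax = upr)) +
  facet_wrap(~ term, scales = "free_x") +
  labs(x = "conditional mode", y = "group level")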
broom.mixed::tidy() uses as.data.frame.ranef.mer() (the method called by as.data.frame) internally: this function takes the rather complicated data structure described in ?lme4::ranef and extracts the conditional modes and standard deviations in a more user-friendly format:
If ‘condVar’ is ‘TRUE’ the ‘"postVar"’
attribute is an array of dimension j by j by k (or a list of such
arrays). The kth face of this array is a positive definite
symmetric j by j matrix. If there is only one grouping factor in
the model the variance-covariance matrix for the entire random
effects vector, conditional on the estimates of the model
parameters and on the data, will be block diagonal; this j by j
matrix is the kth diagonal block. With multiple grouping factors
the faces of the ‘"postVar"’ attributes are still the diagonal
blocks of this conditional variance-covariance matrix but the
matrix itself is no longer block diagonal.
In this particular case, here's what you need to do to replicate the condsd column of as.data.frame():
## get the 'postVar' attribute of the first (and only) RE term
aa <- attr(rr$Subject, "postVar")
## for each slice of the array, extract the diagonal;
## transpose and drop dimensions;
## take the square root
sqrt(c(t(apply(aa, 3, diag))))
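As a quick check, this should reproduce the condsd column (assuming dd was built with as.data.frame(rr) as above, whose rows are ordered term by term within the grouping factor):
all.equal(sqrt(c(t(apply(aa, 3, diag)))), dd$condsd)  # TRUE if the orderings line up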
Extract posterior estimate and credible intervals for random effect for lme4 model in R
To extract the estimate and 95% CI for your random effects, you use the following code:
library(arm)       # for sim()
library(MCMCglmm)  # for posterior.mode(); also loads coda for as.mcmc()/HPDinterval()
sm.surv <- sim(m.surv)
# between-Chick variance
bChick <- sm.surv@ranef$Chick[,,1]
bvar <- as.vector(apply(bChick, 1, var)) # posterior distribution of the between-individual variance
bvar <- as.mcmc(bvar)
posterior.mode(bvar) # mode of the distribution
HPDinterval(bvar)
This will then give you:
> posterior.mode(bvar)
var1
501.24353
> HPDinterval(bvar)
lower upper
var1 412.36042 630.201
attr(,"Probability")
[1] 0.95
This means that the posterior mode of the between-individual variance is about 501, and the 95% highest posterior density interval runs from about 412 to 630.
Extract Fixed Effect and Random Effect in Dataframe
Use str to see the structure of an object.
str(fixEffect)
# named vector, can probably be coerced to data.frame
View(as.data.frame(fixEffect))
# works just fine
str(randEffect)
# list of data frames (well, list of one data frame in this case)
View(randEffect$Subject)
If you had, say, slopes that also varied by Subject, they would go in the same Subject data frame as the Subject-level intercepts. However, if intercepts also varied by some other variable, group, with a different number of levels than Subject, they obviously couldn't go in the same data frame. This is why a list of data frames is used, so that the same structure generalizes to more complex models.
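If you would rather have everything in one flat data frame regardless of how many grouping factors the model has, the as.data.frame method for ranef objects (or broom.mixed::tidy, which wraps it) will do that; a minimal sketch, assuming ranef() was called with condVar = TRUE (the default in recent lme4):
## one row per (grouping factor, term, level) combination
ranef_df <- as.data.frame(randEffect)
head(ranef_df)  # columns: grpvar, term, grp, condval, condsd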
Fixing variance values in lme4
This is possible, if a little hacky. Here's a reproducible example:
Fit the original model:
library(lme4)
set.seed(101)
ss <- sleepstudy[sample(nrow(sleepstudy),size=round(0.9*nrow(sleepstudy))),]
m1 <- lmer(Reaction~Days+(1|Subject)+(0+Days|Subject),ss)
fixef(m1)
## (Intercept) Days
## 251.55172 10.37874
Recover the deviance (in this case REML-criterion) function:
dd <- as.function(m1)
I'm going to set the standard deviations to zero so that I have something to compare with, i.e. the coefficients of the regular linear model. (The parameter vector for dd contains the column-wise, lower-triangular, concatenated Cholesky factors for the scaled random-effects terms in the model. Luckily, if all you have are scalar/intercept-only random effects (e.g. (1|x)), then these correspond to the random-effect standard deviations, scaled by the model standard deviation.)
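As an aside (not part of the original recipe), you can inspect the fitted parameter vector on that scale directly:
getME(m1, "theta")                    # fitted theta (scaled Cholesky) parameters
sqrt(unlist(VarCorr(m1))) / sigma(m1) # same values for these scalar random-effects terms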
(ff <- dd(c(0,0))) ## new REML: 1704.708
environment(dd)$pp$beta(1) ## new parameters
## [1] 251.11920 10.56979
Matches:
coef(lm(Reaction~Days,ss))
## (Intercept) Days
## 251.11920 10.56979
If you want to construct a new merMod object you can do it as follows ...
opt <- list(par=c(0,0),fval=ff,conv=0)
lmod <- lFormula(Reaction~Days+(1|Subject)+(0+Days|Subject),ss)
m1X <- mkMerMod(environment(dd), opt, lmod$reTrms, fr = lmod$fr,
mc = quote(hacked_lmer()))
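A quick sanity check on the hacked object (it behaves like a regular merMod, so the usual extractors should work):
fixef(m1X)    # should match coef(lm(Reaction ~ Days, ss))
VarCorr(m1X)  # both random-effect standard deviations forced to zero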
Now suppose we want to set the variances to particular non-zero values (say 700 and 30). This will be a little bit tricky because of the scaling by the residual standard deviation ...
newvar <- c(700,30)
ff2 <- dd(sqrt(newvar)/sigma(m1))
opt2 <- list(par=sqrt(newvar)/sigma(m1),fval=ff2,conv=0)
m2X <- mkMerMod(environment(dd), opt2, lmod$reTrms, fr = lmod$fr,
mc = quote(hacked_lmer()))
VarCorr(m2X)
unlist(VarCorr(m2X))
## Subject Subject.1
## 710.89304 30.46684
So this doesn't get us quite where we wanted (because the residual variance changes ...)
buildMM <- function(theta) {
dd <- as.function(m1)
ff <- dd(theta)
opt <- list(par=c(0,0),fval=ff,conv=0)
mm <- mkMerMod(environment(dd), opt, lmod$reTrms, fr = lmod$fr,
mc = quote(hacked_lmer()))
return(mm)
}
objfun <- function(x,target=c(700,30)) {
mm <- buildMM(sqrt(x))
return(sum((unlist(VarCorr(mm))-target)^2))
}
s0 <- c(700,30)/sigma(m1)^2
opt <- optim(fn=objfun,par=s0)
mm_final <- buildMM(sqrt(opt$par))
summary(mm_final)
## Random effects:
## Groups Name Variance Std.Dev.
## Subject (Intercept) 700 26.458
## Subject.1 Days 30 5.477
## Residual 700 26.458
## Number of obs: 162, groups: Subject, 18
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 251.580 7.330 34.32
## Days 10.378 1.479 7.02
By the way, it's not generally recommended to use random effects when the grouping variables have a very small number of levels (e.g. < 5 or 6): see here ...
R: Modeling random intercepts from lme4 or brms objects
I don't know about lme4 or brms, but it can be done directly in Stan. Here's a way to replicate your model, but with everything jointly estimated; see this example (in Python, not R) and this paper for more.
The model
We model the observed values y_i with by-person random intercepts alpha_j (one for each person j) and a scale parameter sigma:
y_i ~ Normal(alpha_person[i], sigma)
We model the by-person random intercepts as a function of the person's sex and the coefficient beta, plus a second scale parameter sigma_alpha:
alpha_j ~ Normal(mu_alpha + beta * sex_j, sigma_alpha)
The Stan code
Here's the Stan code for the model described above. It uses a non-centered parameterization: the "raw" values of alpha_j get a standard normal prior, and we convert them to the "real" values of alpha_j using mu_alpha, sigma_alpha, and beta * sex_j.
data {
int<lower=0> N_obs; // number of observations
int<lower=2> N_person; // number of persons
int<lower=1,upper=N_person> person[N_obs]; // person associated with each observation
vector<lower=0,upper=1>[N_person] sex; // sex of each person
vector[N_obs] y; // observed outcomes
}
parameters {
real<lower=0> sigma; // observation-level variation
vector[N_person] alpha_person_raw; // random intercepts for persons
real mu_alpha; // mean of random intercepts for persons
real<lower=0> sigma_alpha; // scale of random intercepts for persons
real beta; // coefficients for sex
}
transformed parameters {
vector[N_person] alpha_person = (alpha_person_raw * sigma_alpha) +
mu_alpha +
(sex * beta);
}
model {
sigma ~ exponential(1);
alpha_person_raw ~ std_normal();
mu_alpha ~ std_normal();
sigma_alpha ~ exponential(1);
beta ~ std_normal();
y ~ normal(alpha_person[person], sigma);
}
Fitting the Stan model
I recreated the dataset using the same rules as in the original example, but in a slightly different format that's easier for Stan to work with.
library(tidyverse)
# N (number of persons) and time (observations per person) are as defined in the original question
person.df = data.frame(person = 1:N, sex = rbinom(N, size = 1, prob = 0.5))
obs.df = data.frame(person = rep(1:N, time),
x = rnorm(N * time)) %>%
mutate(y = 2 + (x * runif(N * time)))
library(rstan)
stan.data = list(
N_obs = nrow(obs.df),
N_person = nrow(person.df),
person = obs.df$person,
sex = person.df$sex,
y = obs.df$y
)
stan.model = stan("two_level_model.stan", data = stan.data, chains = 3)
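It's worth inspecting the fit before comparing; a minimal sketch using rstan's print method (the parameter names are those defined in the Stan program above):
print(stan.model, pars = c("mu_alpha", "beta", "sigma_alpha", "sigma"),
      probs = c(0.025, 0.5, 0.975))  # posterior summaries plus n_eff and Rhat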
Comparing the two methods
First, I re-fit the two-stage model using the new datasets.
first.level.m = lmer(y ~ -1 + (1 | person), obs.df)
second.level.m = lm(intercept ~ 1 + sex,
person.df %>%
mutate(intercept = ranef(first.level.m)$person[["(Intercept)"]]))
Now, let's compare the estimated parameter values. The two approaches agree that the interval for the effect of sex includes 0 and that the average of the random intercepts is approximately 2. (Naturally, these estimates are influenced by the fact that x isn't yet included as a predictor in the models.) They estimate approximately the same error term at the observation level, but the joint model estimates a substantially smaller error term for the model of the random intercepts. I'm not sure how the model-fitting procedure of lme4 relates to the exponential prior I put on sigma_alpha in the Stan model.
library(tidybayes)
bind_rows(
spread_draws(stan.model, mu_alpha, beta, sigma_alpha, sigma) %>%
ungroup() %>%
dplyr::select(.draw, mu_alpha, beta, sigma_alpha, sigma) %>%
pivot_longer(cols = -.draw, names_to = "parameter") %>%
group_by(parameter) %>%
summarise(lower.95 = quantile(value, 0.025),
lower.50 = quantile(value, 0.25),
est = median(value),
upper.50 = quantile(value, 0.75),
upper.95 = quantile(value, 0.975)) %>%
ungroup() %>%
mutate(parameter = case_when(parameter == "mu_alpha" ~ "mu[alpha]",
parameter == "beta" ~ "beta",
parameter == "sigma_alpha" ~ "sigma[alpha]",
parameter == "sigma" ~ "sigma"),
model = "joint"),
summary(second.level.m)$coefficients %>%
data.frame() %>%
rownames_to_column("parameter") %>%
mutate(lower.95 = Estimate + (Std..Error * qt(0.025, second.level.m$df.residual)),
lower.50 = Estimate + (Std..Error * qt(0.25, second.level.m$df.residual)),
est = Estimate,
upper.50 = Estimate + (Std..Error * qt(0.75, second.level.m$df.residual)),
upper.95 = Estimate + (Std..Error * qt(0.975, second.level.m$df.residual)),
parameter = case_when(parameter == "(Intercept)" ~ "mu[alpha]",
parameter == "sex" ~ "beta"),
model = "two-stage") %>%
dplyr::select(model, parameter, est, matches("lower|upper")),
data.frame(est = summary(second.level.m)$sigma,
parameter = "sigma[alpha]",
model = "two-stage"),
data.frame(est = summary(first.level.m)$sigma,
parameter = "sigma",
model = "two-stage")
) %>%
mutate(parameter = fct_relevel(parameter, "beta", "mu[alpha]", "sigma[alpha]",
"sigma")) %>%
ggplot(aes(x = parameter, color = model)) +
geom_linerange(aes(ymin = lower.95, ymax = upper.95), size = 1,
position = position_dodge(width = 0.3)) +
geom_linerange(aes(ymin = lower.50, ymax = upper.50), size = 2,
position = position_dodge(width = 0.3)) +
geom_point(aes(y = est), size = 3, position = position_dodge(width = 0.3)) +
scale_x_discrete(labels = scales::parse_format()) +
labs(x = "Parameter", y = "Estimated parameter value", color = "Model") +
coord_flip() +
theme_bw()
Extract treatment means from an lmer object and calculate error bars
An alternative that uses functions in package languageR. I call your data set df.
library(lme4)
library(languageR)
library(ggplot2)
# fit model
# n.b. I don't claim that this is a sensible model
# It is just used to demonstrate the plot
mod <- lmer(DV ~ TMT1 * TMT2 + (1|Block), data = df)
# create MCMC matrix
mcmc <- pvals.fnc(mod, nsim = 1000, withMCMC = TRUE)
# pvals.fnc also calculates MCMC-based p-values and HPD confidence intervals,
# and plots the posterior distributions of the parameters
# plot using plotLMER.fnc
# in addition, set withList = TRUE to create a list of data frames with plot data
# which can be used for a (possibly prettier) plot in ggplot
ll <- plotLMER.fnc(mod, withList = TRUE, pred = "TMT1",
intr = list(
"TMT2",
c("C", "D"),
"end",
list(c("red", "blue"), rep(1, 2))),
addlines = TRUE,
mcmcMat = mcmc$mcmc)
# here follows additional steps to plot using ggplot
# convert list to data frame
df <- do.call(rbind, ll$TMT1)
# rename
names(df)[names(df) == "Levels"] <- "TMT1"
# add TMT2
df$TMT2 <- rep(c("C", "D"), each = 2)
# plot using ggplot
dodge <- position_dodge(width = 0.1)
ggplot(data = df, aes(x = TMT1, y = Y, col = TMT2, group = TMT2)) +
geom_point(position = dodge, size = 3) +
geom_errorbar(aes(ymax = upper, ymin = lower, width = 0.1), position = dodge) +
geom_line(position = dodge) +
ylab("DV") +
theme_classic()
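Note that pvals.fnc() depends on lme4's old mcmcsamp() sampler, which is no longer available in current lme4 releases, so the languageR route only works with an old lme4. As an alternative sketch (not part of the original answer), the emmeans package can compute the treatment-combination means and confidence intervals directly from the lmer fit:
library(emmeans)
# estimated marginal means for each TMT1 x TMT2 combination, with confidence limits
emm <- as.data.frame(emmeans(mod, ~ TMT1 * TMT2))  # columns include emmean, lower.CL, upper.CL
ggplot(emm, aes(x = TMT1, y = emmean, col = TMT2, group = TMT2)) +
  geom_point(position = dodge, size = 3) +
  geom_errorbar(aes(ymax = upper.CL, ymin = lower.CL), width = 0.1, position = dodge) +
  geom_line(position = dodge) +
  ylab("DV") +
  theme_classic()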