lme4::glmer vs. Stata's melogit command
The glmer fit will probably be much faster with the optional argument nAGQ=0L. You have many fixed-effects parameters (20 levels for each of study_quarter and dd_quarter generate a total of 28 contrasts) and the default optimization method (corresponding to nAGQ=1L) puts all of those coefficients into the general nonlinear optimization call. With nAGQ=0L these coefficients are instead optimized within the much faster penalized iteratively reweighted least squares (PIRLS) algorithm. The default will generally provide a better estimate in the sense that the deviance at the estimate is lower, but the difference is usually very small and the time difference is enormous.
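As a minimal sketch of the two settings (using the built-in cbpp data as a stand-in, since the poster's data aren't shown; the speed advantage of nAGQ=0L grows with the number of fixed-effects parameters):

```r
library(lme4)

## default: fixed-effect coefficients inside the nonlinear optimization (nAGQ = 1L)
fit1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
              data = cbpp, family = binomial)

## faster: fixed-effect coefficients handled by PIRLS
fit0 <- update(fit1, nAGQ = 0L)

## the deviances typically differ only slightly
c(nAGQ1 = deviance(fit1), nAGQ0 = deviance(fit0))
```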
I have a write-up of the differences between these algorithms as a Jupyter notebook, nAGQ.ipynb. That write-up uses the MixedModels package for Julia instead of lme4, but the methods are similar. (I am one of the authors of lme4 and the author of MixedModels.)
If you are going to be doing a lot of GLMM fits, I would consider doing so in Julia with MixedModels. It is often much faster than R, even with all the complicated code in lme4.
How do I fit a quasi-poisson model with lme4 or glmmTMB?
The recipe given in the link will work just as well for (quasi)Poisson as for (quasi)binomial models. The key is that quasi-likelihood models really represent a post-fitting adjustment to the standard errors of the parameters and the associated statistics; they don't (or shouldn't ...) change anything about the way the model is fitted.
glmer is a bit fussy about "discrete responses" (binomial, Poisson, etc.) actually being discrete, but glmmTMB is looser/more forgiving.
This way of doing it puts as much of the variance as can be explained by the random effects there, then does a post hoc adjustment for any remaining over (or under)dispersion.
We'll use the grouseticks data set from Elston et al. (2001); the original analysis used observation-level random effects, rather than quasi-likelihood, to handle overdispersion at the level of individual observations (= number of ticks counted on a single chick, nested within brood, nested within location).
library(lme4)
g <- transform(grouseticks, sHEIGHT = drop(scale(HEIGHT)))
form <- TICKS~YEAR+sHEIGHT+(1|BROOD)+(1|LOCATION)
full_mod1 <- glmer(form, family="poisson", data=g)
There is moderate overdispersion: deviance(full_mod1)/df.residual(full_mod1) is 1.86. (Computing the ratio of the sum of squared Pearson residuals to the residual degrees of freedom, as we do below, is slightly more robust.)
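As a quick check, both dispersion estimates mentioned above can be computed directly from the fitted model; the Pearson-based version is the one the adjustment function below uses:

```r
## two estimates of the dispersion factor phi for full_mod1
deviance(full_mod1) / df.residual(full_mod1)   # deviance-based (~1.86)
sum(residuals(full_mod1, type = "pearson")^2) /
    df.residual(full_mod1)                     # Pearson-based (more robust)
```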
Unadjusted coefficient table:
printCoefmat(coef(summary(full_mod1)), digits=2)
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.47 0.19 2.4 0.02 *
YEAR96 1.17 0.23 5.1 4e-07 ***
YEAR97 -0.98 0.25 -3.8 1e-04 ***
sHEIGHT -0.85 0.12 -6.8 1e-11 ***
Now define the quasi-likelihood adjustment function (as in the link):
quasi_table <- function(model, ctab = coef(summary(model))) {
    ## dispersion factor: sum of squared Pearson residuals / residual df
    phi <- sum(residuals(model, type = "pearson")^2) / df.residual(model)
    qctab <- within(as.data.frame(ctab), {
        `Std. Error` <- `Std. Error` * sqrt(phi)
        `z value` <- Estimate / `Std. Error`
        `Pr(>|z|)` <- 2 * pnorm(abs(`z value`), lower.tail = FALSE)
    })
    return(qctab)
}
Apply it:
printCoefmat(quasi_table(full_mod1),digits=2)
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.47 0.25 1.8 0.065 .
YEAR96 1.17 0.30 3.8 1e-04 ***
YEAR97 -0.98 0.34 -2.9 0.004 **
sHEIGHT -0.85 0.16 -5.2 3e-07 ***
As expected, the estimates are identical; the standard errors and p-values have been appropriately inflated, and the z-values correspondingly deflated.
If you prefer your statistics "tidy":
library(tidyverse)
library(broom.mixed)
phi <- sum(residuals(full_mod1, type="pearson")^2)/df.residual(full_mod1)
(full_mod1
%>% tidy(effect="fixed")
%>% transmute(term = term, estimate = estimate,
std.error = std.error * sqrt(phi),
statistic = estimate/std.error,
p.value = 2*pnorm(abs(statistic), lower.tail=FALSE))
)
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 0.467 0.253 1.84 0.0654
2 YEAR96 1.17 0.303 3.84 0.000121
3 YEAR97 -0.978 0.336 -2.91 0.00363
4 sHEIGHT -0.847 0.164 -5.15 0.000000260
- I don't know if any of the 'downstream' packages that people have built to handle mixed models (e.g. merTools, sjstats, etc.) have these capabilities. Arguably broom.mixed should/could build this in.
- All of the code above should work equally well with glmmTMB() substituted for glmer(), but I haven't tested it.
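A sketch of that glmmTMB substitution (untested in the original answer; note one wrinkle: coef(summary()) for a glmmTMB fit returns a list of tables by model component, so the conditional-model table is passed to quasi_table() explicitly here):

```r
library(glmmTMB)

## refit the same formula with glmmTMB
full_mod2 <- glmmTMB(form, family = poisson, data = g)

## pass the conditional-model coefficient table explicitly,
## since coef(summary(full_mod2)) is a list (cond, zi, disp)
printCoefmat(quasi_table(full_mod2,
                         ctab = coef(summary(full_mod2))$cond),
             digits = 2)
```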