lme4::glmer vs. Stata's melogit command
The glmer fit will probably be much faster with the optional argument nAGQ=0L. You have many fixed-effects parameters (20 levels for each of study_quarter and dd_quarter generate a total of 28 contrasts) and the default optimization method (corresponding to nAGQ=1L) puts all of those coefficients into the general nonlinear optimization call. With nAGQ=0L these coefficients are instead optimized within the much faster penalized iteratively reweighted least squares (PIRLS) algorithm. The default will generally provide a better estimate in the sense that the deviance at the estimate is lower, but the difference is usually very small and the time difference is enormous.
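As a minimal sketch of the two settings (using the built-in cbpp data as a stand-in, since the poster's data aren't shown; the speed advantage of nAGQ=0L grows with the number of fixed-effects parameters):

```r
library(lme4)

## default: fixed-effect coefficients inside the nonlinear optimization (nAGQ = 1L)
fit1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
              data = cbpp, family = binomial)

## faster: fixed-effect coefficients handled by PIRLS
fit0 <- update(fit1, nAGQ = 0L)

## the deviances typically differ only slightly
c(nAGQ1 = deviance(fit1), nAGQ0 = deviance(fit0))
```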
I have a write-up of the differences between these algorithms as a Jupyter notebook, nAGQ.ipynb. That write-up uses the MixedModels package for Julia instead of lme4, but the methods are similar. (I am one of the authors of lme4 and the author of MixedModels.)
If you are going to be doing a lot of GLMM fits, I would consider doing so in Julia with MixedModels. It is often much faster than R, even with all the complicated code in lme4.
How do I fit a quasi-poisson model with lme4 or glmmTMB?
The recipe given in the link will work just as well for (quasi)Poisson as for (quasi)binomial models. The key is that quasi-likelihood models really represent a post-fitting adjustment to the standard errors of the parameters and the associated statistics; they don't (or shouldn't ...) change anything about the way the model is fitted.
glmer is a bit fussy about "discrete responses" (binomial, Poisson, etc.) actually being discrete, but glmmTMB is looser/more forgiving.
This way of doing it puts as much of the variance as can be explained by the random effects there, then does a post hoc adjustment for any remaining over (or under)dispersion.
We'll use the grouseticks data set from Elston et al. (2001); the original analysis used observation-level random effects, rather than quasi-likelihood, to handle overdispersion at the level of individual observations (= number of ticks counted on a single chick, nested within brood, nested within location).
library(lme4)
g <- transform(grouseticks, sHEIGHT = drop(scale(HEIGHT)))
form <- TICKS~YEAR+sHEIGHT+(1|BROOD)+(1|LOCATION)
full_mod1 <- glmer(form, family="poisson", data=g)
There is moderate overdispersion: deviance(full_mod1)/df.residual(full_mod1) is 1.86. (Computing the ratio of the sum of squared Pearson residuals to the residual degrees of freedom, as we do below, is slightly more robust.)
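As a quick check, both dispersion estimates mentioned above can be computed directly from the fitted model; the Pearson-based version is the one the adjustment function below uses:

```r
## two estimates of the dispersion factor phi for full_mod1
deviance(full_mod1) / df.residual(full_mod1)   # deviance-based (~1.86)
sum(residuals(full_mod1, type = "pearson")^2) /
    df.residual(full_mod1)                     # Pearson-based (more robust)
```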
Unadjusted coefficient table:
printCoefmat(coef(summary(full_mod1)), digits=2)
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.47 0.19 2.4 0.02 *
YEAR96 1.17 0.23 5.1 4e-07 ***
YEAR97 -0.98 0.25 -3.8 1e-04 ***
sHEIGHT -0.85 0.12 -6.8 1e-11 ***
Now define the quasi-likelihood adjustment function (as in the link):
quasi_table <- function(model, ctab = coef(summary(model))) {
    ## dispersion factor: sum of squared Pearson residuals / residual df
    phi <- sum(residuals(model, type = "pearson")^2) / df.residual(model)
    qctab <- within(as.data.frame(ctab), {
        `Std. Error` <- `Std. Error` * sqrt(phi)
        `z value` <- Estimate / `Std. Error`
        `Pr(>|z|)` <- 2 * pnorm(abs(`z value`), lower.tail = FALSE)
    })
    return(qctab)
}
Apply it:
printCoefmat(quasi_table(full_mod1),digits=2)
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.47 0.25 1.8 0.065 .
YEAR96 1.17 0.30 3.8 1e-04 ***
YEAR97 -0.98 0.34 -2.9 0.004 **
sHEIGHT -0.85 0.16 -5.2 3e-07 ***
As expected, the estimates are identical; the standard errors and p-values have been appropriately inflated, and the z-values correspondingly deflated.
If you prefer your statistics "tidy":
library(tidyverse)
library(broom.mixed)
phi <- sum(residuals(full_mod1, type="pearson")^2)/df.residual(full_mod1)
(full_mod1
%>% tidy(effect="fixed")
%>% transmute(term = term, estimate = estimate,
std.error = std.error * sqrt(phi),
statistic = estimate/std.error,
p.value = 2*pnorm(abs(statistic), lower.tail=FALSE))
)
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 0.467 0.253 1.84 0.0654
2 YEAR96 1.17 0.303 3.84 0.000121
3 YEAR97 -0.978 0.336 -2.91 0.00363
4 sHEIGHT -0.847 0.164 -5.15 0.000000260
- I don't know if any of the 'downstream' packages that people have built to handle mixed models (e.g. merTools, sjstats, etc.) have these capabilities. Arguably broom.mixed should/could build this in.
- All of the code above should work equally well with glmmTMB() substituted for glmer(), but I haven't tested it.
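A sketch of that glmmTMB substitution (untested in the original answer; note one wrinkle: coef(summary()) for a glmmTMB fit returns a list of tables by model component, so the conditional-model table is passed to quasi_table() explicitly here):

```r
library(glmmTMB)

## refit the same formula with glmmTMB
full_mod2 <- glmmTMB(form, family = poisson, data = g)

## pass the conditional-model coefficient table explicitly,
## since coef(summary(full_mod2)) is a list (cond, zi, disp)
printCoefmat(quasi_table(full_mod2,
                         ctab = coef(summary(full_mod2))$cond),
             digits = 2)
```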