Fitting multilevel models in R

Use lmer and glmer

Although multiple R packages can fit mixed-effects regression models, the lmer and glmer functions in the lme4 package are the most frequently used, for good reason, and the examples below all use these two functions.

p values in multilevel models

For various philosophical and statistical reasons the author of lme4, Doug Bates, has always refused to display p values in the output from lmer (his reasoning is explained here).

That notwithstanding, many people want p values for the parameters in mixed models, and F tests for effects and interactions. Various methods have been developed over the years which address at least some of Bates’ concerns, and these techniques have been implemented in R in the lmerTest package. In particular, lmerTest implements an anova function for lmer models, which is very helpful.

Don’t worry! All you need to do is load the lmerTest package rather than lme4. This replaces lmer with an updated version that reports p values, still makes glmer available (lmerTest loads lme4 alongside itself), and adds extra functions for things like F tests and anova tables.
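As a minimal sketch of what this looks like in practice, the example below fits a model to the sleepstudy data that ships with lme4. The choice of dataset and the model name m are assumptions made for illustration; they are not part of the original text.

```r
# Loading lmerTest also attaches lme4, so sleepstudy and glmer() are available
library(lmerTest)

# Illustrative model: reaction time predicted by days of sleep deprivation,
# with a random intercept for each subject
m <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# The coefficient table now includes degrees of freedom and p values
# (lmerTest uses the Satterthwaite approximation by default)
summary(m)

# F tests for the fixed effects
anova(m)
```

If you load lme4 on its own instead, the same summary call prints estimates, standard errors and t values, but no p values.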

The lmer formula syntax

The syntax for specifying lmer models is very similar to that for lm. The ‘fixed’ part of the model is exactly the same, with additional parts used to specify random intercepts and random slopes, and to control the covariances of these random effects (there’s more on this in the troubleshooting section).

Random intercepts

The simplest model which allows a ‘random intercept’ for each level in the grouping looks like this:

lmer(outcome ~ predictors + (1 | grouping), data=df)

Here the outcome and predictors are specified in a formula, just as we did when using lm(). The only difference is that we now add a ‘random part’ to the model, in this case: (1 | grouping).

The 1 refers to an intercept, and so in English this part of the formula means ‘add a random intercept for each level of grouping’.
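To make the idea of a random intercept concrete, here is a small sketch using the sleepstudy data from lme4, where Subject plays the role of grouping. The dataset and the name m_int are assumptions chosen for illustration.

```r
library(lmerTest)  # also attaches lme4 and its sleepstudy data

# Random intercept for each subject; the effect of Days is shared by everyone
m_int <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Fixed effects: the average intercept and the common slope for Days
fixef(m_int)

# Each subject's deviation from the average intercept
head(ranef(m_int)$Subject)

# Per-subject coefficients: the intercept differs between subjects,
# while the Days column is the same in every row
head(coef(m_int)$Subject)
```

The output of coef() shows the pattern the formula describes: one intercept per level of the grouping variable, and a single slope common to all of them.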

Random slopes

If we want to add a random slope to the model, we could adjust the random part like so:

lmer(outcome ~ predictor + (predictor | grouping), data=df)

This implicitly adds a random intercept too, so in English this formula says something like: let outcome be predicted by predictor; allow the average level of outcome to vary between levels of grouping; and also allow the effect of predictor to vary between levels of grouping.
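The sketch below illustrates a random slope, again assuming the sleepstudy data from lme4 with Days as the predictor and Subject as the grouping variable. The names m_slope and m_explicit are hypothetical.

```r
library(lmerTest)  # also attaches lme4 and its sleepstudy data

# Random intercept and random slope for Days within each subject
m_slope <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Variances of the random intercept and slope, and their correlation
VarCorr(m_slope)

# Per-subject coefficients: both the intercept and the Days slope now vary
head(coef(m_slope)$Subject)

# Writing the implicit intercept out explicitly specifies the same model
m_explicit <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
all.equal(fixef(m_slope), fixef(m_explicit))
```

Spelling out the 1 can be worth the extra typing, since it makes it obvious to a reader that the intercept is also allowed to vary.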

The lmer syntax for the random part is very powerful, and allows complex combinations of random intercepts and slopes and control over how these random effects are allowed to correlate with one another. For a detailed guide to fitting two and three level models, with various covariance structures, see:

Are my effects fixed or random?

If you’re not sure which parts of your model should be ‘fixed’ and which should be ‘random’, there’s a more detailed explanation in this section.