Fitting multilevel models in R
Use lmer and glmer
Although there are multiple R packages which can fit mixed-effects regression
models, the lmer and glmer functions within the lme4 package are the most
frequently used, for good reason, and the examples below all use these two
functions.
p values in multilevel models
For various philosophical and statistical reasons the author of lme4, Doug Bates, has always refused to display p values in the output from lmer (his reasoning is explained here).
That notwithstanding, many people want to calculate p values for parameters in
mixed models, and F tests for effects and interactions. Various methods have
been developed over the years which address at least some of Bates’ concerns,
and these techniques have been implemented in R in the lmerTest package. In
particular, lmerTest implements an anova function for lmer models, which is
very helpful.
Don’t worry! All you need to do is load the lmerTest package rather than lme4.
This loads an updated version of lmer, keeps glmer available (lmerTest attaches
lme4 for you), and adds extra functions for things like calculating F tests and
the anova table.
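As a minimal sketch of what this looks like in practice, the example below fits a model with lmerTest loaded and then asks for the coefficient table and the anova table. It assumes the sleepstudy data that ships with lme4 (this dataset is an illustration and is not used elsewhere in this section), and it relies on the Satterthwaite approximation for degrees of freedom, which is the lmerTest default.

```r
# Loading lmerTest also attaches lme4, so lmer() here is the lmerTest version.
library(lmerTest)

# sleepstudy: reaction times for 18 subjects over 10 days (ships with lme4).
data("sleepstudy", package = "lme4")

# Same call as with plain lme4; only the loaded package differs.
m <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# The coefficient table now carries df and Pr(>|t|) columns.
coef_table <- coef(summary(m))
print(coef_table)

# F test for the fixed effect of Days.
anova_table <- anova(m)
print(anova_table)
```

If you fit the same model after loading only lme4, the p value columns are absent from both tables.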
The lmer formula syntax
Specifying lmer models is very similar to the syntax for lm.
The ‘fixed’ part of the model is exactly the same, with additional parts used to
specify random intercepts, random slopes, and control the covariances of these
random effects (there’s more on this in the troubleshooting section).
Random intercepts
The simplest model which allows a ‘random intercept’ for each level in the grouping looks like this:
lmer(outcome ~ predictors + (1 | grouping), data=df)
Here the outcome and predictors are specified in a formula, just as we did when
using lm(). The only difference is that we now add a ‘random part’ to the
model, in this case: (1|grouping).
The 1 refers to an intercept, and so in English this part of the formula means
‘add a random intercept for each level of grouping’.
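To make the random intercept concrete, here is a small sketch using the sleepstudy data from lme4 as a stand-in for df, with Reaction as the outcome, Days as the predictor and Subject as the grouping variable. These names are assumptions chosen for illustration only.

```r
library(lme4)
data("sleepstudy", package = "lme4")

# Random intercept: each Subject gets its own baseline Reaction.
ri <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Fixed effects: the average intercept and the average slope for Days.
print(fixef(ri))

# Random effects: one intercept deviation per Subject, and nothing else.
ri_ranef <- ranef(ri)$Subject
print(head(ri_ranef))

# Estimated between-subject and residual standard deviations.
print(VarCorr(ri))
```

Each row of the random effects table is a subject's deviation from the overall intercept; the slope for Days is shared by everyone.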
Random slopes
If we want to add a random slope to the model, we could adjust the random part like so:
lmer(outcome ~ predictor + (predictor | grouping), data=df)
This implicitly adds a random intercept too, so in English this formula says
something like: let outcome be predicted by predictor; let outcome vary between
levels of grouping, and also allow the effect of predictor to vary between
levels of grouping.
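The sketch below shows the random slope formula on the sleepstudy data from lme4 (again an assumed example dataset, with Days as the predictor and Subject as the grouping variable). It checks that the short form and the form with an explicit 1 describe the same random effects.

```r
library(lme4)
data("sleepstudy", package = "lme4")

# Random intercept and random slope for Days, allowed to correlate.
rs <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Writing the intercept explicitly gives the same model.
rs_explicit <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)

# Each Subject now has its own intercept deviation and slope deviation.
rs_ranef <- ranef(rs)$Subject
print(head(rs_ranef))

# coef() adds the fixed effects back in: per-subject intercepts and slopes.
print(head(coef(rs)$Subject))
```

Comparing coef(rs)$Subject across rows shows how much the effect of Days differs from one subject to the next.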
The lmer syntax for the random part is very powerful, and allows complex
combinations of random intercepts and slopes and control over how these random
effects are allowed to correlate with one another. For a detailed guide to
fitting two and three level models, with various covariance structures, see:
http://rpsychologist.com/r-guide-longitudinal-lme-lmer
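As one small illustration of controlling how random effects correlate, the sketch below contrasts the default correlated specification with the double-bar form, which lme4 expands into separate uncorrelated terms. It again assumes the sleepstudy data from lme4; note that the double-bar shorthand is only reliable for numeric predictors such as Days.

```r
library(lme4)
data("sleepstudy", package = "lme4")

# Default: intercept and slope deviations are allowed to correlate.
correlated <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Double bar: same variances, but the intercept-slope covariance is fixed at 0.
# Equivalent to (1 | Subject) + (0 + Days | Subject).
uncorrelated <- lmer(Reaction ~ Days + (Days || Subject), data = sleepstudy)

# Rows with a non-missing var2 are covariance (correlation) terms.
vc_correlated <- as.data.frame(VarCorr(correlated))
vc_uncorrelated <- as.data.frame(VarCorr(uncorrelated))
print(vc_correlated)
print(vc_uncorrelated)
```

The correlated model reports one extra row for the intercept-slope covariance; the uncorrelated model has no such row.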
Are my effects fixed or random?
If you’re not sure which parts of your model should be ‘fixed’ and which parts should be ‘random’, there’s a more detailed explanation in this section.