Variance partition coefficients and intraclass correlations
One purpose of multilevel models is to partition the variance in the outcome between the different groupings in the data.
For example, if we make multiple observations on individual participants, we partition the outcome variance into variance between individuals and residual (within-individual) variance.
We might then want to know what proportion of the total variance is attributable to variation within groups, or how much is found between groups. This statistic is termed the variance partition coefficient (VPC), or intraclass correlation (ICC).
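For the simplest case, a random-intercepts model with a single grouping factor, this definition can be written as a formula. The symbols \(\sigma^2_u\) (between-group variance) and \(\sigma^2_e\) (residual variance) are notation assumed here for illustration:

```latex
\[
\mathrm{VPC} = \frac{\sigma^2_u}{\sigma^2_u + \sigma^2_e}
\]
```

The within-group share is the complement, \(1 - \mathrm{VPC}\), so the two proportions always sum to one.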
We calculate the VPC with some simple arithmetic on the variance estimates from the lmer model. We can extract the variance estimates with the VarCorr function:
library(lme4)  # provides lmer(), VarCorr() and the sleepstudy data
random.intercepts.model <- lmer(Reaction ~ Days + (1|Subject), data=lme4::sleepstudy)
VarCorr(random.intercepts.model)
 Groups   Name        Std.Dev.
 Subject  (Intercept) 37.124
 Residual             30.991
We can also test the variance parameter using the rand() function from the lmerTest package:
library(lmerTest)  # provides rand()
rand(random.intercepts.model)
ANOVA-like table for random-effects: Single term deletions
Model:
Reaction ~ Days + (1 | Subject)
              npar  logLik    AIC   LRT Df Pr(>Chisq)
<none>           4 -893.23 1794.5
(1 | Subject)    3 -946.83 1899.7 107.2  1  < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
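In recent versions of lmerTest the same table is also available under the name ranova(), with rand() kept as an alias. The sketch below assumes lme4 and lmerTest are installed; the object names m and tab are ours. It refits the model so that it runs on its own, and pulls the likelihood ratio statistic out of the table rather than reading it off the printout:

```r
library(lme4)
library(lmerTest)

# Same random-intercepts model as in the text
m <- lmer(Reaction ~ Days + (1 | Subject), data = lme4::sleepstudy)

# ANOVA-like table: each random-effect term is dropped in turn
tab <- ranova(m)

# Row names match the printed table; columns hold the test results
lrt <- tab["(1 | Subject)", "LRT"]
p   <- tab["(1 | Subject)", "Pr(>Chisq)"]

lrt
p
```

Extracting the values this way makes it easier to reuse them in a report without copying numbers by hand.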
Helpfully, if we convert the result of VarCorr to a dataframe, we are provided with the columns vcov, which stands for variance or covariance, as well as sdcor (standard deviation or correlation), which is provided in the printed summary:
library(dplyr)  # provides %>%, as_tibble(), mutate() and select()
VarCorr(random.intercepts.model) %>%
  as_tibble()
# A tibble: 2 x 5
  grp      var1        var2   vcov sdcor
  <chr>    <chr>       <chr> <dbl> <dbl>
1 Subject  (Intercept) <NA>  1378.  37.1
2 Residual <NA>        <NA>   960.  31.0
The variance partition coefficient is simply the variance at a given level of the model, divided by the total variance (the sum of the variance parameters). So we can write:
VarCorr(random.intercepts.model) %>%
  as_tibble() %>%
  mutate(icc=vcov/sum(vcov)) %>%
  select(grp, icc)
# A tibble: 2 x 2
  grp        icc
  <chr>    <dbl>
1 Subject  0.589
2 Residual 0.411
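As a cross-check, the same proportions can be reproduced by hand from the standard deviations printed by VarCorr, since each variance is just the squared standard deviation. This is a small base R sketch; the vector names are ours, and the inputs are the rounded values from the printout, so the result agrees only to about three decimal places:

```r
# Standard deviations as printed by VarCorr()
sds <- c(Subject = 37.124, Residual = 30.991)

# Variances are the squared standard deviations
variances <- sds^2

# Each component's share of the total variance
vpc <- variances / sum(variances)
round(vpc, 3)
```

The rounded shares come out at 0.589 for Subject and 0.411 for Residual, matching the icc column.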
Intraclass correlations were computed from the mixed effects model. 59% of the variation in the outcome was attributable to differences between subjects, \(\chi^2(1) = 107\), p < .001.
[It’s not straightforward to put a confidence interval around the VPC estimate from an lmer model. If this is important to you, you should explore re-fitting the same model in a Bayesian framework.]
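One non-Bayesian route, offered here as an alternative rather than as the approach recommended above, is a parametric bootstrap with lme4::bootMer(). The sketch below is a rough illustration under stated assumptions: the helper vpc_subject, the seed, and the choice of 200 simulations are ours; the percentile interval is only one of several possible bootstrap intervals; and the helper is only valid for a model whose single random effect is the Subject intercept.

```r
library(lme4)

# Same random-intercepts model as in the text
m <- lmer(Reaction ~ Days + (1 | Subject), data = lme4::sleepstudy)

# Hypothetical helper: Subject variance as a share of the total variance
vpc_subject <- function(fit) {
  vc <- as.data.frame(VarCorr(fit))
  vc$vcov[vc$grp == "Subject"] / sum(vc$vcov)
}

# Simulate new responses from the fitted model, refit, and recompute the VPC
boot <- bootMer(m, FUN = vpc_subject, nsim = 200, seed = 1234)

# Percentile interval from the bootstrap distribution
ci <- quantile(boot$t, probs = c(0.025, 0.975))
ci
```

More simulations give a more stable interval at the cost of run time, and a Bayesian refit remains the better option if a full posterior for the VPC is wanted.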