Variance partition coefficients and intraclass correlations

A central purpose of multilevel models is to partition variance in the outcome between the different groupings in the data.

For example, if we make multiple observations on individual participants, we partition the outcome variance into variance between individuals and residual (within-individual) variance.

We might then want to know what proportion of the total variance is attributable to variation within groups, and how much is found between groups. This statistic is termed the variance partition coefficient (VPC), or intraclass correlation (ICC).

We calculate the VPC with some simple arithmetic on the variance estimates from the lmer model. We can extract these estimates using the VarCorr function (which prints them as standard deviations by default):

random.intercepts.model <- lmer(Reaction ~ Days + (1|Subject), data=lme4::sleepstudy)
VarCorr(random.intercepts.model)
 Groups   Name        Std.Dev.
 Subject  (Intercept) 37.124  
 Residual             30.991  

We can also test the variance parameter using the rand() function (from the lmerTest package):

rand(random.intercepts.model)
ANOVA-like table for random-effects: Single term deletions

Reaction ~ Days + (1 | Subject)
              npar  logLik    AIC   LRT Df Pr(>Chisq)    
<none>           4 -893.23 1794.5                        
(1 | Subject)    3 -946.83 1899.7 107.2  1  < 2.2e-16 ***
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Helpfully, if we convert the result of VarCorr to a data frame, we get a vcov column (variance or covariance) as well as the sdcor column (standard deviation or correlation) shown in the printed summary:

VarCorr(random.intercepts.model) %>%
  as_tibble()
# A tibble: 2 x 5
  grp      var1        var2   vcov sdcor
  <chr>    <chr>       <chr> <dbl> <dbl>
1 Subject  (Intercept) <NA>  1378.  37.1
2 Residual <NA>        <NA>   960.  31.0
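
As a quick check on how the two columns relate: in a random-intercepts model like this one there are no covariance terms, so each vcov value is just the square of the corresponding sdcor value. The sketch below redoes that arithmetic by hand, using the standard deviations copied from the printed VarCorr output; the object names sds and variances are illustrative only.

```r
# Standard deviations copied from the printed VarCorr output
sds <- c(Subject = 37.124, Residual = 30.991)

# Squaring a standard deviation gives the variance (the vcov column)
variances <- sds^2

# Rounds to about 1378 (Subject) and 960 (Residual), matching vcov
round(variances)
```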

The variance partition coefficient is simply the variance at a given level of the model, divided by the total variance (the sum of the variance parameters). So we can write:

VarCorr(random.intercepts.model) %>%
  as_tibble() %>%
  mutate(icc=vcov/sum(vcov)) %>%
  select(grp, icc)
# A tibble: 2 x 2
  grp        icc
  <chr>    <dbl>
1 Subject  0.589
2 Residual 0.411
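
The same figure can be checked by hand from the variance estimates. Writing \(\sigma^2_{u}\) for the between-subject (intercept) variance and \(\sigma^2_{e}\) for the residual variance (this notation is ours, for illustration, rather than anything lmer prints), and using the rounded values from the vcov column:

\[
\mathrm{VPC}_{\text{Subject}} = \frac{\sigma^2_{u}}{\sigma^2_{u} + \sigma^2_{e}} \approx \frac{1378}{1378 + 960} = \frac{1378}{2338} \approx 0.589
\]

The residual share is the complement, \(1 - 0.589 = 0.411\).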

Intraclass correlations were computed from the mixed effects model. 59% of the variation in outcome was attributable to differences between subjects, \(\chi^2(1) = 107\), p < .001.

[It’s not straightforward to put a confidence interval around the VPC estimate from an lmer model. If this is important to you, you should explore re-fitting the same model in a Bayesian framework.]