Modification indices

Modification indices help us answer ‘what if?’ questions about whether freeing parameter constraints or adding paths to our model would improve its fit. The modification index (MI) is the \(\chi^2\) value, with 1 degree of freedom, by which model fit would improve if a particular path were added or a constraint freed. Values larger than 3.84 indicate that the model would be ‘improved’ and that the p value for the added parameter would be < .05; values larger than 10.83 indicate that the parameter would have a p value < .001. This does not mean that all parameters which appear in the MI table should be added, but the table can be an aid to improving the model when combined with domain or theoretical knowledge. The rule of thumb is to add parameters only when they ‘make sense’ substantively. See the notes on model improvements for more guidance.
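The cutoffs quoted above (3.84 and 10.83) are simply critical values of the \(\chi^2\) distribution with 1 degree of freedom. As a minimal sketch using only base R, we can recover them with qchisq(), and go the other way with pchisq() to see the p value implied by any given MI. The helper name mi_to_p is hypothetical and used only for illustration.

```r
# Critical values of the chi-squared distribution with 1 degree of freedom
crit_05  <- qchisq(0.95,  df = 1)   # cutoff corresponding to p < .05
crit_001 <- qchisq(0.999, df = 1)   # cutoff corresponding to p < .001
round(c(crit_05, crit_001), 2)      # 3.84 and 10.83

# Hypothetical helper: the p value implied by a modification index
mi_to_p <- function(mi) pchisq(mi, df = 1, lower.tail = FALSE)
mi_to_p(5)   # an MI of 5 falls between the two cutoffs
```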

To examine the modification indices we type:

modificationindices(hz.fit)

But because this function produces a very long table of output, it can be helpful to sort and filter the rows to show only those model modifications which might be of interest to us.

The command below converts the output of modificationindices() to a dataframe. It then:

Sorts the rows in descending order of the mi column, which represents the change in model \(\chi^2\) we would see if the path were included (see sorting)
Filters the results to show only those with \(\chi^2\) change > 11 (just above the 10.83 cutoff for p < .001)
Selects only the lhs, op, rhs, mi, and epc columns.

modificationindices(hz.fit) %>%
  as_tibble() %>%        # as_data_frame() is deprecated in recent versions of tibble
  arrange(desc(mi)) %>%  # largest modification indices first
  filter(mi > 11) %>%
  select(lhs, op, rhs, mi, epc) %>%
  pander(caption="Largest MI values for hz.fit")

Largest MI values for hz.fit
lhs	op	rhs	mi	epc
visual	=~	x9	36.41	0.577
x7	~~	x8	34.15	0.5364
visual	=~	x7	18.63	-0.4219
x8	~~	x9	14.95	-0.4231

The lhs (left hand side, or outcome), rhs (right hand side, or predictor) and op (operation) columns specify what modification should be made.

Paths linking latent variables to the observed variables which index them have =~ in the ‘op’ column.

Error covariances for observed variables have ~~ as the op. These symbols match the symbols used to describe a path in the lavaan model syntax.
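Because the three columns mirror lavaan syntax, each row of the table can be read as a line of model syntax by pasting lhs, op and rhs together. The snippet below is a small illustrative sketch in base R; the data frame mi_rows is typed in by hand from the table above rather than taken from the fitted model.

```r
# Rows copied by hand from the MI table above (lhs, op, rhs columns only)
mi_rows <- data.frame(
  lhs = c("visual", "x7", "visual", "x8"),
  op  = c("=~", "~~", "=~", "~~"),
  rhs = c("x9", "x8", "x7", "x9"),
  stringsAsFactors = FALSE
)

# Paste the columns together to get the model-syntax line each row implies
syntax_lines <- with(mi_rows, paste(lhs, op, rhs))
print(syntax_lines)
# "visual =~ x9" "x7 ~~ x8" "visual =~ x7" "x8 ~~ x9"
```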

If we add the largest MI path to our model it will look like this:

# same model, but with x9 now loading on visual
hz.model.2 <- "
visual =~ x1 + x2 + x3 + x9
writing =~ x4 + x5 + x6
maths =~ x7 + x8 + x9"

hz.fit.2 <- cfa(hz.model.2, data=hz)
fitmeasures(hz.fit.2, c('cfi', 'rmsea', 'rmsea.ci.upper', 'bic'))
           cfi          rmsea rmsea.ci.upper            bic 
         0.967          0.065          0.089       7568.123

RMSEA has improved somewhat, but we would probably want to investigate this model further and make additional improvements to it (although see the notes on model improvements).
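Since the modification index only estimates the \(\chi^2\) change, one way to check the actual improvement is a likelihood ratio test between the two nested models. The sketch below makes this concrete under two assumptions that are not stated in this section: that hz is lavaan's built-in HolzingerSwineford1939 dataset, and that the original hz.model is the same three-factor model without the extra x9 loading.

```r
library(lavaan)

# Assumption: hz is lavaan's built-in HolzingerSwineford1939 dataset
hz <- HolzingerSwineford1939

# Assumption: the original model is the three-factor model without visual =~ x9
hz.model <- "
visual =~ x1 + x2 + x3
writing =~ x4 + x5 + x6
maths =~ x7 + x8 + x9"

# Modified model with x9 also loading on visual
hz.model.2 <- "
visual =~ x1 + x2 + x3 + x9
writing =~ x4 + x5 + x6
maths =~ x7 + x8 + x9"

hz.fit   <- cfa(hz.model, data = hz)
hz.fit.2 <- cfa(hz.model.2, data = hz)

# Chi-squared difference (likelihood ratio) test between the nested models
lrt <- lavTestLRT(hz.fit, hz.fit.2)
print(lrt)
```

The models differ by a single free parameter, so the test has 1 degree of freedom, and the ‘Chisq diff’ column can be compared directly with the MI reported for visual =~ x9.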