## 11.1 Mediation using Path models

An even more flexible approach to mediation uses path models, a type of structural equation model (SEM) covered in more detail in the next section.

Using the lavaan package, path/SEM models can treat several variables as outcomes and fit all of the regressions simultaneously. For example, we can fit both step 2 and step 3 in a single model:

library(lavaan)
This is lavaan 0.6-3
lavaan is BETA software! Please report any bugs.

smash.model <- '
  # outcome regressed on the mediator and the predictor
  crashes ~ speed + lateness
  # mediator regressed on the predictor
  speed ~ lateness
'

smash.model.fit <- sem(smash.model, data=smash)
summary(smash.model.fit)
lavaan 0.6-3 ended normally after 19 iterations

Optimization method                           NLMINB
Number of free parameters                          5

Number of observations                           200

Estimator                                         ML
Model Fit Test Statistic                       0.000
Degrees of freedom                                 0
Minimum Function Value               0.0000000000000

Parameter Estimates:

Information                                 Expected
Information saturated (h1) model          Structured
Standard Errors                             Standard

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  crashes ~
    speed             0.288    0.031    9.152    0.000
    lateness          0.297    0.095    3.111    0.002
  speed ~
    lateness          0.515    0.212    2.434    0.015

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .crashes          18.190    1.819   10.000    0.000
   .speed            92.135    9.214   10.000    0.000

The summary output gives coefficients that correspond to the regression coefficients from the step 2 and step 3 models, but this time estimated in a single model.
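As a quick check of this correspondence, the sketch below fits the same two regressions with lm() and with sem() and prints the coefficients side by side. It uses simulated data: the `sim` data frame and the coefficients used to generate it are hypothetical stand-ins for the smash dataset, so the numbers will not match the output above. For a saturated path model like this one, the point estimates from the two approaches should agree to within optimizer tolerance.

```r
library(lavaan)

# Hypothetical simulated data standing in for the smash dataset
set.seed(42)
n <- 200
lateness <- rnorm(n, mean = 10, sd = 3)
speed    <- 30 + 0.5 * lateness + rnorm(n, sd = 10)
crashes  <- 0.3 * speed + 0.3 * lateness + rnorm(n, sd = 4)
sim <- data.frame(lateness, speed, crashes)

# Both regressions fitted simultaneously as a path model
path.fit <- sem('
  crashes ~ speed + lateness
  speed ~ lateness
', data = sim)

# The same regressions fitted one at a time
lm.outcome  <- lm(crashes ~ speed + lateness, data = sim)
lm.mediator <- lm(speed ~ lateness, data = sim)

# Path-model regression coefficients (named "lhs~rhs" by lavaan)
coef(path.fit)[c("crashes~speed", "crashes~lateness", "speed~lateness")]

# Separate-regression slopes for comparison
c(coef(lm.outcome)[c("speed", "lateness")], coef(lm.mediator)["lateness"])
```

The standard errors can differ slightly between the two approaches, because sem() uses maximum likelihood rather than the small-sample variance estimate used by lm().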

We can also use lavaan to compute the indirect effect by labelling the relevant parameters with the * operator and defining new parameters from those labels with the := operator. See the lavaan syntax guide for mediation for more detail.

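Note that the * operator does not have the same meaning as in formulas for linear models in R, where it expands to main effects plus their interaction. In lavaan, it attaches a modifier to a parameter: a label, as here, or a fixed value or other constraint.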

smash.model <- '
crashes ~ B*speed + C*lateness
speed ~ A*lateness

# computed parameters, see http://lavaan.ugent.be/tutorial/mediation.html
indirect := A*B
total := C + (A*B)
proportion := indirect/total
'

smash.model.fit <- sem(smash.model, data=smash)
summary(smash.model.fit)
lavaan 0.6-3 ended normally after 19 iterations

Optimization method                           NLMINB
Number of free parameters                          5

Number of observations                           200

Estimator                                         ML
Model Fit Test Statistic                       0.000
Degrees of freedom                                 0
Minimum Function Value               0.0000000000000

Parameter Estimates:

Information                                 Expected
Information saturated (h1) model          Structured
Standard Errors                             Standard

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  crashes ~
    speed      (B)    0.288    0.031    9.152    0.000
    lateness   (C)    0.297    0.095    3.111    0.002
  speed ~
    lateness   (A)    0.515    0.212    2.434    0.015

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .crashes          18.190    1.819   10.000    0.000
   .speed            92.135    9.214   10.000    0.000

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|)
    indirect          0.148    0.063    2.353    0.019
    total             0.445    0.112    3.973    0.000
    proportion        0.333    0.121    2.756    0.006
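As a sanity check, the defined parameters can be reproduced by hand from the rounded regression estimates labelled A, B and C in the output (small discrepancies in the last digit would be due to rounding):

$$
\begin{aligned}
\text{indirect} &= A \times B = 0.515 \times 0.288 \approx 0.148 \\
\text{total} &= C + A \times B = 0.297 + 0.148 = 0.445 \\
\text{proportion} &= \frac{\text{indirect}}{\text{total}} = \frac{0.148}{0.445} \approx 0.333
\end{aligned}
$$

So roughly a third of the total effect of lateness on crashes is estimated to run through speed.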

We can again get a bootstrap interval for the indirect effect, and print a table of just these computed effects like so:

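library(dplyr)  # provides %>%, filter() and select()

set.seed(1234)
# se = "bootstrap" requests bootstrapped standard errors and intervals
smash.model.fit <- sem(smash.model, data = smash, se = "bootstrap", bootstrap = 100)

# keep only the defined (:=) parameters and their confidence limits
parameterEstimates(smash.model.fit) %>%
  filter(op == ":=") %>%
  select(label, est, contains("ci")) %>%
  pander::pander()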
| label      | est    | ci.lower | ci.upper |
|------------|--------|----------|----------|
| indirect   | 0.1481 | 0.02472  | 0.2715   |
| total      | 0.4448 | 0.2254   | 0.6643   |
| proportion | 0.3329 | 0.09614  | 0.5697   |

These results are similar to the mediation::mediate() output. In both cases, the number of bootstrap resamples can be increased to improve the precision of the interval (the example above uses 100; the default is 1000, but 5000 might be a good target for publication).
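A minimal sketch of a bootstrap run with more resamples follows, under some assumptions. The data are simulated: the `sim` data frame and the coefficients used to generate it are hypothetical stand-ins for smash. Bootstrapped standard errors are requested with `se = "bootstrap"`, which is the lavaan argument governing parameter standard errors and intervals (the `test` argument instead governs the model test statistic). Passing `boot.ci.type = "perc"` to parameterEstimates() asks for percentile intervals, which is one reasonable choice among several.

```r
library(lavaan)

# Hypothetical simulated data standing in for the smash dataset
set.seed(1234)
n <- 200
lateness <- rnorm(n, mean = 10, sd = 3)
speed    <- 30 + 0.5 * lateness + rnorm(n, sd = 10)
crashes  <- 0.3 * speed + 0.3 * lateness + rnorm(n, sd = 4)
sim <- data.frame(lateness, speed, crashes)

med.model <- '
  crashes ~ B*speed + C*lateness
  speed ~ A*lateness
  indirect := A*B
'

# Number of bootstrap resamples; raise this (e.g. to 5000) for a final analysis
n.boot <- 1000

# se = "bootstrap" requests bootstrapped standard errors for all parameters
boot.fit <- sem(med.model, data = sim, se = "bootstrap", bootstrap = n.boot)

# Percentile bootstrap intervals; keep only the defined (:=) parameter
boot.est <- parameterEstimates(boot.fit, boot.ci.type = "perc", level = 0.95)
ind <- boot.est[boot.est$op == ":=", c("label", "est", "ci.lower", "ci.upper")]
ind
```

Run time grows roughly in proportion to the number of resamples, so it can be convenient to develop the model with a small value and increase it only for the final run.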