## 7.9 Comparative work

Several comparisons of multilevel imputation methods are available. This section briefly summarizes the main findings.

Enders, Mistler, and Keller (2016) compared JM and FCS multilevel approaches, and found that both are appropriate for random intercept analyses. The JM method was found to be superior for analyses that focus on different within- and between-cluster associations, whereas FCS provided a dramatic improvement over JM in random slope models. Moreover, using a latent variable to impute categorical variables worked well.

Mistler and Enders (2017) showed that more flexible and modern imputation methods for JM and FCS are preferable to older methods that assume homoscedastic distributions or multivariate normality. For random intercept models, JM and FCS perform about equally well. The authors noted that JM does not preserve random slope variation, whereas FCS does.
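For reference, the distinction between random intercept and random slope analyses that recurs in these comparisons can be written out as below. The notation is an assumption made for illustration: $y_{ic}$ is the outcome of unit $i$ in cluster $c$, $x_{ic}$ is a level-1 predictor, $u_{0c}$ and $u_{1c}$ are cluster-specific deviations, and $\epsilon_{ic}$ is the residual. A random intercept model lets only the intercept vary across clusters:

$$
y_{ic} = \beta_0 + \beta_1 x_{ic} + u_{0c} + \epsilon_{ic}
$$

A random slope model additionally lets the effect of $x_{ic}$ vary across clusters:

$$
y_{ic} = \beta_0 + \beta_1 x_{ic} + u_{0c} + u_{1c} x_{ic} + \epsilon_{ic}
$$

In this notation, "preserving random slope variation" means that the imputed data retain the variance of $u_{1c}$.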

Kunkel and Kaizar (2017) compared JM and FCS for random intercept models in the context of individual patient data. They found that, in spite of the theoretical differences, FCS and JM produced similar results. Moreover, these authors highlighted that results were sensitive to the choice of the prior in high missingness scenarios.

Grund, Lüdtke, and Robitzsch (2018b) presented a detailed comparison between JM, FCS and FIML using current implementations. For random intercept models, they found JM and FCS equally effective, and better than ad-hoc approaches or FIML. A difference from Enders, Mistler, and Keller (2016) was the addition of FCS methods that included cluster means. For models with random slopes and cross-level interactions, FCS was found almost unbiased for the main effects, but less reliable for higher-order terms. For categorical data, the conclusion was that both multilevel JM and FCS are suitable for creating multiple imputations. Incomplete level-2 variables were handled equally well by JM, FCS and FIML.
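To make the idea of "including cluster means" concrete, here is a minimal sketch of how a cluster mean of a level-1 variable could be added to the data as an extra predictor. It is illustrative only and not the procedure used in any of the cited studies. The function name `add_cluster_means`, the toy variables `school` and `x`, and the choice to compute the mean from observed values only are all assumptions. In an actual FCS run the means would presumably be recomputed from the currently completed data at each iteration.

```python
from collections import defaultdict


def add_cluster_means(rows, cluster_key, var_key):
    """Return copies of `rows` with an added '<var_key>_cluster_mean' entry.

    Hypothetical helper: the mean is computed per cluster from the
    observed (non-None) values only. Clusters without any observed
    value get None.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        value = row[var_key]
        if value is not None:  # skip missing entries
            sums[row[cluster_key]] += value
            counts[row[cluster_key]] += 1

    augmented_rows = []
    for row in rows:
        cluster = row[cluster_key]
        mean = sums[cluster] / counts[cluster] if counts[cluster] else None
        new_row = dict(row)  # leave the input untouched
        new_row[var_key + "_cluster_mean"] = mean
        augmented_rows.append(new_row)
    return augmented_rows


# Toy data (made up): pupils nested in schools, None marks a missing value.
data = [
    {"school": "a", "x": 1.0},
    {"school": "a", "x": 3.0},
    {"school": "a", "x": None},
    {"school": "b", "x": 10.0},
    {"school": "b", "x": None},
    {"school": "c", "x": None},
]
augmented = add_cluster_means(data, "school", "x")
```

The added column carries between-cluster information, which lets an imputation model for another variable distinguish within-cluster from between-cluster associations with `x`.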

Audigier et al. (2018) found that JM, as implemented in `jomo`, worked well with large clusters and binary data, but had difficulty in modeling small (number of) clusters, tending to conservative inferences. The homogeneity assumption in the standard generalized linear mixed model was found to be limiting. The two-stage approach was found to perform well for systematically missing data, but was less reliable for small clusters.

The picture that emerges is that FIML is not inherently preferable for missing predictors or outcomes. Modern versions of JM and FCS are reliable ways of dealing with missing data in multilevel models with random intercepts. The FCS framework seems better suited to accommodate models with random slopes, but may have difficulty with higher-order terms.