7.11 Future research

The first edition of this book featured only three pages on multilevel imputation, and concluded: “Imputation of multilevel data is an area where work still remains to be done” (Van Buuren 2012, 87). The progress over the last few years has been tremendous, and we can now see the contours of an emerging methodology. There are still open issues, and we may expect to see further advances in the near future.

The multilevel model does not assume the regressions to be identical in different subsets of the data. This allows more general and interesting patterns in the data to be studied, but the added flexibility comes at the price of increased modeling effort. The current software needs to become more robust and forgiving, so that multilevel imputation eventually becomes a routine component of multilevel analysis. We need faster imputation algorithms, automatic model specification, and good defaults that work across a wide variety of practical data types and models. We also need more experience with imputation of data with three or more levels, as supported by, e.g., Blimp and ml.lmer, and with the handling of categorical data with many categories. Finally, we need better insight into the convergence properties of the algorithms, and more generally into the strengths and limitations of the procedures.

There is little consensus about the optimal way to handle interaction effects in multiple imputation. I used passive imputation because it is easy to apply in standard software and has been found to work reasonably well. In the future we may see model-based imputation procedures that improve the handling of interactions by combining the imputation and analysis models into a larger Bayesian model. See Section 4.5.5 for some pointers to the literature.
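
The idea behind passive imputation of an interaction can be illustrated with a minimal sketch. It is a toy illustration under stated assumptions, not the procedure used in this chapter: the function name passive_interaction, the variable names x1, x2 and x1x2, and the data are all hypothetical, and the imputation step is a deliberately crude random draw from the observed values that stands in for a proper (multilevel) imputation model. In an iterative algorithm the passive step would be repeated after every update of the components; the sketch performs a single pass.

```python
import random


def passive_interaction(rows, seed=0):
    """Fill missing x1/x2 by random donor draws, then derive x1*x2 passively."""
    rng = random.Random(seed)
    completed = [dict(r) for r in rows]  # copy, so the input stays untouched
    for name in ("x1", "x2"):
        donors = [r[name] for r in rows if r[name] is not None]
        for r in completed:
            if r[name] is None:
                # Placeholder imputation step (assumption): draw an observed
                # value at random instead of using a real imputation model.
                r[name] = rng.choice(donors)
    for r in completed:
        # Passive step: the interaction is never imputed directly. It is
        # recomputed from its components, so it always equals x1 * x2.
        r["x1x2"] = r["x1"] * r["x2"]
    return completed


# Hypothetical toy data; None marks a missing value.
toy = [
    {"x1": 1.0, "x2": 2.0},
    {"x1": None, "x2": 3.0},
    {"x1": 4.0, "x2": None},
    {"x1": 2.0, "x2": 5.0},
]
completed = passive_interaction(toy, seed=1)
```

Because the product is derived rather than imputed, every completed row satisfies x1x2 = x1 * x2 by construction. The sketch only shows how that consistency is maintained. It says nothing about whether the imputed components themselves preserve the interaction in the data, which is the question that remains open.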