2.7 When not to use multiple imputation
Should we always use multiple imputation for the missing data? We probably could, but there are good alternatives in some situations. Section 1.6 already discussed some approaches not covered in this book, each of which has its merits. This section revisits complete-case analysis. Apart from being simple to apply, it can be a viable alternative to multiple imputation in particular situations.
Suppose that the complete-data model is a regression with outcome \(Y\) and predictors \(X\). If the missing data occur in \(Y\) only, complete-case analysis and multiple imputation are equivalent, so complete-case analysis is preferred because it is easier, more efficient and more robust (Von Hippel 2007). This equivalence applies to the regression weights. Quantities that depend on the correct marginal distribution of \(Y\), such as the mean or \(R^2\), require the stronger MCAR assumption. Multiple imputation gains an advantage over complete-case analysis if additional predictors for \(Y\) are available that are not part of \(X\). The efficiency of complete-case analysis declines if \(X\) contains missing values, which may result in inflated type II error rates. Complete-case analysis can perform quite badly under MAR and in some MNAR cases (Schafer and Graham 2002), but there are two special cases where listwise deletion outperforms multiple imputation.
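The contrast between regression weights and marginal quantities can be illustrated with a small simulation. The sketch below is not taken from the text: it uses Python with NumPy, and the sample size, the true coefficients (intercept 1, slope 2) and the logistic missingness model are arbitrary assumptions chosen for illustration. Missing values occur in \(Y\) only, with a probability that depends on the observed \(X\) (MAR).

```python
import numpy as np

rng = np.random.default_rng(2024)
n = 200_000

# Assumed complete-data model: y = 1 + 2x + noise
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# MAR: the probability that Y is missing increases with the observed X
p_miss = 1.0 / (1.0 + np.exp(-1.5 * x))
observed = rng.uniform(size=n) > p_miss

# Complete-case regression of Y on X (polyfit returns slope first)
slope_cc, intercept_cc = np.polyfit(x[observed], y[observed], deg=1)

# Marginal mean of Y: full data versus complete cases
mean_y_full = y.mean()
mean_y_cc = y[observed].mean()

print(f"complete-case slope:     {slope_cc:.3f}")
print(f"complete-case intercept: {intercept_cc:.3f}")
print(f"mean of Y, full data:      {mean_y_full:.3f}")
print(f"mean of Y, complete cases: {mean_y_cc:.3f}")
```

Under these assumptions the complete-case regression weights should land close to the true values, whereas the complete-case mean of \(Y\) should fall well below the full-data mean, because cases with large \(X\) (and hence large \(Y\)) are more likely to be dropped.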
The first special case occurs if the probability of being missing does not depend on \(Y\). Under the assumption that the complete-data model is correct, the regression coefficients are free of bias (Little 1992; King et al. 2001). This holds for any type of regression analysis, and for missing data in both \(Y\) and \(X\). Since the missing data rate may depend on \(X\), complete-case analysis will in fact work in a relevant class of MNAR models. White and Carlin (2010) confirmed the superiority of complete-case analysis by simulation. The differences were often small, and multiple imputation gained the upper hand as more predictive variables were included. The property is nevertheless useful in practice.
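A minimal simulation sketch of this first special case follows. It is an illustration under assumed settings rather than a reproduction of any cited study: the linear model, its coefficients and the logistic missingness mechanisms are hypothetical choices. In the first scenario \(X\) is missing with a probability that depends on \(X\) itself (an MNAR mechanism that does not involve \(Y\)). In the second scenario, included for contrast, the probability depends on \(Y\).

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000

# Assumed complete-data model: y = 1 + 2x + noise
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def complete_case_fit(keep):
    """Least-squares fit of y on x using only the retained rows."""
    slope, intercept = np.polyfit(x[keep], y[keep], deg=1)
    return slope, intercept


# Scenario 1: missingness depends on X itself (MNAR), but not on Y
keep_x = rng.uniform(size=n) > sigmoid(1.5 * x)
slope_x, intercept_x = complete_case_fit(keep_x)

# Scenario 2: missingness depends on Y, which violates the condition
keep_y = rng.uniform(size=n) > sigmoid(1.5 * y)
slope_y, intercept_y = complete_case_fit(keep_y)

print(f"depends on X: slope {slope_x:.3f}, intercept {intercept_x:.3f}")
print(f"depends on Y: slope {slope_y:.3f}, intercept {intercept_y:.3f}")
```

With these settings the first scenario should recover coefficients near 1 and 2, while the second should show a visibly attenuated slope, since selecting cases on the outcome distorts the conditional distribution of \(Y\) given \(X\).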
The second special case holds only if the complete-data model is logistic regression. Suppose that the missing data are confined either to a dichotomous \(Y\) or to \(X\), but not to both. Assuming that the model is correctly specified, the regression coefficients (except the intercept) from the complete-case analysis are unbiased if the probability of being missing depends only on \(Y\) and not on \(X\) (Vach 1994). This property provides the statistical basis of the estimation of the odds ratio from case-control studies in epidemiology. If missing data occur in both \(Y\) and \(X\), the property does not hold.
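The logistic regression case can also be sketched by simulation. The code below is an illustrative assumption, not the procedure of Vach (1994): the true coefficients (intercept -1, slope 0.8), the retention probabilities and the hand-written Newton-Raphson fitting routine `fit_logistic` are all hypothetical choices. Records with \(Y = 0\) lose their \(X\) value 80% of the time, records with \(Y = 1\) never do, so the probability of being missing depends on \(Y\) only.

```python
import numpy as np


def fit_logistic(x, y, n_iter=25):
    """Logistic regression of y on (1, x) by Newton-Raphson."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                        # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta  # [intercept, slope]


rng = np.random.default_rng(11)
n = 400_000

# Assumed complete-data model: logit P(Y = 1 | x) = -1 + 0.8 x
x = rng.normal(size=n)
p_y = 1.0 / (1.0 + np.exp(-(-1.0 + 0.8 * x)))
y = (rng.uniform(size=n) < p_y).astype(float)

# Retention depends on Y only: keep every Y = 1 record, 20% of Y = 0 records
p_keep = np.where(y == 1.0, 1.0, 0.2)
keep = rng.uniform(size=n) < p_keep

beta_cc = fit_logistic(x[keep], y[keep])

# Under outcome-dependent selection the intercept shifts by log(1.0 / 0.2)
expected_intercept = -1.0 + np.log(1.0 / 0.2)

print(f"complete-case slope:     {beta_cc[1]:.3f} (true 0.8)")
print(f"complete-case intercept: {beta_cc[0]:.3f} (true -1.0)")
print(f"shifted intercept implied by selection: {expected_intercept:.3f}")
```

In this setup the complete-case slope should stay near 0.8, while the intercept should move towards the true value plus the log of the ratio of retention probabilities. This is the same mechanism that makes the odds ratio estimable from case-control samples.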
At a minimum, application of listwise deletion should be a conscious decision of the analyst, and should preferably be accompanied by an explicit statement that the missing data fit in one of the three categories described above.
Other alternatives to multiple imputation were briefly reviewed in Section 1.6, and may work well in particular applications. However, none of these is as general as multiple imputation.