7.8 Imputation of level-2 variable
The typical fix for missing values in a level-2 predictor is to delete all records in the cluster. Despite its potential impact on the analyses, the problem of incomplete level-2 predictors has thus far received less attention than missingness in level-1 predictors.
Some authors studied the use of (inappropriate) single-level imputation methods that ignore the hierarchical group structure in multilevel data. With such methods, standard errors are underestimated, leading to confidence intervals that are too short. Early attempts to solve the problem with multiple imputation (Gibson and Olejnik 2003; Cheung 2007) were not successful.
Imputation methods for level-2 predictors should assign the same imputed value to all members within the same class. More recent attempts create two datasets, one with level-1 data, and one with level-2 data, and do separate imputations within each dataset while using the results from one in the other. Of course, these steps can be iterated (Gelman and Hill 2007; Grund, Lüdtke, and Robitzsch 2018a).
The `mice` package contains several functions whose names start with `mice.impute.2lonly`. Method `2lonly.mean` fills in the class mean, and is primarily useful to repair errors in the data. Methods `2lonly.norm` and `2lonly.pmm` aggregate level-1 predictors, and impute the level-2 variables by the normal model and by predictive mean matching, respectively. The `miceadds` package contains two generic functions. The method `2lonly.function` allows the user to specify any univariate imputation function designed for level-1 data at level-2.
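As an illustration, the sketch below applies `2lonly.pmm` to a small simulated two-level dataset. The data and the variable names (`class_id`, `x`, `y`, `w`) are hypothetical and are not taken from the text. The example assumes that the cluster identifier is an integer and is flagged with `-2` in the predictor matrix, which is how the `2lonly` methods in `mice` locate the grouping variable.

```r
library(mice)

# Simulated two-level data (hypothetical): 40 classes of 10 pupils each
set.seed(123)
n_class <- 40
n_pup <- 10
class_id <- rep(seq_len(n_class), each = n_pup)
w_class <- rnorm(n_class)                               # level-2 predictor
x <- rnorm(n_class * n_pup) + 0.5 * w_class[class_id]   # level-1 predictor
y <- 1 + 0.5 * x + 0.7 * w_class[class_id] +
  rnorm(n_class)[class_id] + rnorm(n_class * n_pup)     # level-1 outcome

# Make the level-2 predictor missing for 10 entire classes
w <- w_class[class_id]
w[class_id %in% sample(n_class, 10)] <- NA
dat <- data.frame(class_id = class_id, x = x, y = y, w = w)

# Impute w at level 2 by predictive mean matching
meth <- make.method(dat)
meth["w"] <- "2lonly.pmm"
pred <- make.predictorMatrix(dat)
pred["w", "class_id"] <- -2    # -2 marks the cluster identifier

imp <- mice(dat, method = meth, predictorMatrix = pred,
            m = 5, maxit = 1, printFlag = FALSE, seed = 1)

# Every pupil in a class should carry the same imputed value
completed <- complete(imp, 1)
n_unique <- tapply(completed$w, completed$class_id,
                   function(v) length(unique(v)))
all(n_unique == 1)
```

Because `w` is the only incomplete variable, a single iteration (`maxit = 1`) suffices here. With several incomplete variables at different levels, more iterations would normally be needed.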
It is conceptually straightforward to extend imputations to higher levels (Yucel 2008). If there are two levels, combine all level-2 predictors with an aggregate (e.g., the cluster means) of the level-1 predictors and the level-1 outcomes. Once we have this, we may choose suitable methods from Chapter 3 to impute the missing level-2 variables in the usual way. No new issues arise.
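The aggregation strategy for two levels can also be carried out by hand. The sketch below, again on simulated data with hypothetical variable names, builds a level-2 dataset from the cluster means of the level-1 variables, imputes the level-2 variable with the single-level method `norm`, and merges the completed values back into the level-1 data. It is one possible way to follow the steps, not a prescribed implementation.

```r
library(mice)

# Simulated two-level data (hypothetical): 50 classes of 8 pupils each
set.seed(2024)
n_class <- 50
n_pup <- 8
class_id <- rep(seq_len(n_class), each = n_pup)
w_true <- rnorm(n_class)                               # level-2 predictor
x <- rnorm(n_class * n_pup) + 0.5 * w_true[class_id]   # level-1 predictor
y <- 0.5 * x + 0.7 * w_true[class_id] + rnorm(n_class * n_pup)
w_obs <- w_true
w_obs[sample(n_class, 12)] <- NA                       # 12 classes lack w
level1 <- data.frame(class_id = class_id, x = x, y = y)

# Step 1: aggregate the level-1 predictor and outcome to cluster means
level2 <- aggregate(level1[c("x", "y")],
                    by = list(class_id = level1$class_id), FUN = mean)
names(level2)[names(level2) == "x"] <- "x_mean"
names(level2)[names(level2) == "y"] <- "y_mean"

# Step 2: attach the incomplete level-2 variable (one row per class)
level2$w <- w_obs[level2$class_id]

# Step 3: impute at level 2 with an ordinary single-level method
meth <- make.method(level2)
meth["w"] <- "norm"
pred <- make.predictorMatrix(level2)
pred[, "class_id"] <- 0        # the identifier is not a predictor
imp2 <- mice(level2, method = meth, predictorMatrix = pred,
             m = 5, printFlag = FALSE, seed = 1)

# Step 4: copy the completed level-2 values back to the pupils
level2_done <- complete(imp2, 1)
full <- merge(level1, level2_done[c("class_id", "w")], by = "class_id")
```

Merging on `class_id` guarantees that all pupils in a class receive the same imputed value. In practice, Step 4 would be repeated for each of the `m` completed datasets.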
Method `ml.lmer` from `miceadds` implements a generalization to three or more levels. It also allows imputation at the lowest level (and any other level) with an arbitrary specification of (additive) random effects. This includes general nested models, cross-classified models, the ability to include cluster means at any level of clustering, and the specification of random slopes at any level of clustering. Table 7.4 lists the various methods.
| Package | Method | Description |
|---|---|---|
| Level-2 | | |
| mice | 2lonly.mean | level-2 manifest class mean |
| miceadds | 2l.groupmean | level-2 manifest class mean |
| miceadds | 2l.latentgroupmean | level-2 latent class mean |
| mice | 2lonly.norm | level-2 class normal |
| mice | 2lonly.pmm | level-2 class pmm |
| miceadds | 2lonly.function | level-2 class, generic |
| miceadds | ml.lmer | \(\geq 2\) levels, generic |