7.8 Imputation of level-2 variables

The typical fix for missing values in a level-2 predictor is to delete all records in the cluster. Despite its potential impact on the analyses, the problem of incomplete level-2 predictors has thus far received less attention than missingness in level-1 predictors.

Some authors studied the use of (inappropriate) single-level imputation methods that ignore the hierarchical group structure in multilevel data. With such methods, standard errors are underestimated, leading to confidence intervals that are too short. Early attempts to solve the problem with multiple imputation (Gibson and Olejnik 2003; Cheung 2007) were not successful.

Imputation methods for level-2 predictors should assign the same imputed value to all members within the same class. More recent attempts create two datasets, one with level-1 data, and one with level-2 data, and do separate imputations within each dataset while using the results from one in the other. Of course, these steps can be iterated (Gelman and Hill 2007; Grund, Lüdtke, and Robitzsch 2018a).
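The two-dataset idea can be sketched in a few lines of plain Python. This is only an illustration of the structural constraint (one value per class, copied to all its members), not the algorithm of any package. The variable names (school, score, ses_school) and the helper functions are hypothetical, and the fill-in step is a deliberately crude placeholder (the mean of the observed level-2 values) that stands in for a proper imputation model.

```python
# Long-format level-1 data: one row per pupil. `ses_school` is a level-2
# variable, so it is constant within a school (None marks a missing value).
rows = [
    {"school": "A", "score": 52.0, "ses_school": 1.2},
    {"school": "A", "score": 47.0, "ses_school": 1.2},
    {"school": "B", "score": 61.0, "ses_school": None},
    {"school": "B", "score": 58.0, "ses_school": None},
    {"school": "C", "score": 40.0, "ses_school": 0.4},
]


def to_level2(rows, cluster, var):
    """Collapse the long data to one value per cluster for a level-2 variable."""
    level2 = {}
    for r in rows:
        # Keep the first observed value seen for the cluster, if any.
        if level2.get(r[cluster]) is None:
            level2[r[cluster]] = r[var]
    return level2


def broadcast(rows, cluster, var, level2):
    """Copy level-2 values back so that all members of a cluster agree."""
    return [{**r, var: level2[r[cluster]]} for r in rows]


# Step 1: build the level-2 dataset (one entry per school).
level2 = to_level2(rows, "school", "ses_school")

# Step 2: impute at level 2. Placeholder only: mean of the observed values.
observed = [v for v in level2.values() if v is not None]
fill = sum(observed) / len(observed)
level2_imputed = {k: (fill if v is None else v) for k, v in level2.items()}

# Step 3: merge the completed level-2 values back into the level-1 data.
completed = broadcast(rows, "school", "ses_school", level2_imputed)
```

Because the imputation happens on the one-row-per-school dataset, both pupils in school B necessarily receive the same value after the merge.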

The mice package contains several functions whose names start with mice.impute.2lonly. Method 2lonly.mean fills in the class mean, and is primarily useful to repair errors in the data. Methods 2lonly.norm and 2lonly.pmm aggregate the level-1 predictors, and impute the level-2 variable by the normal model and by predictive mean matching, respectively. The miceadds package contains two generic functions, 2lonly.function and ml.lmer. The method 2lonly.function allows the user to specify any univariate imputation function designed for level-1 data at level-2.
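To make the class-mean repair concrete, here is a minimal Python sketch of the idea behind filling in the class mean. It is not the mice implementation: the function name fill_class_mean and the example data are hypothetical, and the sketch assumes a numeric variable with None marking missing entries.

```python
from collections import defaultdict


def fill_class_mean(rows, cluster, var):
    """Replace missing values of `var` by the mean of the observed values
    in the same cluster. Clusters without any observed value stay missing."""
    observed = defaultdict(list)
    for r in rows:
        if r[var] is not None:
            observed[r[cluster]].append(r[var])

    repaired = []
    for r in rows:
        new = dict(r)  # copy, so the input rows are left untouched
        values = observed[r[cluster]]
        if new[var] is None and values:
            new[var] = sum(values) / len(values)
        repaired.append(new)
    return repaired


# School size was recorded for two pupils of school A but not for the third;
# school B has no recorded size at all.
pupils = [
    {"school": "A", "size": 300.0},
    {"school": "A", "size": 300.0},
    {"school": "A", "size": None},
    {"school": "B", "size": None},
]
repaired = fill_class_mean(pupils, "school", "size")
```

In this example the third pupil of school A receives the value 300.0, while school B remains missing because the class contains no observed value to copy. That is why this kind of fill is a data repair rather than a model-based imputation.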

It is conceptually straightforward to extend imputations to higher levels (Yucel 2008). If there are two levels, combine all level-2 predictors with an aggregate (e.g., the cluster means) of the level-1 predictors and the level-1 outcomes. Once this cluster-level dataset is assembled, we may choose suitable methods from Chapter 3 to impute the missing level-2 variables in the usual way. No new issues arise.
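The aggregate-then-impute recipe can be sketched as follows, under strong simplifying assumptions. The sketch uses a single aggregate (the cluster mean of one level-1 variable) and a simple linear regression with normally distributed residual noise. Unlike a proper imputation by the normal model, it does not draw the regression parameters from their posterior, so it understates uncertainty. All names (pupil_scores, school_ses, cluster_means, impute_level2) are hypothetical.

```python
import random
from collections import defaultdict


def cluster_means(pairs):
    """Aggregate level-1 (cluster, value) pairs to one mean per cluster."""
    totals = defaultdict(lambda: [0.0, 0])
    for cluster, value in pairs:
        if value is not None:
            totals[cluster][0] += value
            totals[cluster][1] += 1
    return {k: s / n for k, (s, n) in totals.items()}


def impute_level2(level2, xbar, rng):
    """Impute missing level-2 values from the cluster aggregate `xbar`.

    Fits y = intercept + slope * xbar on the complete clusters and adds
    normal residual noise to each prediction (parameters are NOT redrawn).
    Needs at least three complete clusters with varying xbar.
    """
    obs = [(xbar[k], v) for k, v in level2.items() if v is not None]
    n = len(obs)
    mean_x = sum(x for x, _ in obs) / n
    mean_y = sum(y for _, y in obs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in obs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in obs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in obs)
    sigma = (rss / (n - 2)) ** 0.5  # residual standard deviation
    return {
        k: (intercept + slope * xbar[k] + rng.gauss(0.0, sigma)) if v is None else v
        for k, v in level2.items()
    }


# Level-1 data: (school, pupil score). Level-2 data: school SES, missing for D.
pupil_scores = [("A", 52.0), ("A", 48.0), ("B", 61.0), ("B", 59.0),
                ("C", 40.0), ("C", 44.0), ("D", 55.0), ("D", 57.0), ("E", 50.0)]
school_ses = {"A": 1.0, "B": 1.9, "C": 0.3, "D": None, "E": 1.1}

xbar = cluster_means(pupil_scores)               # one aggregate per school
completed_ses = impute_level2(school_ses, xbar, random.Random(1))
```

The imputation is carried out on a dataset with one row per school, so the completed value for school D can afterwards be merged back to every pupil in that school. Repeating the call with different seeds would produce the multiple imputations.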

Method ml.lmer from miceadds implements a generalization to three or more levels. It also allows imputation at the lowest level (and any other level) with an arbitrary specification of (additive) random effects. This includes general nested models, cross-classified models, the ability to include cluster means at any level of clustering, and the specification of random slopes at any level of clustering. Table 7.4 lists the various methods.

Table 7.4: Overview of mice.impute.[method] functions to perform univariate multilevel imputation.

Package    Method              Description
Level-2
mice       2lonly.mean         level-2 manifest class mean
miceadds   2l.groupmean        level-2 manifest class mean
miceadds   2l.latentgroupmean  level-2 latent class mean
mice       2lonly.norm         level-2 class normal
mice       2lonly.pmm          level-2 class pmm
miceadds   2lonly.function     level-2 class, generic
miceadds   ml.lmer             \(\geq 2\) levels, generic