5.2 Parameter pooling

5.2.1 Scalar inference of normal quantities

Section 2.4 describes Rubin’s rules for pooling the results from the \(m\) complete-data analyses. These rules are based on the assumption that the parameter estimates \(\hat Q\) are normally distributed around the population value \(Q\) with a variance of \(U\). Many types of estimates are approximately normally distributed, e.g., means, standard deviations, regression coefficients, proportions and linear predictors. Rubin’s pooling rules can be applied directly to such quantities (Schafer 1997; Marshall, Billingham, and Bryan 2009).
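As a reminder of what the pooling step involves for a normally distributed scalar, the following is a minimal sketch in Python. The function name `pool_scalar` and the input numbers are hypothetical and chosen only for illustration. It assumes the standard form of Rubin's rules: the pooled estimate is the mean of the \(m\) estimates, and the total variance is the within-imputation variance plus \((1 + 1/m)\) times the between-imputation variance.

```python
from statistics import mean, variance


def pool_scalar(estimates, variances):
    """Pool m complete-data estimates and their variances by Rubin's rules.

    Hypothetical helper for illustration only.
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need m >= 2 estimates with matching variances")
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    t = u_bar + (1 + 1 / m) * b    # total variance
    return q_bar, u_bar, b, t


# Made-up estimates and variances from m = 3 imputed datasets.
q_bar, u_bar, b, t = pool_scalar([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(q_bar, u_bar, b, t)
```

The sketch stops at the total variance; interval estimates would additionally need the reference distribution and degrees of freedom discussed in Section 2.4.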

5.2.2 Scalar inference of non-normal quantities

How should we combine quantities with non-normal distributions: correlation coefficients, odds ratios, relative risks, hazard ratios, measures of explained variance and so on? The quality of the pooled estimate and the confidence intervals can be improved when pooling is done on a scale for which the distribution is close to normal. Thus, transforming toward normality, pooling on that scale, and back-transforming to the original scale improves statistical inference.

As an example, consider transforming a correlation coefficient \(\rho_\ell\) for \(\ell=1,\dots,m\) toward normality using the Fisher \(z\) transformation

\[ z_\ell = \frac{1}{2}\ln{\frac{1+\rho_\ell}{1-\rho_\ell}}\tag{5.1} \]

For large samples, the distribution of \(z_\ell\) is approximately normal with variance \(\sigma^2 = 1/(n-3)\), where \(n\) is the sample size. It is straightforward to calculate the pooled transformed correlation \(\bar z\) and its variance by Rubin’s rules. The result can be back-transformed by the inverse Fisher transformation

\[ \bar \rho = \frac{e^{2\bar z}-1}{e^{2\bar z}+1}\tag{5.2} \]

The confidence interval of \(\bar \rho\) is calculated in the \(z\)-scale as usual, and then back-transformed by Equation (5.2).
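The procedure for correlations can be sketched in Python as follows. The function name `pool_correlations` and the example values are hypothetical. The sketch uses `math.atanh` and `math.tanh`, which are equivalent to Equations (5.1) and (5.2), and takes \(1/(n-3)\) as the within-imputation variance of every \(z_\ell\). As a simplifying assumption it uses a normal quantile for the interval; under Rubin's rules one would usually take a \(t\) quantile with the appropriate degrees of freedom instead.

```python
import math
from statistics import NormalDist, mean, variance


def pool_correlations(correlations, n, level=0.95):
    """Pool m correlations on the Fisher z scale and back-transform.

    Hypothetical helper; n is the sample size of each completed dataset.
    """
    m = len(correlations)
    if m < 2:
        raise ValueError("need at least two imputed-data correlations")
    z = [math.atanh(r) for r in correlations]   # Equation (5.1)
    z_bar = mean(z)                             # pooled estimate on the z scale
    u_bar = 1.0 / (n - 3)                       # within-imputation variance
    b = variance(z)                             # between-imputation variance
    t = u_bar + (1 + 1 / m) * b                 # total variance
    # Assumption: normal quantile as a large-sample stand-in for the t quantile.
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = crit * math.sqrt(t)
    # Back-transform the estimate and both interval limits, Equation (5.2).
    return (math.tanh(z_bar),
            math.tanh(z_bar - half_width),
            math.tanh(z_bar + half_width))


# Made-up correlations from m = 5 imputed datasets of size n = 100.
rho_bar, lower, upper = pool_correlations([0.48, 0.52, 0.50, 0.47, 0.53], n=100)
print(rho_bar, lower, upper)
```

Because the interval is constructed on the \(z\) scale and then back-transformed, it is asymmetric around \(\bar \rho\) and always stays within \((-1, 1)\).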

Table 5.2: Suggested transformations toward normality for various types of statistics. The transformed quantities can be pooled by Rubin’s rules.

| Statistic | Transformation | Source |
|---|---|---|
| Correlation | Fisher \(z\) | Schafer (1997) |
| Odds ratio | Logarithm | Agresti (1990) |
| Relative risk | Logarithm | Agresti (1990) |
| Hazard ratio | Logarithm | Marshall, Billingham, and Bryan (2009) |
| Explained variance \(R^2\) | Fisher \(z\) on root | Harel (2009) |
| Survival probabilities | Complementary log-log | Marshall, Billingham, and Bryan (2009) |
| Survival distribution | Logarithm | Marshall, Billingham, and Bryan (2009) |

Table 5.2 suggests transformations toward approximate normality for various types of statistics. There are also quantities for which the sampling distribution is complex or unknown. Examples include the Cramér \(C\) statistic (Brand 1999) and the discrimination index (Marshall, Billingham, and Bryan 2009). Ideally, the entire sampling distribution should be pooled in such cases, but the corresponding pooling methods have yet to be developed. The current advice is to search for an ad hoc transformation that brings the sampling distribution close to normal, and then apply Rubin’s rules.
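For the statistics in Table 5.2 that are pooled after a logarithmic transformation, the same pattern applies. The Python sketch below illustrates it for odds ratios. The function name `pool_odds_ratios` and the numbers are hypothetical, and the sketch assumes that each complete-data analysis supplies the standard error of the log odds ratio, as logistic regression output typically does.

```python
import math
from statistics import mean, variance


def pool_odds_ratios(odds_ratios, se_log_or):
    """Pool m odds ratios on the log scale by Rubin's rules.

    Hypothetical helper; se_log_or holds the standard errors of the
    log odds ratios from the m complete-data analyses.
    """
    m = len(odds_ratios)
    if m < 2 or len(se_log_or) != m:
        raise ValueError("need m >= 2 odds ratios with matching standard errors")
    log_or = [math.log(x) for x in odds_ratios]   # transform toward normality
    q_bar = mean(log_or)                          # pooled log odds ratio
    u_bar = mean([se ** 2 for se in se_log_or])   # within-imputation variance
    b = variance(log_or)                          # between-imputation variance
    t = u_bar + (1 + 1 / m) * b                   # total variance, log scale
    return math.exp(q_bar), t                     # back-transform the estimate


# Made-up odds ratios and standard errors from m = 3 imputed datasets.
pooled_or, total_var_log = pool_odds_ratios([1.8, 2.1, 2.4], [0.30, 0.28, 0.31])
print(pooled_or, total_var_log)
```

Note that the back-transformed estimate is the geometric mean of the \(m\) odds ratios, not their arithmetic mean, and that the total variance returned here refers to the log scale.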