2.9 Exercises
Exercise 2.1 (Nomogram) Construct a graphic representation of Equation (2.27) that allows the user to convert between \(\lambda\) and \(\gamma\) for different values of \(\nu\). What influence does \(\nu\) have on the relation between \(\lambda\) and \(\gamma\)?
Exercise 2.2 (Models) Explain the difference between the response model and the imputation model.
Exercise 2.3 (Listwise deletion) In the airquality data, predict Ozone from Wind and Temp. Now randomly delete half of the wind data above 10 mph, and randomly delete half of the temperature data above 80\(^\circ\)F.
- Are the data MCAR, MAR or MNAR?
- Refit the model under listwise deletion. Do you notice a change in the estimates? What happens to the standard errors?
- Would you conclude that listwise deletion provides valid results here?
- If you add a quadratic term to the model, would that alter your conclusion?
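The data manipulation in Exercise 2.3 could be set up roughly as follows. This is only a sketch of the setup, not a solution: the seed value and the helper name delete_half_above() are illustrative assumptions, and "half" is taken to mean floor(n / 2) of the values above the cutoff.

```r
# Sketch of the setup for Exercise 2.3 (base R only).
# The seed and the helper name are illustrative assumptions.
set.seed(123)

# Set a random half of the entries of x that exceed `cutoff` to NA.
delete_half_above <- function(x, cutoff) {
  idx <- which(x > cutoff)  # which() ignores values that are already NA
  drop <- idx[sample.int(length(idx), size = floor(length(idx) / 2))]
  x[drop] <- NA
  x
}

aq <- airquality

# Reference fit before the extra deletions
# (Ozone already contains missing values in the original data).
fit_full <- lm(Ozone ~ Wind + Temp, data = aq)

aq$Wind <- delete_half_above(aq$Wind, 10)
aq$Temp <- delete_half_above(aq$Temp, 80)

# With the default na.action (na.omit), lm() drops every incomplete row,
# which amounts to listwise deletion.
fit_lwd <- lm(Ozone ~ Wind + Temp, data = aq)

# Compare estimates and standard errors side by side.
summary(fit_full)$coefficients
summary(fit_lwd)$coefficients
```

Using sample.int() on the index positions, rather than sample() on the indices themselves, avoids the known pitfall where sample() treats a single number as the range 1:n.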
Exercise 2.4 (Number of imputations) Consider the nhanes dataset in mice.
- Use the function ccn() to calculate the number of complete cases. What percentage of the cases is incomplete?
- Impute the data with mice using the defaults with seed=1, predict bmi from age, hyp and chl by the normal linear regression model, and pool the results. What are the proportions of variance due to the missing data for each parameter? Which parameters appear to be most affected by the nonresponse?
- Repeat the analysis for seed=2 and seed=3. Do the conclusions remain the same?
- Repeat the analysis with \(m=50\) with the same seeds. Would you prefer this analysis over those with \(m=5\)? Explain why.
Exercise 2.5 (Number of imputations (continued)) Continue with the data from the previous exercise.
- Write an R function that automates the calculations of the previous exercise. Let seed run from 1 to 100 and let m take on values m = c(3, 5, 10, 20, 30, 40, 50, 100, 200).
- Plot the estimated proportions of variance due to missing data for the age-parameter against \(m\). Based on this graph, how many imputations would you advise?
- Check White’s conditions 1 and 2 (cf. Section 2.8). For which \(m\) do these conditions hold?
- Does this also hold for categorical data? Use the nhanes2 data to study this.
Exercise 2.6 (Automated choice of \(m\)) Write an R function that implements the methods discussed in Section 2.8.