2.9 Exercises
Exercise 2.1 (Nomogram) Construct a graphic representation of Equation (2.27) that allows the user to convert between \(\lambda\) and \(\gamma\) for different values of \(\nu\). What influence does \(\nu\) have on the relation between \(\lambda\) and \(\gamma\)?
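As a possible starting point, the sketch below evaluates the relation numerically and draws one line per value of \(\nu\). It assumes that Equation (2.27), which is not reproduced in this section, has the form \(\gamma = \frac{\nu+1}{\nu+3}\lambda + \frac{2}{\nu+3}\). The helper name gamma_from_lambda and the grid of \(\nu\) values are illustrative choices, not taken from the text.

```r
# Assumed form of Equation (2.27):
#   gamma = (nu + 1) / (nu + 3) * lambda + 2 / (nu + 3)
# gamma_from_lambda() is a hypothetical helper name.
gamma_from_lambda <- function(lambda, nu) {
  (nu + 1) / (nu + 3) * lambda + 2 / (nu + 3)
}

lambda <- seq(0, 1, by = 0.01)
nu_values <- c(2, 5, 10, 25, 100)  # illustrative choices

# Empty plotting region, then one line per value of nu
plot(NULL, xlim = c(0, 1), ylim = c(0, 1),
     xlab = expression(lambda), ylab = expression(gamma))
for (i in seq_along(nu_values)) {
  lines(lambda, gamma_from_lambda(lambda, nu_values[i]), lty = i)
}
abline(0, 1, col = "grey")  # reference line gamma = lambda
legend("bottomright", legend = paste("nu =", nu_values),
       lty = seq_along(nu_values), bty = "n")
```

Reading the graph horizontally or vertically converts one quantity into the other; interpreting how the lines move with \(\nu\) is left as part of the exercise.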
Exercise 2.2 (Models) Explain the difference between the response model and the imputation model.
Exercise 2.3 (Listwise deletion) In the airquality data, predict Ozone from Wind and Temp. Now randomly delete half of the wind data above 10 mph, and randomly delete half of the temperature data above 80\(^\circ\)F.
- Are the data MCAR, MAR or MNAR?
- Refit the model under listwise deletion. Do you notice a change in the estimates? What happens to the standard errors?
- Would you conclude that listwise deletion provides valid results here?
- If you add a quadratic term to the model, would that alter your conclusion?
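The data manipulation in this exercise can be set up along the following lines. This is a minimal sketch in base R: the seed is arbitrary, and delete_half is a hypothetical helper written for this illustration, not a function from any package.

```r
set.seed(123)  # arbitrary seed, chosen only for reproducibility

# Set a random half of the values above `threshold` to NA.
# delete_half() is a hypothetical helper, not a package function.
delete_half <- function(x, threshold) {
  idx <- which(!is.na(x) & x > threshold)
  drop <- idx[sample.int(length(idx), size = floor(length(idx) / 2))]
  x[drop] <- NA
  x
}

aq_mis <- airquality
aq_mis$Wind <- delete_half(airquality$Wind, 10)  # mph
aq_mis$Temp <- delete_half(airquality$Temp, 80)  # degrees Fahrenheit

# With the default na.action (na.omit), lm() drops incomplete rows,
# which amounts to listwise deletion. Ozone already has missing values
# in the original data, so fit_full also omits those rows.
fit_full <- lm(Ozone ~ Wind + Temp, data = airquality)
fit_lwd  <- lm(Ozone ~ Wind + Temp, data = aq_mis)

# Estimates and standard errors side by side
cbind(full    = coef(summary(fit_full))[, "Estimate"],
      lwd     = coef(summary(fit_lwd))[, "Estimate"],
      full_se = coef(summary(fit_full))[, "Std. Error"],
      lwd_se  = coef(summary(fit_lwd))[, "Std. Error"])
```

Replacing the formula with, for example, Ozone ~ Wind + Temp + I(Temp^2) gives one way to explore the quadratic term; which term to square is left open by the exercise.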
Exercise 2.4 (Number of imputations) Consider the nhanes dataset in mice.
- Use the function ccn() to calculate the number of complete cases. What percentage of the cases is incomplete?
- Impute the data with mice using the defaults with seed=1, predict bmi from age, hyp and chl by the normal linear regression model, and pool the results. What are the proportions of variance due to the missing data for each parameter? Which parameters appear to be most affected by the nonresponse?
- Repeat the analysis for seed=2 and seed=3. Do the conclusions remain the same?
- Repeat the analysis with \(m=50\) with the same seeds. Would you prefer this analysis over those with \(m=5\)? Explain why.
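The impute, analyze and pool steps of this exercise might look as follows. The sketch assumes a recent version of mice (3.x), in which the object returned by pool() carries a table named pooled with columns lambda and fmi. The name of the complete-case counter differs between mice versions, so the count is computed with base R here.

```r
library(mice)  # assumes a recent mice (3.x) is installed

# Share of incomplete cases, computed with base R
n_complete <- sum(complete.cases(nhanes))
pct_incomplete <- 100 * (1 - n_complete / nrow(nhanes))

# Impute with the defaults (m = 5), fit the model on each completed
# dataset, and pool the estimates with Rubin's rules.
imp <- mice(nhanes, seed = 1, printFlag = FALSE)
fit <- with(imp, lm(bmi ~ age + hyp + chl))
est <- pool(fit)

# lambda: proportion of total variance attributable to the missing data
# fmi:    fraction of missing information
# One row per model parameter (intercept, age, hyp, chl).
est$pooled[, c("lambda", "fmi")]
```

Changing the seed argument, or adding m = 50 to the mice() call, covers the remaining parts of the exercise.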
Exercise 2.5 (Number of imputations (continued)) Continue with the data from the previous exercise.
- Write an R function that automates the calculations of the previous exercise. Let seed run from 1 to 100 and let m take on values m = c(3, 5, 10, 20, 30, 40, 50, 100, 200).
- Plot the estimated proportions of explained variance due to missing data for the age parameter against \(m\). Based on this graph, how many imputations would you advise?
- Check White’s conditions 1 and 2 (cf. Section 2.8). For which \(m\) do these conditions hold?
- Does this also hold for categorical data? Use the nhanes2 data to study this.
Exercise 2.6 (Automated choice of \(m\)) Write an R function that implements the methods discussed in Section 2.8.