Materials for RECAP workshop 4-5 Sept 2017
RECAP workshop: Statistical Methods for combined data sets
==========================================================
This site contains materials for the RECAP workshop Statistical Methods for combined data sets: Theory, techniques and tools on September 4-5, 2017 in Leiden.
Combining data sets generates blocks of missing data. However, most data analysis procedures are designed for complete data, and many will fail if the data contain missing values. Most procedures will therefore simply ignore any incomplete rows in the data, or revert to ad-hoc procedures like replacing missing values with some sort of “best value”. However, such fixes are based on assumptions, and may introduce serious biases when these assumptions are not met.
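To make the point concrete, here is a minimal sketch in base R. The two studies, their variable names and their values are hypothetical and made up for illustration only. It shows how stacking data sets that measured different variables produces blocks of missing values, and how `lm()` then quietly drops the incomplete rows.

```r
# Two hypothetical studies that measured partly different variables
study_a <- data.frame(id = 1:3, age = c(31, 45, 52), bmi = c(22.1, 27.4, 25.0))
study_b <- data.frame(id = 4:6, age = c(38, 60, 47), sbp = c(118, 141, 129))

# Stack the studies: a variable not measured in a study becomes NA there
study_a$sbp <- NA_real_
study_b$bmi <- NA_real_
combined <- rbind(study_a, study_b[, names(study_a)])

colSums(is.na(combined))       # missing values per variable (blocks in bmi and sbp)
sum(complete.cases(combined))  # number of fully observed rows

# lm() drops incomplete rows under the default na.action (na.omit)
fit <- lm(bmi ~ age, data = combined)
nobs(fit)                      # number of rows actually used in the fit
```

In this toy example no row is completely observed, and the regression uses only the rows from the first study, without any warning.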
This workshop reviews practical issues with combining data, and explores the use of multiple imputation as a principled solution.
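As a rough preview of what multiple imputation looks like in practice, the sketch below runs the usual impute, analyse, pool sequence with the mice package. It assumes mice is installed and uses the small `nhanes` example data that ships with it. The analysis model, the number of imputations and the seed are arbitrary choices for illustration, not the workshop's own material.

```r
library(mice)

# nhanes: small example data set shipped with mice, containing missing values
imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)  # 1. create 5 imputed data sets

fit <- with(imp, lm(bmi ~ age + chl))  # 2. fit the same model on each completed data set

pooled <- pool(fit)                    # 3. combine the estimates with Rubin's rules
summary(pooled)
```

The pooled standard errors reflect both the sampling variability and the extra uncertainty caused by the missing values.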
The workshop consists of 6 sessions, each of which comprises a lecture followed by a computer practical using R and the mice package.
Please remember to bring your own laptop computer and make sure that you have write-access to that machine (some corporate computers do not allow write access) or that you have the following software and packages pre-installed.

- R, from the R-Project website.
- RStudio Desktop (Free License), from RStudio’s website. This is not necessary, per se, but it is highly recommended as RStudio delivers a tremendous improvement to the user experience of base R.
- The R packages markdown, mice, lme4, dplyr, plyr and mlmRev.

To install the packages from within RStudio, navigate to Tools > Install Packages in the upper menu and enter the names of the packages into the Packages field. Make sure that the button Install dependencies is selected. Once done, click Install and you’re all set.

Alternatively, from within R or RStudio, copy, paste and enter the following code in the console window (by default the bottom-left window in RStudio / the only window in R):

    install.packages(c("markdown", "mice", "lme4", "dplyr", "plyr", "mlmRev"))
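
To check before the workshop that everything is in place, something along the following lines can be run in the console. This is only a sketch: the variable names `pkgs`, `installed` and `missing_pkgs` are made up for this example, and the check only tests whether each package can be loaded.

```r
# Packages needed for the practicals
pkgs <- c("markdown", "mice", "lme4", "dplyr", "plyr", "mlmRev")

# TRUE for every package that can be loaded, FALSE otherwise
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
installed

# Report anything that still needs to be installed
missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Still to install: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All packages are installed.")
}
```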