Broken stick model for irregular longitudinal data


Many longitudinal studies collect data that have irregular observation times, often requiring the application of linear mixed models with time-varying outcomes. This paper presents an alternative that splits the quantitative analysis into two steps. The first step converts irregularly observed data into a set of repeated measures through the broken stick model. The second step estimates the parameters of scientific interest from the repeated measurements at the subject level. The broken stick model approximates each subject’s trajectory by a series of connected straight lines. The breakpoints, specified by the user, divide the time axis into consecutive intervals common to all subjects. We restrict the methodology to just three variables: time, measurement and subject. The model is a special case of the linear mixed model, with time as a linear B-spline and with subject as the grouping factor. The main assumptions are: Subjects are exchangeable, trajectories between two breakpoints are all straight, random effects follow a multivariate normal distribution, and unobserved data are missing at random (MAR). The pkgbrokenstick R package offers tools to calculate, predict, impute and visualise broken stick estimates. The package supports two optimisation methods, including options to constrain the variance-covariance matrix of the random effects. We demonstrate a few applications of the model: detection of critical periods, estimation of the time-to-time correlations, profile analysis, curve interpolation, multiple imputation and personalised prediction of future outcomes by curve matching.

Journal of Statistcal Software
Stef van Buuren
Stef van Buuren

My research interests include data science, missing data, child growth and development, and measurement.