Repeated Measurements and Imputation
Description:
Missing values are a major problem in many data sets, because every subject or patient is valuable for a study. If too many subjects have to be excluded from an analysis because of missing values, the study may be underpowered, especially if it is small.
Several approaches currently exist for replacing missing values by imputation. However, particularly for longitudinal data and other designs with repeated measurements, it remains difficult to use the information from preceding and subsequent measurements in a way that yields an adequate variance estimate. One approach is simple regression, which estimates the missing value from the preceding and subsequent values. Its disadvantage is that later calculations underestimate the variance of the imputed variables, because the imputed values do not scatter but lie exactly on the regression line.
In this project, current methods and approaches for imputation in longitudinal data will be collected, presented and published in the form of a review. Subsequently, the most popular approaches will be tested in simulations on fictitious data sets of different sizes and with different percentages of missing values. A new approach that combines multiple imputation with regression approaches and case-specific error terms will also be developed.
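The variance problem of regression imputation described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the project's method: the data-generating model (a shared subject effect plus independent noise at three visits), the sample size, the 40% missingness rate and all variable names are hypothetical. It compares plain regression imputation with stochastic regression imputation, a standard variant that adds a random residual to each predicted value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical longitudinal data: a subject effect shared across three visits
# plus independent measurement noise at each visit.
n = 2000
subject = rng.normal(0.0, 1.0, size=n)
t_prev = subject + rng.normal(0.0, 1.0, size=n)
t_mid = subject + rng.normal(0.0, 1.0, size=n)
t_next = subject + rng.normal(0.0, 1.0, size=n)

# Delete 40% of the middle visit completely at random (an assumption).
missing = rng.random(n) < 0.4
observed = ~missing

# Fit t_mid ~ t_prev + t_next by ordinary least squares on the observed cases.
X = np.column_stack([np.ones(n), t_prev, t_next])
beta, *_ = np.linalg.lstsq(X[observed], t_mid[observed], rcond=None)
pred = X @ beta
resid_sd = np.std(t_mid[observed] - pred[observed], ddof=X.shape[1])

# Deterministic regression imputation: imputed values lie exactly on the fit.
t_det = np.where(missing, pred, t_mid)

# Stochastic regression imputation: add a random residual to each prediction.
noise = rng.normal(0.0, resid_sd, size=n)
t_sto = np.where(missing, pred + noise, t_mid)

var_true = t_mid.var(ddof=1)
var_det = t_det.var(ddof=1)
var_sto = t_sto.var(ddof=1)

print(f"variance, complete data:          {var_true:.3f}")
print(f"variance, regression imputation:  {var_det:.3f}")
print(f"variance, stochastic imputation:  {var_sto:.3f}")
```

In this setup the deterministic variant should report a clearly smaller variance than the complete data, while the stochastic variant should come close to it. A single stochastic imputation still ignores the uncertainty in the fitted coefficients, which is one motivation for multiple imputation.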
The quality of the imputation will be assessed by comparing the parameters calculated from the imputed data sets with those calculated from the complete data set. The results of this work can help to better understand the errors and inaccuracies that arise from imputation. The project will examine which method produces estimates that most closely match the values calculated from the original data set.
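The comparison against the complete data set could be organised as a simulation grid over sample sizes and missingness rates. The sketch below is only an illustration under stated assumptions: the data model, the grid values, the number of replications, the placeholder imputation method (mean of the neighbouring visits) and the chosen parameter (standard deviation of the middle visit) are all hypothetical and stand in for whatever methods and parameters the project selects.

```python
import numpy as np


def simulate(n, rng):
    """Hypothetical data: shared subject effect plus noise at three visits."""
    subject = rng.normal(0.0, 1.0, size=n)
    return subject[:, None] + rng.normal(0.0, 1.0, size=(n, 3))


def impute_neighbour_mean(data, missing):
    """Placeholder method: fill a missing middle visit with the mean of
    the preceding and subsequent visits of the same subject."""
    out = data.copy()
    out[missing, 1] = (data[missing, 0] + data[missing, 2]) / 2.0
    return out


def relative_sd_error(complete, imputed):
    """Relative deviation of the middle-visit SD from its complete-data value."""
    sd_full = complete[:, 1].std(ddof=1)
    sd_imp = imputed[:, 1].std(ddof=1)
    return (sd_imp - sd_full) / sd_full


rng = np.random.default_rng(1)
results = {}
for n in (50, 200, 1000):              # data set sizes (assumed grid)
    for rate in (0.1, 0.3, 0.5):       # share of missing values (assumed grid)
        errors = []
        for _ in range(200):           # replications per grid cell
            complete = simulate(n, rng)
            missing = rng.random(n) < rate
            imputed = impute_neighbour_mean(complete, missing)
            errors.append(relative_sd_error(complete, imputed))
        results[(n, rate)] = float(np.mean(errors))

for (n, rate), err in sorted(results.items()):
    print(f"n={n:5d}  missing={rate:.0%}  mean relative SD error={err:+.3f}")
```

Other imputation methods can be compared by swapping the `impute_neighbour_mean` function, and other parameters (means, correlations, regression coefficients) by replacing `relative_sd_error`.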