Tools for analyzing multiple imputed datasets
John B. Carlin, Ning Li, Philip Greenwood, and Carolyn Coffey
Clinical Epidemiology and Biostatistics Unit
Murdoch Children's Research Institute and
University of Melbourne Department of Paediatrics
Royal Children's Hospital, Parkville, Victoria 3052, Australia

Abstract. The method of multiple imputation (MI) is used increasingly for analyzing
datasets with missing observations. Two sets of tasks are required in order
to implement the method: (a) generating multiple complete datasets in which
missing values have been imputed by simulating from an appropriate
probability distribution and (b) analyzing the multiple imputed datasets and
combining complete-data inferences from them to form an overall inference
for parameters of interest. An increasing number of software tools are
available for task (a), although this task is difficult to automate because the
method of imputation should depend on the context and the available covariate
data. When the quantity of missing data is not great, the sensitivity of
results to the imputation model may be relatively low. In this context,
software tools that enable task (b) to be performed with similar ease to the
analysis of a single dataset should facilitate the wider use of multiple
imputation. Such tools need not only to implement techniques for inference
from multiple imputed datasets but also to allow standard manipulations such
as transformation and recoding of variables. In this article, we describe a
set of Stata commands that we have developed for manipulating and analyzing
multiple datasets.
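
As an illustration of task (b), the combination step usually known as Rubin's rules can be sketched in a few lines. The Python code below is a minimal sketch only and is not the Stata commands described in this article; the function name `combine_rubin` and the input numbers are hypothetical. It assumes that each of the m imputed datasets has already been analyzed and has yielded a point estimate and its squared standard error for a single scalar parameter.

```python
import math


def combine_rubin(estimates, variances):
    """Combine m complete-data results for one scalar parameter.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (overall estimate, overall standard error, degrees of freedom).
    Hypothetical helper for illustration; not part of any package.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching inputs")

    # Overall estimate: the mean of the complete-data estimates.
    q_bar = sum(estimates) / m
    # Within-imputation variance: the mean of the complete-data variances.
    w_bar = sum(variances) / m
    # Between-imputation variance: sample variance of the estimates.
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    # Total variance, inflated by (1 + 1/m) for the finite number of imputations.
    t = w_bar + (1 + 1 / m) * b

    # Degrees of freedom for the reference t distribution (Rubin, 1987).
    if b == 0:
        df = math.inf  # no between-imputation variation
    else:
        df = (m - 1) * (1 + w_bar / ((1 + 1 / m) * b)) ** 2

    return q_bar, math.sqrt(t), df


# Made-up results from m = 3 imputed datasets, for illustration only.
estimate, std_error, df = combine_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, std_error, df)
```

The overall standard error reflects both the average sampling uncertainty within each completed dataset and the variation between imputations, which is why analyzing a single imputed dataset on its own tends to understate uncertainty.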
Keywords: missing data, multiple imputation, Rubin's rule of combination, overall estimates