Datasets from Allison's missing data book (Allison 2002).
Data data.allison.gssexp:
'data.frame': 2991 obs. of 14 variables:
 $ AGE     : num 33 59 NA 59 21 22 40 25 41 45 ...
 $ EDUC    : num 12 12 12 8 13 15 9 12 12 12 ...
 $ FEMALE  : num 1 0 1 0 1 1 1 0 1 1 ...
 $ SPANKING: num 1 1 2 2 NA 1 3 1 1 NA ...
 $ INCOM   : num 11.2 NA 16.2 18.8 13.8 ...
 $ NOCHILD : num 0 0 0 0 1 1 0 0 0 0 ...
 $ NODOUBT : num NA NA NA 1 NA NA 1 NA NA 1 ...
 $ NEVMAR  : num 0 0 0 0 1 1 0 1 0 0 ...
 $ DIVSEP  : num 1 0 0 0 0 0 0 0 0 1 ...
 $ WIDOW   : num 0 0 0 0 0 0 1 0 1 0 ...
 $ BLACK   : num 1 1 1 0 1 1 0 1 1 1 ...
 $ EAST    : num 1 1 1 1 1 1 1 1 1 1 ...
 $ MIDWEST : num 0 0 0 0 0 0 0 0 0 0 ...
 $ SOUTH   : num 0 0 0 0 0 0 0 0 0 0 ...
Data data.allison.hip:
'data.frame': 880 obs. of 7 variables:
 $ SID : num 1 1 1 1 2 2 2 2 9 9 ...
 $ WAVE: num 1 2 3 4 1 2 3 4 1 2 ...
 $ ADL : num 3 2 3 3 3 1 2 1 3 3 ...
 $ PAIN: num 0 5 0 0 0 1 5 NA 0 NA ...
 $ SRH : num 2 4 2 2 4 1 1 2 2 3 ...
 $ WALK: num 1 0 0 0 0 0 0 0 1 NA ...
 $ CESD: num 9 28 31 11.6 NA ...
Data data.allison.usnews:
'data.frame': 1302 obs. of 7 variables:
 $ CSAT   : num 972 961 NA 881 NA ...
 $ ACT    : num 20 22 NA 20 17 20 21 NA 24 26 ...
 $ STUFAC : num 11.9 10 9.5 13.7 14.3 32.8 18.9 18.7 16.7 14 ...
 $ GRADRAT: num 15 NA 39 NA 40 55 51 15 69 72 ...
 $ RMBRD  : num 4.12 3.59 4.76 5.12 2.55 ...
 $ PRIVATE: num 1 0 0 0 0 1 0 0 0 1 ...
 $ LENROLL: num 4.01 6.83 4.49 7.06 6.89 ...
The datasets were downloaded from http://www.ats.ucla.edu/stat/examples/md/.
Allison, P. D. (2002). Missing data. Newbury Park, CA: Sage.
if (FALSE) {
#############################################################################
# EXAMPLE 1: Hip dataset | Imputation using a wide format
#############################################################################
# first, the hip dataset is reshaped from long to wide format for imputation
data(data.allison.hip)
## head(data.allison.hip)
## SID WAVE ADL PAIN SRH WALK CESD
## 1 1 1 3 0 2 1 9.000
## 2 1 2 2 5 4 0 28.000
## 3 1 3 3 0 2 0 31.000
## 4 1 4 3 0 2 0 11.579
## 5 2 1 3 0 4 0 NA
## 6 2 2 1 1 1 0 2.222
# reshape() with idvar/timevar/direction is the base R function from 'stats'
hip.wide <- stats::reshape(data.allison.hip, idvar="SID", timevar="WAVE",
                direction="wide")
## > head(hip.wide, 2)
## SID ADL.1 PAIN.1 SRH.1 WALK.1 CESD.1 ADL.2 PAIN.2 SRH.2 WALK.2 CESD.2 ADL.3
## 1 1 3 0 2 1 9 2 5 4 0 28.000 3
## 5 2 3 0 4 0 NA 1 1 1 0 2.222 2
## PAIN.3 SRH.3 WALK.3 CESD.3 ADL.4 PAIN.4 SRH.4 WALK.4 CESD.4
## 1 0 2 0 31 3 0 2 0 11.579
## 5 5 1 0 12 1 NA 2 0 NA
# imputation of the hip wide dataset
imp <- mice::mice( as.matrix( hip.wide[,-1] ), m=5, maxit=3 )
summary(imp)
}
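The long-to-wide step in Example 1 can be tried without loading the package data. The following is a minimal sketch that uses a small made-up data frame (the object names `long` and `wide` and all values are hypothetical, not taken from the hip dataset) and assumes that base R's `stats::reshape()` is the intended reshaping function.

```r
# Toy long-format data (hypothetical values): 2 subjects, 2 waves each
long <- data.frame(
  SID  = c(1, 1, 2, 2),
  WAVE = c(1, 2, 1, 2),
  ADL  = c(3, 2, 3, 1),
  PAIN = c(0, 5, 0, NA)
)

# Long -> wide: one row per SID, one column per variable-by-wave combination.
# Columns are named <variable>.<wave>, as in the hip.wide output above.
wide <- stats::reshape(long, idvar = "SID", timevar = "WAVE",
                       direction = "wide")

names(wide)
nrow(wide)
```

In the wide layout each subject occupies a single row, so an imputation model can use a subject's measurements from other waves as predictors for a missing wave.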