Imputes a categorical variable using multivariate predictive mean matching.

mice.impute.catpmm(y, ry, x, donors=5, ridge=10^(-5), ...)

Arguments

y

Incomplete data vector of length n

ry

Vector of missing data pattern (FALSE -- missing, TRUE -- observed)

x

Matrix (n x p) of complete covariates.

donors

Number of donors used for random sampling of nearest neighbors in imputation

ridge

Numerical constant used to avoid collinearity issues. Noise is added to the covariates.

...

Further arguments to be passed

Details

The categorical outcome variable is recoded as a vector of dummy variables. A multivariate linear regression is specified for computing predicted values. The L1 distance (i.e., the sum of absolute deviations) between predicted values is used for predictive mean matching. Meinfelder (2009) proposed predictive mean matching for categorical variables based on a multinomial regression rather than the ordinary linear regression used here.
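
The steps described above can be sketched in a few lines of base R. This is only an illustration under simplifying assumptions, not the source code of the function: the helper name catpmm_sketch is hypothetical, the ridge noise and any parameter draws are omitted, and an intercept column is added to the covariates by assumption.

```r
# Illustrative sketch only; NOT the miceadds implementation.
# 'catpmm_sketch' is a hypothetical helper name.
catpmm_sketch <- function(y, ry, x, donors = 5) {
  y <- as.factor(y)
  x <- cbind(1, as.matrix(x))            # assumption: add an intercept column
  y_obs <- y[ry]
  lev <- levels(droplevels(y_obs))
  # Dummy coding: one 0/1 indicator column per observed category
  Y <- 1 * outer(as.character(y_obs), lev, "==")
  # Multivariate linear regression: least-squares fit for all dummy columns
  B <- qr.coef(qr(x[ry, , drop = FALSE]), Y)
  B[is.na(B)] <- 0                       # aliased (collinear) coefficients set to 0
  pred_obs <- x[ry, , drop = FALSE] %*% B
  pred_mis <- x[!ry, , drop = FALSE] %*% B
  k <- min(donors, sum(ry))
  imputed <- vapply(seq_len(nrow(pred_mis)), function(i) {
    # L1 distance between the predicted vector of case i and all observed cases
    d <- rowSums(abs(sweep(pred_obs, 2, pred_mis[i, ])))
    pool <- order(d)[seq_len(k)]         # the k nearest observed cases
    as.character(y_obs[pool[sample.int(k, 1)]])  # draw one donor at random
  }, character(1))
  factor(imputed, levels = levels(y))
}

# Usage on simulated data
set.seed(1)
n <- 60
x <- matrix(rnorm(n * 2), n, 2)
y <- factor(ifelse(x[, 1] > 0, "a", "b"))
ry <- rep(TRUE, n)
ry[1:10] <- FALSE
y[!ry] <- NA
imp_sketch <- catpmm_sketch(y, ry, x, donors = 5)
```

Because each imputed value is copied from an observed donor, only categories that occur in the observed data can be imputed.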

Value

A vector of length nmis=sum(!ry) with imputed values.

References

Meinfelder, F. (2009). Analysis of Incomplete Survey Data - Multiple Imputation via Bayesian Bootstrap Predictive Mean Matching. Dissertation thesis. University of Bamberg, Germany. https://fis.uni-bamberg.de/handle/uniba/213

Examples

if (FALSE) {
#############################################################################
# EXAMPLE 1: Imputation internet data
#############################################################################

data(data.internet, package="miceadds")
dat <- data.internet

#** empty imputation
imp0 <- mice::mice(dat, m=1, maxit=0)
method <- imp0$method
predmat <- imp0$predictorMatrix

#** define factor variable

dat1 <- dat
dat1[,1] <- as.factor(dat1[,1])
method[1] <- "catpmm"

#** impute with 'catpmm'
imp <- mice::mice(dat1, method=method, m=5)
summary(imp)
}