Imputation of a Continuous or a Binary Variable From a Two-Level Regression Model using lme4 or blme

The function mice.impute.2l.continuous imputes values of continuous variables from a linear mixed effects model fitted with lme4::lmer or blme::blmer. The same model is used for predictive mean matching (mice.impute.2l.pmm), where matching is based on predicted values that contain the fixed effects and the (sampled) random effects. Binary variables can be imputed with mice.impute.2l.binary from a two-level logistic regression model fitted with lme4::glmer or blme::bglmer. See Snijders and Bosker (2012) and Zinn (2013) for details.
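For orientation, the two-level models referred to above can be written in generic textbook notation. This is only a sketch in the spirit of Snijders and Bosker (2012); the symbols (i for level-1 units, j for clusters, x for predictors with fixed effects, z for predictors with random effects, T for the random effects covariance matrix) are notation assumed here and do not come from the package.

```latex
% Continuous variable: linear mixed effects model
y_{ij} = x_{ij}^\top \beta + z_{ij}^\top u_j + e_{ij},
\qquad u_j \sim N(0, T), \qquad e_{ij} \sim N(0, \sigma^2)

% Binary variable: two-level logistic regression model
\operatorname{logit} P(y_{ij} = 1 \mid u_j) = x_{ij}^\top \beta + z_{ij}^\top u_j,
\qquad u_j \sim N(0, T)
```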

mice.impute.2l.continuous(y, ry, x, type, intercept=TRUE,
    groupcenter.slope=FALSE, draw.fixed=TRUE, random.effects.shrinkage=1E-6,
    glmer.warnings=TRUE, blme_use=FALSE, blme_args=NULL, ... )

mice.impute.2l.pmm(y, ry, x, type, intercept=TRUE,
    groupcenter.slope=FALSE, draw.fixed=TRUE, random.effects.shrinkage=1E-6,
    glmer.warnings=TRUE, donors=5, match_sampled_pars=TRUE,
    blme_use=FALSE, blme_args=NULL, ... )

mice.impute.2l.binary(y, ry, x, type, intercept=TRUE,
    groupcenter.slope=FALSE, draw.fixed=TRUE, random.effects.shrinkage=1E-6,
    glmer.warnings=TRUE, blme_use=FALSE, blme_args=NULL, ... )

Arguments

y: Incomplete data vector of length n
ry: Vector of missing data pattern (FALSE -- missing, TRUE -- observed)
x: Matrix (n x p) of complete predictors.
type: Type of predictor variable. The cluster identifier has type -2. Predictors with a fixed effect but no random slope have type 1, and predictors with both a fixed effect and a random effect have type 2. Type 3 should be chosen if the cluster mean of a covariate should be included. Type 4 includes the cluster mean, the fixed effect and the random effect.
intercept: Optional logical indicating whether the intercept should be included.
groupcenter.slope: Optional logical indicating whether covariates should be centered around group means.
draw.fixed: Optional logical indicating whether the fixed effects parameters should be randomly drawn.
random.effects.shrinkage: Shrinkage parameter for stabilizing the covariance matrix of random effects
glmer.warnings: Optional logical indicating whether warnings from glmer should be displayed
blme_use: Logical indicating whether the blme package should be used.
blme_args: (Prior) Arguments for blme, see blme::blmer and blme::bmerDist-class.
donors: Number of donors used for predictive mean matching
match_sampled_pars: Logical indicating whether values of nearest neighbors should also be sampled in pmm imputation.
...: Further arguments to be passed
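As an illustration of the type codes described above, the following base R sketch builds one row of a predictor matrix. The variable names (school, ses, age, math) are hypothetical and chosen only for this example; the codes themselves follow the description of the type argument.

```r
# Hypothetical variables: cluster id 'school', covariates 'ses' and 'age',
# incomplete target 'math'
vars <- c("school", "ses", "age", "math")
predmat <- matrix(0, nrow = length(vars), ncol = length(vars),
                  dimnames = list(vars, vars))

# Row of the target variable: one type code per predictor
predmat["math", "school"] <- -2  # cluster identifier
predmat["math", "ses"]    <- 4   # cluster mean, fixed effect and random effect
predmat["math", "age"]    <- 1   # fixed effect only, no random slope

predmat["math", ]
```

A matrix of this form would be passed to mice::mice via the predictorMatrix argument, as in the Examples section.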

Value

A vector of length nmis=sum(!ry) with imputed values.
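The role of the donors argument and the length of the returned vector can be illustrated with a toy, single-level sketch of predictive mean matching in base R. The helper pmm_sketch is hypothetical and is not the miceadds implementation: it takes the predicted values as given, whereas the package derives them from the fitted two-level model.

```r
# Hypothetical helper, not part of miceadds: toy predictive mean matching.
# yhat holds predicted values for all n cases (supplied directly here).
pmm_sketch <- function(y, ry, yhat, donors = 5) {
    obs_idx <- which(ry)
    vapply(which(!ry), function(i) {
        # distance between this case's prediction and those of observed cases
        d <- abs(yhat[obs_idx] - yhat[i])
        # the 'donors' observed cases with the closest predictions
        pool <- obs_idx[order(d)[seq_len(min(donors, length(obs_idx)))]]
        # sample one donor and borrow its observed value
        y[pool[sample.int(length(pool), 1)]]
    }, numeric(1))
}

set.seed(1)
y    <- c(2.1, NA, 3.4, 1.0, NA, 2.8)
ry   <- !is.na(y)
yhat <- c(2.0, 2.5, 3.1, 1.2, 1.1, 2.9)

imp <- pmm_sketch(y, ry, yhat, donors = 2)
length(imp) == sum(!ry)   # one imputed value per missing entry
```

Every imputed value is an observed value of y, which is why predictive mean matching cannot produce values outside the observed range.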

References

Snijders, T. A. B., & Bosker, R. J. (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling. Thousand Oaks, CA: Sage.

Zinn, S. (2013). An imputation model for multilevel binary data. NEPS Working Paper No 31.

See also

Variables at a higher level (e.g. at level 2) can be imputed using 2lonly functions, for example the mice::mice.impute.2lonly.norm function in the mice package or the general mice.impute.2lonly.function function in the miceadds package, which uses an already defined level-1 imputation method. If a level-2 variable in 3-level data should be imputed, mice.impute.ml.lmer can also be used to impute this variable with a two-level imputation model in which level 1 corresponds to the original level-2 units and level 2 corresponds to the original level-3 units.

See mice::mice.impute.2l.norm and mice::mice.impute.2l.pan for imputation functions in the mice package under fully conditional specification for normally distributed variables. The function mice::mice.impute.2l.norm allows residual variances to vary across groups, whereas mice::mice.impute.2l.pan assumes homogeneous residual variances.

The micemd package provides further imputation methods for the mice package for imputing multilevel data with fully conditional specification. The function micemd::mice.impute.2l.jomo has functionality similar to mice::mice.impute.2l.pan and imputes normally distributed two-level data with a Bayesian MCMC approach, but relies on the jomo package instead of the pan package. The functions mice::mice.impute.2l.lmer and micemd::mice.impute.2l.glm.norm have functionality similar to mice.impute.2l.continuous and impute normally distributed two-level data. The function micemd::mice.impute.2l.glm.bin has functionality similar to mice.impute.2l.binary and imputes binary two-level data.

The hmi package imputes single-level and multilevel data and is also based on fully conditional specification. The package relies on the MCMC estimation implemented in the MCMCglmm package. The imputation procedure can be run with the hmi::hmi function.

See the pan (pan::pan) and jomo (jomo::jomo) packages for joint multilevel imputation. See mitml::panImpute and mitml::jomoImpute for wrapper functions to these packages in the mitml package.

Since mice 3.0.0, imputation by chained equations can also be conducted in blocks of multivariate conditional distributions (see the blocks argument in mice::mice). The mitml::panImpute and mitml::jomoImpute functions can be used with mice::mice by specifying the imputation methods "panImpute" (see mice::mice.impute.panImpute) and "jomoImpute" (see mice::mice.impute.jomoImpute).

Examples

if (FALSE) {
#############################################################################
# EXAMPLE 1: Imputation of a binary variable
#############################################################################

#--- simulate missing values
set.seed(976)
G <- 30        # number of groups
n <- 8        # number of persons per group
iccx <- .2    # intra-class correlation X
iccy <- .3    # latent intra-class correlation binary outcome
bx <- .4    # regression coefficient
threshy <- stats::qnorm(.70)  # threshold for y
x <- rep( rnorm( G, sd=sqrt( iccx) ), each=n )  +
            rnorm(G*n, sd=sqrt( 1 - iccx) )
y <- bx * x + rep( rnorm( G, sd=sqrt( iccy) ), each=n )  +
                rnorm(G*n, sd=sqrt( 1 - iccy) )
y <- 1 * ( y > threshy )
dat <- data.frame( group=100+rep(1:G, each=n), x=x, y=y )

#* create some missings
dat1 <- dat
dat1[ seq( 1, G*n, 3 ),"y" ]  <- NA
dat1[ dat1$group==102, "y" ] <- NA    # set all y values in cluster 102 to missing

#--- prepare imputation in mice
vars <- colnames(dat1)
V <- length(vars)
#* predictor matrix
predmat <- matrix( 0, nrow=V, ncol=V)
rownames(predmat) <- colnames(predmat) <- vars
predmat["y", ] <- c(-2,2,0)
#* imputation methods
impmeth <- rep("",V)
names(impmeth) <- vars
impmeth["y"] <- "2l.binary"

#** imputation with logistic regression ('2l.binary')
imp1 <- mice::mice( data=as.matrix(dat1), method=impmeth,
                predictorMatrix=predmat, maxit=1, m=5 )

#** imputation with predictive mean matching ('2l.pmm')
impmeth["y"] <- "2l.pmm"
imp2 <- mice::mice( data=as.matrix(dat1), method=impmeth,
                predictorMatrix=predmat, maxit=1, m=5 )

#** imputation with logistic regression using blme package
impmeth["y"] <- "2l.binary"    # switch back from '2l.pmm' to the logistic model
blme_args <- list( "cov.prior"="invwishart")
imp3 <- mice::mice( data=as.matrix(dat1), method=impmeth,
                predictorMatrix=predmat, maxit=1, m=5,
                blme_use=TRUE, blme_args=blme_args )
}
