mice.impute.ml.lmer.Rd
This function is a general imputation function based on the linear mixed effects
model as implemented in lme4::lmer. The imputation model can be hierarchical
or non-hierarchical and can be written in the general form
\(\bold{y}=\bold{X} \bold{\beta} + \sum_{v=1}^V \bold{Z}_v \bold{u}_v\) for \(V\)
multivariate random effects. While predictors can be selected by specifying the rows
in the predictor matrix in mice::mice (i.e., modification of type),
the level of random effects can be specified with levels_id, and random slopes
can be selected with random_slopes.
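The general form above can be made concrete with a small base-R simulation. This is only a sketch under stated assumptions: the identifiers id2 and id3, the sample sizes, and all parameter values are hypothetical, and the residual term is the usual linear mixed model residual, added here because the formula above shows only the structural part.

```r
# Sketch of y = X beta + sum_v Z_v u_v with V = 2 random intercepts.
# All names and values are illustrative assumptions, not package internals.
set.seed(1)
n   <- 60
id2 <- factor(rep(1:12, each = 5))    # hypothetical level-2 identifier
id3 <- factor(rep(1:3,  each = 20))   # hypothetical level-3 identifier (id2 nested in id3)
x   <- rnorm(n)

X    <- cbind(intercept = 1, x = x)   # fixed-effects design matrix
beta <- c(0.5, 1.0)                   # fixed effects

Z1 <- model.matrix(~ 0 + id2)         # n x 12 indicator matrix for level 2
Z2 <- model.matrix(~ 0 + id3)         # n x 3 indicator matrix for level 3
u1 <- rnorm(ncol(Z1), sd = 0.7)       # random intercepts at level 2
u2 <- rnorm(ncol(Z2), sd = 0.4)       # random intercepts at level 3

# structural part exactly as in the formula
y_structural <- as.vector(X %*% beta + Z1 %*% u1 + Z2 %*% u2)
# residual noise added on top (assumption of this sketch)
y <- y_structural + rnorm(n, sd = 1)
```

A random slope for x at level 2 would correspond to appending the columns Z1 * x to the level-2 design matrix, with a bivariate random effect per cluster.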
The function mice.impute.ml.lmer allows the imputation of variables at
arbitrary levels. The corresponding level can be specified with levels_id.
All predictor variables are aggregated to the corresponding level of the variable
to be imputed.
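The idea of aggregating predictors to the level of the imputed variable can be illustrated in base R. The data frame and variable names below are made up for illustration, and the sketch does not claim to reproduce how mice.impute.ml.lmer performs the aggregation internally.

```r
# Illustrative data: x2 varies within clusters (level 1),
# y1 is constant within clusters (level 2) and missing for cluster 2.
dat <- data.frame(
  id2 = c(1, 1, 1, 2, 2, 3, 3, 3),
  x2  = c(2, 4, 6, 1, 3, 5, 5, 8),
  y1  = c(10, 10, 10, NA, NA, 30, 30, 30)
)

# Aggregate the level-1 predictor to cluster means
x2_mean <- tapply(dat$x2, dat$id2, mean)
# Take the (constant) cluster value of the level-2 variable
y1_l2 <- tapply(dat$y1, dat$id2, function(v) v[1])

# One row per level-2 unit: the data set on which y1 would be imputed
agg <- data.frame(
  id2     = as.numeric(names(x2_mean)),
  x2_mean = as.vector(x2_mean),
  y1      = as.vector(y1_l2)
)
agg
```

In this sketch the imputation of y1 would operate on three rows (one per cluster), with the cluster mean of x2 as predictor.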
Several strategies for the specification of the design
matrix \(\bold{X}\) are accommodated. By default, predictors at a lower level
are automatically aggregated to the higher level and included as further
predictors to maintain the multilevel structure in the data (Grund, Luedtke & Robitzsch,
2018; Enders, Mistler & Keller, 2016; argument aggregate_automatically=TRUE).
Further, interactions and quadratic effects can be defined by the respective arguments
interactions and quadratics. The dimension of the matrix of predictors can be
reduced by applying partial least squares regression; see mice.impute.pls.
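As a rough illustration of these design matrix options, the following base-R sketch adds a cluster mean, an interaction, and a quadratic term to a small set of predictors. The column names (x2_mean, x2_z2, x2_sq) and the data are hypothetical; the function may name and construct these terms differently.

```r
# Illustrative data: x2 at level 1, z2 constant within clusters
dat <- data.frame(
  id2 = c(1, 1, 1, 2, 2, 3, 3, 3),
  x2  = c(2, 4, 6, 1, 3, 5, 5, 8),
  z2  = c(0, 0, 0, 1, 1, 1, 1, 1)
)

# Aggregated effect: cluster mean of x2, repeated for each row of the cluster
dat$x2_mean <- ave(dat$x2, dat$id2)   # ave() uses mean by default

# Interaction of x2 with z2 and quadratic effect of x2
dat$x2_z2 <- dat$x2 * dat$z2
dat$x2_sq <- dat$x2^2

# Fixed-effects design matrix including the intercept
X <- model.matrix(~ x2 + x2_mean + z2 + x2_z2 + x2_sq, data = dat)
colnames(X)
```

With many predictors, interactions, and quadratic terms, such a matrix grows quickly, which is the situation the partial least squares reduction is meant to address.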
The function currently supports only continuous data (model="continuous"),
ordinal data (model="pmm"), and binary data (model="pmm" or model="binary").
Nominal variables with missing values cannot (yet) be handled.
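To give an intuition for model="pmm" and the donors argument, here is a simplified base-R sketch of predictive mean matching. It is an assumption-laden illustration: an ordinary regression fitted with lm stands in for the mixed model, no parameter draws are made, and the simulated data are arbitrary.

```r
# Simplified predictive mean matching (not the package implementation)
set.seed(42)
n  <- 40
x  <- rnorm(n)
y  <- 1 + 2 * x + rnorm(n)
ry <- rep(TRUE, n)
ry[sample(n, 8)] <- FALSE          # FALSE = missing, TRUE = observed
y[!ry] <- NA

# Fit on observed cases; lm() stands in for the mixed effects model
fit  <- lm(y ~ x, subset = ry)
pred <- predict(fit, newdata = data.frame(x = x))

donors   <- 3
y_obs    <- y[ry]
pred_obs <- pred[ry]

# For each missing case: find the `donors` observed cases with the closest
# predicted values and sample the observed value of one of them
imp <- vapply(pred[!ry], function(p) {
  nearest <- order(abs(pred_obs - p))[seq_len(donors)]
  y_obs[nearest[sample.int(donors, 1)]]
}, numeric(1))
```

Because imputed values are always taken from observed cases, this approach keeps ordinal or binary variables within their observed categories, which is why the same method can serve both data types.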
mice.impute.ml.lmer(y, ry, x, type, levels_id, variables_levels=NULL,
random_slopes=NULL, aggregate_automatically=TRUE, intercept=TRUE,
groupcenter.slope=FALSE, draw.fixed=TRUE, random.effects.shrinkage=1e-06,
glmer.warnings=TRUE, model="continuous", donors=3, match_sampled_pars=FALSE,
blme_use=FALSE, blme_args=NULL, pls.facs=0, interactions=NULL,
quadratics=NULL, min.int.cor=0, min.all.cor=0, pls.print.progress=FALSE,
group_index=NULL, iter_re=0, ...)
y: Incomplete data vector of length n
ry: Vector of missing data pattern (FALSE -- missing, TRUE -- observed)
x: Matrix (n \(\times\) p) of complete predictors.
type: Predictor variables associated with fixed effects.
levels_id: Specification of the level identifiers (see Examples)
variables_levels: Specification of the level of variables (see Examples)
random_slopes: Specification of random slopes (see Examples)
aggregate_automatically: Logical indicating whether aggregated effects at higher levels are automatically included.
intercept: Optional logical indicating whether the intercept should be included.
groupcenter.slope: Optional logical indicating whether covariates should be centered around group means
draw.fixed: Optional logical indicating whether fixed effects parameters should be randomly drawn
random.effects.shrinkage: Shrinkage parameter for stabilizing the covariance matrix of random effects
glmer.warnings: Optional logical indicating whether warnings from glmer should be displayed
model: Type of model. Can be "continuous" for normally distributed data, "binary" for dichotomous data specifying a logistic mixed effects model, and "pmm" for predictive mean matching.
donors: Number of donors used for predictive mean matching
match_sampled_pars: Logical indicating whether values of nearest neighbors should also be sampled in pmm imputation.
blme_use: Logical indicating whether the blme package should be used.
blme_args: (Prior) Arguments for blme, see blme::blmer and blme::bmerDist-class.
pls.facs: Number of factors used in PLS dimension reduction
interactions: Specification of predictors with interaction effects
quadratics: Specification of predictors with quadratic effects
min.int.cor: Minimum absolute value of correlation with outcome for interaction effects to be retained
min.all.cor: Minimum absolute value of correlation with outcome for predictors to be retained
pls.print.progress: Logical indicating whether progress of algorithm should be displayed
group_index: Optional vector for group identifiers (internally used in mice.impute.bygroup)
iter_re: Number of iterations for sampling random effects in random intercept models for continuous outcomes. Using iter_re>0 is necessary for cross-classified models with not fully balanced designs.
...: Further arguments to be passed
Vector of imputed values
Enders, C. K., Mistler, S. A., & Keller, B. T. (2016). Multilevel multiple imputation: A review and evaluation of joint modeling and chained equations imputation. Psychological Methods, 21(2), 222-240. doi:10.1037/met0000063
Grund, S., Luedtke, O., & Robitzsch, A. (2018). Multiple imputation of multilevel data in organizational research. Organizational Research Methods, 21(1), 111-149. doi:10.1177/1094428117703686
See mice.impute.2l.continuous for two-level imputation in mice and
for several links to other packages which enable multilevel imputation.
if (FALSE) {
#############################################################################
# EXAMPLE 1: Imputation of three-level data with normally distributed residuals
#############################################################################
data(data.ma07, package="miceadds")
dat <- data.ma07
# variables at level 1 (identifier id1): x1 (some missings), x2 (complete)
# variables at level 2 (identifier id2): y1 (some missings), y2 (complete)
# variables at level 3 (identifier id3): z1 (some missings), z2 (complete)
#****************************************************************************
# Imputation model 1
#----- specify levels of variables (only relevant for variables
# with missing values)
variables_levels <- miceadds:::mice_imputation_create_type_vector( colnames(dat), value="")
# leave variables at lowest level blank (i.e., "")
variables_levels[ c("y1","y2") ] <- "id2"
variables_levels[ c("z1","z2") ] <- "id3"
#----- specify predictor matrix
predmat <- mice::make.predictorMatrix(data=dat)
predmat[, c("id2", "id3") ] <- 0
# set -2 for cluster identifier for level 3 variable z1
# because "2lonly" function is used
predmat[ "z1", "id3" ] <- -2
#----- specify imputation methods
method <- mice::make.method(data=dat)
method[c("x1","y1")] <- "ml.lmer"
method[c("z1")] <- "2lonly.norm"
#----- specify hierarchical structure of imputation models
levels_id <- list()
#** hierarchical structure for variable x1
levels_id[["x1"]] <- c("id2", "id3")
#** hierarchical structure for variable y1
levels_id[["y1"]] <- c("id3")
#----- specify random slopes
random_slopes <- list()
#** random slopes for variable x1
random_slopes[["x1"]] <- list( "id2"=c("x2"), "id3"=c("y1") )
# if no random slopes should be specified, the corresponding entry can be left empty
# and only a random intercept is used in the imputation model
#----- imputation in mice
imp1 <- mice::mice( dat, maxit=10, m=5, method=method,
            predictorMatrix=predmat, levels_id=levels_id, random_slopes=random_slopes,
            variables_levels=variables_levels )
summary(imp1)
#****************************************************************************
# Imputation model 2
#----- impute x1 with predictive mean matching and y1 with normally distributed residuals
model <- list(x1="pmm", y1="continuous")
#----- assume only random intercepts
random_slopes <- NULL
#---- create interactions with z2 for all predictors in imputation models for x1 and y1
interactions <- list("x1"="z2", "y1"="z2")
#----- imputation in mice
imp2 <- mice::mice( dat, method=method, predictorMatrix=predmat,
            levels_id=levels_id, random_slopes=random_slopes,
            variables_levels=variables_levels, model=model, interactions=interactions)
summary(imp2)
}