Skip to contents

The function categorize defines categories for variables in a data frame, starting with a user-defined index (e.g. 0 or 1). Continuous variables can be categorized by defining categories by discretizing the variables in different quantile groups.

The function decategorize does the reverse operation.

Usage

categorize(dat, categorical=NULL, quant=NULL, lowest=0)

decategorize(dat, categ_design=NULL)

Arguments

dat

Data frame

categorical

Vector with variable names which should be converted into categories, beginning with integer lowest

quant

Vector with number of classes for each variables. Variables are categorized among quantiles. The vector must have names containing variable names.

lowest

Lowest category index. Default is 0.

categ_design

Data frame containing informations about categorization which is the output of categorize.

Value

For categorize, it is a list with entries

data

Converted data frame

categ_design

Data frame containing some informations about categorization

For decategorize it is a data frame.

Examples

if (FALSE) {
library(mice)
library(miceadds)

#############################################################################
# EXAMPLE 1: Categorize questionnaire data
#############################################################################

data(data.smallscale, package="miceadds")
dat <- data.smallscale

# (0) select dataset
dat <- dat[, 9:20 ]
summary(dat)
categorical <- colnames(dat)[2:6]

# (1) categorize data
res <- sirt::categorize( dat, categorical=categorical )

# (2) multiple imputation using the mice package
dat2 <- res$data
VV <- ncol(dat2)
impMethod <- rep( "sample", VV )    # define random sampling imputation method
names(impMethod) <- colnames(dat2)
imp <- mice::mice( as.matrix(dat2), impMethod=impMethod, maxit=1, m=1 )
dat3 <- mice::complete(imp,action=1)

# (3) decategorize dataset
dat3a <- sirt::decategorize( dat3, categ_design=res$categ_design )

#############################################################################
# EXAMPLE 2: Categorize ordinal and continuous data
#############################################################################

data(data.ma01,package="miceadds")
dat <- data.ma01
summary(dat[,-c(1:2)] )

# define variables to be categorized
categorical <- c("books", "paredu" )
# define quantiles
quant <-  c(6,5,11)
names(quant) <- c("math", "read", "hisei")

# categorize data
res <- sirt::categorize( dat, categorical=categorical, quant=quant)
str(res)
}