tam.np.Rd
Conducts non- and semiparametric estimation of a unidimensional item response model for a single group, allowing polytomous item responses (Rossi, Wang & Ramsay, 2002).
For dichotomous data, the function also allows a group lasso penalty
(penalty_type="lasso"; Breheny & Huang, 2015; Yang & Zou, 2015) and a ridge penalty
(penalty_type="ridge"; Rossi et al., 2002),
which is applied to the nonlinear part of the basis expansion. This approach
automatically detects deviations from a 2PL or a 1PL model (see Examples 2 and 3).
See Details for the model specification.
tam.np(dat, probs_init=NULL, pweights=NULL, lambda=NULL, control=list(),
    model="2PL", n_basis=0, basis_type="hermite", penalty_type="lasso",
    pars_init=NULL, orthonormalize=TRUE)

# S3 method for class 'tam.np'
summary(object, file=NULL, ...)

# S3 method for class 'tam.np'
IRT.cv(object, kfold=10, ...)
dat | Matrix of integer item responses (starting from zero) |
---|---|
probs_init | Array containing initial item response probabilities |
pweights | Optional vector of person weights |
lambda | Numeric value or vector of regularization parameters |
control | List of control arguments; see tam.mml |
model | Specified target model; can be "2PL" (default) or "1PL" |
n_basis | Number of basis functions |
basis_type | Type of basis functions ("hermite") |
penalty_type | Type of penalty: group lasso ("lasso") or ridge ("ridge") |
pars_init | Optional matrix of initial item parameters |
orthonormalize | Logical indicating whether basis functions should be orthonormalized |
object | Object of class tam.np |
file | Optional file name for summary output |
kfold | Number of folds in \(k\)-fold cross-validation |
... | Further arguments to be passed |
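As an illustration of the effect of the orthonormalize argument (this is a sketch, not TAM's internal code), a raw polynomial basis evaluated at a grid of quadrature nodes can be orthonormalized with a QR decomposition, which yields columns with unit norm and zero pairwise inner products:

```r
# Sketch: orthonormalize a raw polynomial basis f_h(theta) over a node grid
theta <- seq(-4, 4, len=21)                    # hypothetical quadrature nodes
B <- cbind(theta, theta^2, theta^3, theta^4)   # raw (non-orthogonal) basis
Q <- qr.Q(qr(B))                               # orthonormal basis columns
round(crossprod(Q), 3)                         # t(Q) %*% Q is the identity matrix
```

Orthonormalized basis functions keep the penalized coefficients \(\beta_{ih}\) on a comparable scale, which is why orthonormalization is useful before applying a group penalty.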
The basis expansion approach is applied to the logit transformation of the item
response functions for dichotomous data. In more detail, it is assumed that
$$P(X_i=1|\theta)=\psi( H_0(\theta) + H_1(\theta) )$$
where \(H_0\) is the target function type and \(H_1\) is the semiparametric
part which parameterizes model deviations. For the 2PL model (model="2PL")
it is \(H_0(\theta)=d_i + a_i \theta \), and for the 1PL model
(model="1PL") we set \(H_0(\theta)=d_i + 1 \cdot \theta \).
The model discrepancy is specified as a basis expansion
$$H_1 ( \theta )=\sum_{h=1}^p \beta_{ih} f_h( \theta)$$ where \(f_h\) are
basis functions (possibly orthonormalized) and \(\beta_{ih}\) are
item parameters to be estimated. Penalty functions are imposed on the
\(\beta_{ih}\) coefficients. For the group lasso penalty, we specify the
penalty \(J_{i,L1}=N \lambda \sqrt{p} \sqrt{ \sum_{h=1}^p \beta_{ih}^2 }\), while for
the ridge penalty it is \(J_{i,L2}=N \lambda \sum_{h=1}^p \beta_{ih}^2 \)
(\(N\) denoting the sample size).
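The two penalty terms can be computed directly from the formulas above. The following sketch uses hypothetical values for \(N\), \(\lambda\), and the coefficients \(\beta_{ih}\) of a single item:

```r
# Sketch (hypothetical numbers): group lasso and ridge penalty terms for
# one item i with p basis coefficients beta_ih
N <- 1000                               # sample size
lambda <- 0.05                          # regularization parameter
beta <- c(0.30, -0.10, 0.05, 0.02)      # beta_ih, h = 1, ..., p
p <- length(beta)

J_L1 <- N * lambda * sqrt(p) * sqrt(sum(beta^2))   # group lasso penalty
J_L2 <- N * lambda * sum(beta^2)                   # ridge penalty
```

Because the group lasso penalizes the Euclidean norm of the whole coefficient vector, it can shrink all \(\beta_{ih}\) of an item to exactly zero, which is what drives the automatic detection of items that conform to the target model.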
A list containing several entries:

- Item response probabilities
- Used nodes for approximation of the \(\theta\) distribution
- Expected counts
- Individual likelihood
- Individual posterior
- Summary item parameter table
- Estimated parameters
- Logical vector indicating which items are regularized
- List containing ...
- Further values
Breheny, P., & Huang, J. (2015). Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors. Statistics and Computing, 25(2), 173-187. doi: 10.1007/s11222-013-9424-2
Rossi, N., Wang, X., & Ramsay, J. O. (2002). Nonparametric item response function estimates with the EM algorithm. Journal of Educational and Behavioral Statistics, 27(3), 291-317. doi: 10.3102/10769986027003291
Yang, Y., & Zou, H. (2015). A fast unified algorithm for solving group-lasso penalized learning problems. Statistics and Computing, 25(6), 1129-1141. doi: 10.1007/s11222-014-9498-5
Nonparametric item response models can also be estimated with the
mirt::itemGAM function in the mirt package and the
KernSmoothIRT::ksIRT function in the KernSmoothIRT package.
See tam.mml and tam.mml.2pl for parametric item response models.
if (FALSE) {
#############################################################################
# EXAMPLE 1: Nonparametric estimation polytomous data
#############################################################################

data(data.cqc02, package="TAM")
dat <- data.cqc02

#** nonparametric estimation
mod <- TAM::tam.np(dat)

#** extractor functions for objects of class 'tam.np'
lmod <- IRT.likelihood(mod)
pmod <- IRT.posterior(mod)
rmod <- IRT.irfprob(mod)
emod <- IRT.expectedCounts(mod)

#############################################################################
# EXAMPLE 2: Semiparametric estimation and detection of item misfit
#############################################################################

#- simulate data with two misfitting items
set.seed(998)
I <- 10
N <- 1000
a <- stats::rnorm(I, mean=1, sd=.3)
b <- stats::rnorm(I, mean=0, sd=1)
dat <- matrix(NA, nrow=N, ncol=I)
colnames(dat) <- paste0("I",1:I)
theta <- stats::rnorm(N)
for (ii in 1:I){
    dat[,ii] <- 1*(stats::runif(N) < stats::plogis( a[ii]*(theta-b[ii]) ))
}

#* first misfitting item with lower and upper asymptote
ii <- 1
l <- .3
u <- 1
b[ii] <- 1.5
dat[,ii] <- 1*(stats::runif(N) < l + (u-l)*stats::plogis( a[ii]*(theta-b[ii]) ))

#* second misfitting item with non-monotonic item response function
ii <- 3
dat[,ii] <- (stats::runif(N) < stats::plogis( theta-b[ii]+.6*theta^2 ))

#- 2PL model
mod0 <- TAM::tam.mml.2pl(dat)

#- lasso penalty with lambda of .05
mod1 <- TAM::tam.np(dat, n_basis=4, lambda=.05)

#- lambda value of .03 using starting values of previous model
mod2 <- TAM::tam.np(dat, n_basis=4, lambda=.03, pars_init=mod1$pars)
cmod2 <- TAM::IRT.cv(mod2)   # cross-validated deviance

#- lambda=.015
mod3 <- TAM::tam.np(dat, n_basis=4, lambda=.015, pars_init=mod2$pars)
cmod3 <- TAM::IRT.cv(mod3)

#- lambda=.007
mod4 <- TAM::tam.np(dat, n_basis=4, lambda=.007, pars_init=mod3$pars)

#- lambda=.001
mod5 <- TAM::tam.np(dat, n_basis=4, lambda=.001, pars_init=mod4$pars)

#- final estimation using solution of mod3
eps <- .0001
lambda_final <- eps+(1-eps)*mod3$regularized   # lambda parameter for final estimate
mod3b <- TAM::tam.np(dat, n_basis=4, lambda=lambda_final, pars_init=mod3$pars)

summary(mod1)
summary(mod2)
summary(mod3)
summary(mod3b)
summary(mod4)

# compare models with respect to information criteria
IRT.compareModels(mod0, mod1, mod2, mod3, mod3b, mod4, mod5)

#-- compute item fit statistic RISE
# regularized solution
TAM::IRT.RISE(mod_p=mod1, mod_np=mod3)
# regularized solution, final estimation
TAM::IRT.RISE(mod_p=mod1, mod_np=mod3b, use_probs=TRUE)
TAM::IRT.RISE(mod_p=mod1, mod_np=mod3b, use_probs=FALSE)
# use TAM::IRT.RISE() function for computing the RMSD statistic
TAM::IRT.RISE(mod_p=mod1, mod_np=mod1, use_probs=FALSE)

#############################################################################
# EXAMPLE 3: Mixed 1PL/2PL model
#############################################################################

#* simulate data with 2 2PL items and 8 1PL items
set.seed(9877)
N <- 2000
I <- 10
b <- seq(-1,1,len=I)
a <- rep(1,I)
a[c(3,8)] <- c(.5, 2)
theta <- stats::rnorm(N, sd=1)
dat <- sirt::sim.raschtype(theta, b=b, fixed.a=a)

#- 1PL model
mod1 <- TAM::tam.mml(dat)

#- 2PL model
mod2 <- TAM::tam.mml.2pl(dat)

#- 2PL model with penalty on slopes
mod3 <- TAM::tam.np(dat, lambda=.04, model="1PL", n_basis=0)
summary(mod3)

#- final mixed 1PL/2PL model
lambda <- 1*mod3$regularized
mod4 <- TAM::tam.np(dat, lambda=lambda, model="1PL", n_basis=0)
summary(mod4)

IRT.compareModels(mod1, mod2, mod3, mod4)
}