Skip to contents

This function employs marginal maximum likelihood estimation of item response models for dichotomous data. First, the Rasch type model (generalized item response model) can be estimated. The generalized logistic link function (Stukel, 1988) can be estimated or fixed for conducting IRT with different link functions than the logistic one. The Four-Parameter logistic item response model is a special case of this model (Loken & Rulison, 2010). Second, Ramsay's quotient model (Ramsay, 1989) can be estimated by specifying irtmodel="ramsay.qm". Third, quite general item response functions can be estimated in a nonparametric framework (Rossi, Wang & Ramsay, 2002). Fourth, pseudo-likelihood estimation for fractional item responses can be conducted for Rasch type models. Fifth, a simple two-dimensional missing data item response model (irtmodel='missing1'; Mislevy & Wu, 1996) can be estimated.

See Details for more explanations.

Usage

rasch.mml2( dat, theta.k=seq(-6,6,len=21), group=NULL, weights=NULL,
   constraints=NULL, glob.conv=10^(-5), parm.conv=10^(-4), mitermax=4,
   mmliter=1000, progress=TRUE,  fixed.a=rep(1,ncol(dat)),
   fixed.c=rep(0,ncol(dat)), fixed.d=rep(1,ncol(dat)),
   fixed.K=rep(3,ncol(dat)), b.init=NULL, est.a=NULL, est.b=NULL,
   est.c=NULL, est.d=NULL, min.b=-99, max.b=99, min.a=-99, max.a=99,
   min.c=0, max.c=1, min.d=0, max.d=1, prior.b=NULL, prior.a=NULL, prior.c=NULL,
   prior.d=NULL, est.K=NULL, min.K=1, max.K=20, min.delta=-20, max.delta=20,
   beta.init=NULL, min.beta=-8, pid=1:(nrow(dat)), trait.weights=NULL,  center.trait=TRUE,
   center.b=FALSE, alpha1=0, alpha2=0,est.alpha=FALSE, equal.alpha=FALSE,
   designmatrix=NULL, alpha.conv=parm.conv, numdiff.parm=0.00001,
   numdiff.alpha.parm=numdiff.parm, distribution.trait="normal", Qmatrix=NULL,
   variance.fixed=NULL, variance.init=NULL,
   mu.fixed=cbind(seq(1,ncol(Qmatrix)),rep(0,ncol(Qmatrix))),
   irtmodel="raschtype", npformula=NULL, npirt.monotone=TRUE,
   use.freqpatt=is.null(group), delta.miss=0, est.delta=rep(NA,ncol(dat)),
   nimps=0, ... )

# S3 method for rasch.mml
summary(object, file=NULL, ...)

# S3 method for rasch.mml
plot(x, items=NULL, xlim=NULL, main=NULL, ...)

# S3 method for rasch.mml
anova(object,...)

# S3 method for rasch.mml
logLik(object,...)

# S3 method for rasch.mml
IRT.irfprob(object,...)

# S3 method for rasch.mml
IRT.likelihood(object,...)

# S3 method for rasch.mml
IRT.posterior(object,...)

# S3 method for rasch.mml
IRT.modelfit(object,...)

# S3 method for rasch.mml
IRT.expectedCounts(object,...)

# S3 method for IRT.modelfit.rasch.mml
summary(object,...)

Arguments

dat

An \(N \times I\) data frame of dichotomous item responses.
For the missing data item response model (irtmodel='missing1'), code item responses by 9 which should be treated by the missing data model. Other missing responses can be coded by NA.

theta.k

Optional vector of discretized theta values. For multidimensional IRT models with \(D\) dimensions, it is a matrix with \(D\) columns.

group

Vector of integers with group identifiers in multiple group estimation. The multiple group does not work for irtmodel="missing1".

weights

Optional vector of person weights (sample weights).

constraints

Constraints on b parameters (item difficulties). It must be a matrix with two columns: the first column contains item names, the second column fixed parameter values.

glob.conv

Convergence criterion for deviance

parm.conv

Convergence criterion for item parameters

mitermax

Maximum number of iterations in M step. This argument does only apply for the estimation of the \(b\) parameters.

mmliter

Maximum number of iterations

progress

Should progress be displayed at the console?

fixed.a

Fixed or initial \(a\) parameters

fixed.c

Fixed or initial \(c\) parameters

fixed.d

Fixed or initial \(d\) parameters

fixed.K

Fixed or initial \(K\) parameters in Ramsay's quotient model.

b.init

Initial \(b\) parameters

est.a

Vector of integers which indicate which \(a\) parameters should be estimated. Equal integers correspond to the same estimated parameters.

est.b

Vector of integers which indicate which \(b\) parameters should be estimated. Equal integers correspond to the same estimated parameters.

est.c

Vector of integers which indicate which \(c\) parameters should be estimated. Equal integers correspond to the same estimated parameters.

est.d

Vector of integers which indicate which \(d\) parameters should be estimated. Equal integers correspond to the same estimated parameters.

min.b

Minimal \(b\) parameter to be estimated

max.b

Maximal \(b\) parameter to be estimated

min.a

Minimal \(a\) parameter to be estimated

max.a

Maximal \(a\) parameter to be estimated

min.c

Minimal \(c\) parameter to be estimated

max.c

Maximal \(c\) parameter to be estimated

min.d

Minimal \(d\) parameter to be estimated

max.d

Maximal \(d\) parameter to be estimated

prior.b

Optional prior distribution for \(b\) parameters: \(N(\mu, \sigma)\). Input is a vector of length two with parameters \(\mu\) and \(\sigma\).

prior.a

Optional prior distribution for \(a\) parameters: \(N(\mu, \sigma)\). Input is a vector of length two with parameters \(\mu\) and \(\sigma\).

prior.c

Optional prior distribution for \(c\) parameters: \(Beta(a, b)\). Input is a vector of length two with parameters \(a\) and \(b\).

prior.d

Optional prior distribution for \(d\) parameters: \(Beta(a, b)\). Input is a vector of length two with parameters \(a\) and \(b\).

est.K

Vector of integers which indicate which \(K\) parameters should be estimated. Equal integers correspond to the same estimated parameters.

min.K

Minimal \(K\) parameter to be estimated

max.K

Maximal \(K\) parameter to be estimated

min.delta

Minimal \(delta.miss\) parameter to be estimated

max.delta

Maximal \(delta.miss\) parameter to be estimated

beta.init

Optional vector of initial \(\beta\) parameters

min.beta

Minimum \(\beta\) parameter to be estimated.

pid

Optional vector of person identifiers

trait.weights

Optional vector of trait weights for a fixing the trait distribution.

center.trait

Should the trait distribution be centered

center.b

An optional logical indicating whether \(b\) parameters should be centered at each dimension

alpha1

Fixed or initial \(\alpha_1\) parameter

alpha2

Fixed or initial \(\alpha_2\) parameter

est.alpha

Should \(\alpha\) parameters be estimated?

equal.alpha

Estimate \(\alpha\) parameters under the assumption \(\alpha_1=\alpha_2\)?

designmatrix

Design matrix for item difficulties \(b\) to estimate linear logistic test models

alpha.conv

Convergence criterion for \(\alpha\) parameter

numdiff.parm

Parameter for numerical differentiation

numdiff.alpha.parm

Parameter for numerical differentiation for \(\alpha\) parameter

distribution.trait

Assumed trait distribution. The default is the normal distribution ("normal"). Log-linear smoothing of the trait distribution is also possible ("smooth2", "smooth3" or "smooth4" for smoothing up to 2, 3 or 4 moments, respectively).

Qmatrix

The Q-matrix

variance.fixed

Matrix for fixing covariance matrix (See Examples)

variance.init

Optional initial covariance matrix

mu.fixed

Matrix for fixing mean vector (See Examples)

irtmodel

Specify estimable IRT models: raschtype (Rasch type model), ramsay.qm (Ramsay's quotient model), npirt (Nonparametric item response model). If npirt is used as the argument for irtmodel, the argument npformula specifies different item response functions in the R formula framework (like "y~I(theta^2)"; see Examples). For estimating the missing data item response model, use irtmodel='missing1'.

npformula

A string or a vector which contains R formula objects for specifying the item response function. For example, "y~theta" is the specification of the 2PL model (see Details). If irtmodel="npirt" and npformula is not specified, then an unrestricted item response functions on the grid of \(\theta\) values is estimated.

npirt.monotone

Should nonparametrically estimated item response functions be monotone? The default is TRUE. This function applies only to irtmodel='npirt' and npformula=NULL.

use.freqpatt

A logical if frequencies of pattern should be used or not. The default is is.null(group). This means that for single group analyses, frequency patterns are used but not for multiple groups. If data processing times are large, then use.freqpatt=FALSE is recommended.

delta.miss

Missingness parameter \(\delta\) quantifying the meaning of responding to an item between the two extremes of ignoring missing responses and setting all missing responses to incorrect

est.delta

Vector with indices indicating the \(\delta\) parameters to be estimated if irtmodel="missing1".

nimps

Number of imputed datasets of item responses

object

Object of class rasch.mml

x

Object of class rasch.mml

items

Vector of integer or item names which should be plotted

xlim

Specification for xlim in plot

main

Title of the plot

file

Optional file name for summary output

...

Further arguments to be passed

Details

The item response function of the generalized item response model (irtmodel="raschtype"; Stukel, 1988) can be written as $$P( X_{pi}=1 | \theta_{pd} )=c_i + (d_i - c_i ) g_{\alpha_1, \alpha_2} [ a_i ( \theta_{pd} - b_i ) ] $$ where \(g\) is the generalized logistic link function depending on parameters \(\alpha_1\) and \(\alpha_2\).

For the most important link functions the specifications are (Stukel, 1988):

logistic link function: \(\alpha_1=0\) and \(\alpha_2=0\)
probit link function: \(\alpha_1=0.165\) and \(\alpha_2=0.165\)
loglog link function: \(\alpha_1=-0.037\) and \(\alpha_2=0.62\)
cloglog link function: \(\alpha_1=0.62\) and \(\alpha_2=-0.037\)

See pgenlogis for exact transformation formulas of the mentioned link functions.

A \(D\)-dimensional model can also be specified but only allows for between item dimensionality (one item loads on one and only dimension). Setting \(c_i=0\), \(d_i=1\) and \(a_i=1\) for all items \(i\), an additive item response model $$P( X_{pi}=1 | \theta_p )=g_{\alpha_1, \alpha_2} ( \theta_p - b_i ) $$ is estimated.

Ramsay's quotient model (irtmodel="qm.ramsay") uses the item response function $$P( X_{pi}=1 | \theta_p )=\frac{ \exp(\theta_p / b_i)} { K_i + \exp (\theta_p / b_i )} $$

Quite general unidimensional item response models can be estimated in a nonparametric framework (irtmodel="npirt"). The response functions are a linear combination of transformed \(\theta\) values $$logit[ P( X_{pi}=1 | \theta_p ) ]=Y_\theta \beta $$ Where \(Y_\theta\) is a design matrix of \(\theta\) and \(\beta\) are item parameters to be estimated. The formula \(Y_\theta \beta\) can be specified in the R formula framework (see Example 3, Model 3c).

Pseudo-likelihood estimation can be conducted for fractional item response data as input (i.e. some item response \(x_{pi}\) do have values between 0 and 1). Then the pseudo-likelihood \(L_p\) for person \(p\) is defined as $$ L_p=\prod_i P_i ( \theta_p )^{x_{pi}} [1-P_i ( \theta_p )]^{(1-x_{pi})}$$ Note that for dichotomous responses this term corresponds to the ordinary likelihood. See Example 7.

A special two-dimensional missing data item response model (irtmodel="missing1") is implemented according to Mislevy and Wu (1996). Besides an unidimensional ability \(\theta_p\), an individual response propensity \(\xi_p\) is proposed. We define item responses \(X_{pi}\) and response indicators \(R_{pi}\) indicating whether item responses \(X_{pi}\) are observed or not. Denoting the logistic function by \(\Psi\), the item response model for ability is defined as $$ P( X_{pi}=1 | \theta_p, \xi_p )=P( X_{pi}=1 | \theta_p ) =\Psi( a_i (\theta_p - b_i ))$$ We also define a measurement model for response indicators \(R_{pi}\) which depends on the item response \(X_{pi}\) itself: $$P( R_{pi}=1 | X_{pi}=k, \theta_p, \xi_p )= P( R_{pi}=1 | X_{pi}=k, \xi_p )= \Psi \left[ \xi_p - \beta_i - k \delta _i \right] \quad \mbox{ for } \quad k=0,1$$ If \(\delta _i=0\), then the probability of responding to an item is independent of the incompletely observed item \(X_{pi}\) which is an item response model with nonignorable missings (Holman & Glas, 2005; see also Pohl, Graefe & Rose, 2014). If \(\delta _i\) is a large negative number (e.g. \(\delta=-100\)), then it follows \(P( R_{pi}=1 | X_{pi}=1, \theta_p, \xi_p )=1\) and as a consequence it holds that \(P(X_{pi}=1 | R_{pi}=0, \theta_p, \xi_p)=0\), which is equivalent to treating all missing item responses as incorrect. The missingness parameter \(\delta\) can be specified by the user and studied as a sensitivity analysis under different missing not at random assumptions or can be estimated by choosing est.delta=TRUE.

Value

A list with following entries

dat

Original data frame

item

Estimated item parameters in the generalized item response model

item2

Estimated item parameters for Ramsay's quotient model

trait.distr

Discretized ability distribution points and probabilities

mean.trait

Estimated mean vector

sd.trait

Estimated standard deviations

skewness.trait

Estimated skewnesses

deviance

Deviance

pjk

Estimated probabilities of item correct evaluated at theta.k

rprobs

Item response probabilities like in pjk, but slightly extended to accommodate all categories

person

Person parameter estimates: mode (MAP) and mean (EAP) of the posterior distribution

pid

Person identifier

ability.est.pattern

Response pattern estimates

f.qk.yi

Individual posterior distribution

f.yi.qk

Individual likelihood

fixed.a

Estimated \(a\) parameters

fixed.c

Estimated \(c\) parameters

G

Number of groups

alpha1

Estimated \(\alpha_1\) parameter in generalized logistic item response model

alpha2

Estimated \(\alpha_2\) parameter in generalized logistic item response model

se.b

Standard error of \(b\) parameter in generalized logistic model or Ramsay's quotient model

se.a

Standard error of \(a\) parameter in generalized logistic model

se.c

Standard error of \(c\) parameter in generalized logistic model

se.d

Standard error of \(d\) parameter in generalized logistic model

se.alpha

Standard error of \(\alpha\) parameter in generalized logistic model

se.K

Standard error of \(K\) parameter in Ramsay's quotient model

iter

Number of iterations

reliability

EAP reliability

irtmodel

Type of estimated item response model

D

Number of dimensions

mu

Mean vector (for multidimensional models)

Sigma.cov

Covariance matrix (for multdimensional models)

theta.k

Grid of discretized ability distributions

trait.weights

Fixed vector of probabilities for the ability distribution

pi.k

Trait distribution

ic

Information criteria

esttype

Estimation type: ll (Log-Likelihood), pseudoll (Pseudo-Log-Likelihood)

...

References

Holman, R., & Glas, C. A. (2005). Modelling non-ignorable missing-data mechanisms with item response theory models. British Journal of Mathematical and Statistical Psychology, 58(1), 1-17. doi:10.1348/000711005X47168

Loken, E., & Rulison, K. L. (2010). Estimation of a four-parameter item response theory model. British Journal of Mathematical and Statistical Psychology, 63(3), 509-525. doi:10.1348/000711009X474502

Mislevy, R. J., & Wu, P. K. (1996). Missing responses and IRT ability estimation: Omits, choice, time Limits, and adaptive testing. ETS Research Report ETS RR-96-30. Princeton, ETS. doi:10.1002/j.2333-8504.1996.tb01708.x

Pohl, S., Graefe, L., & Rose, N. (2014). Dealing with omitted and not-reached items in competence tests evaluating approaches accounting for missing responses in item response theory models. Educational and Psychological Measurement, 74(3), 423-452. doi:10.1177/0013164413504926

Ramsay, J. O. (1989). A comparison of three simple test theory models. Psychometrika, 54, 487-499. doi:10.1007/BF02294631

Rossi, N., Wang, X., & Ramsay, J. O. (2002). Nonparametric item response function estimates with the EM algorithm. Journal of Educational and Behavioral Statistics, 27(3), 291-317. doi:10.3102/10769986027003291

Stukel, T. A. (1988). Generalized logistic models. Journal of the American Statistical Association, 83(402), 426-431. doi:10.1080/01621459.1988.10478613

van der Maas, H. J. L., Molenaar, D., Maris, G., Kievit, R. A., & Borsboom, D. (2011). Cognitive psychology meets psychometric theory: On the relation between process models for decision making and latent variable models for individual differences. Psychological Review, 118(2), 339-356. doi: 10.1037/a0022749

Note

Multiple group estimation is not possible for Ramsay's quotient model and multdimensional models.

See also

Simulate the generalized logistic Rasch model with sim.raschtype.

Simulate Ramsay's quotient model with sim.qm.ramsay.

Simulate locally dependent item response data using sim.rasch.dep.

For an assessment of global model fit see modelfit.sirt.

See CDM::itemfit.sx2 for item fit statistics.

Examples

#############################################################################
# EXAMPLE 1: Reading dataset
#############################################################################

library(CDM)
data(data.read)
dat <- data.read
I <- ncol(dat) # number of items

# Rasch model
mod1 <- sirt::rasch.mml2( dat )
summary(mod1)
plot( mod1 )    # plot all items
# title 'Rasch model', display curves from -3 to 3 only for items 1, 5 and 8
plot(mod1, main="Rasch model Items 1, 5 and 8", xlim=c(-3,3), items=c(1,5,8) )

# Rasch model with constraints on item difficulties
# set item parameters of A1 and C3 equal to -2
constraints <- data.frame( c("A1","C3"), c(-2,-2) )
mod1a <- sirt::rasch.mml2( dat, constraints=constraints)
summary(mod1a)

# estimate equal item parameters for 1st and 11th item
est.b <- 1:I
est.b[11] <- 1
mod1b <- sirt::rasch.mml2( dat, est.b=est.b )
summary(mod1b)

# estimate Rasch model with skew trait distribution
mod1c <- sirt::rasch.mml2( dat, distribution.trait="smooth3")
summary(mod1c)

# 2PL model
mod2 <- sirt::rasch.mml2( dat, est.a=1:I )
summary(mod2)
plot(mod2)    # plot 2PL item response curves

# extract individual likelihood
llmod2 <- IRT.likelihood(mod2)
str(llmod2)

if (FALSE) {
library(CDM)
# model comparisons
CDM::IRT.compareModels(mod1, mod1c, mod2 )
anova(mod1,mod2)

# assess model fit
smod1 <- IRT.modelfit(mod1)
smod2 <- IRT.modelfit(mod2)
IRT.compareModels(smod1, smod2)

# set some bounds for a and b parameters
mod2a <- sirt::rasch.mml2( dat, est.a=1:I, min.a=.7, max.a=2, min.b=-2 )
summary(mod2a)

# 3PL model
mod3 <- sirt::rasch.mml2( dat, est.a=1:I, est.c=1:I,
              mmliter=400 # maximal 400 iterations
                 )
summary(mod3)

# 3PL model with fixed guessing paramters of .25 and equal slopes
mod4 <- sirt::rasch.mml2( dat, fixed.c=rep( .25, I )   )
summary(mod4)

# 3PL model with equal guessing paramters for all items
mod5 <- sirt::rasch.mml2( dat, est.c=rep(1, I )   )
summary(mod5)

# difficulty + guessing model
mod6 <- sirt::rasch.mml2( dat, est.c=1:I   )
summary(mod6)

# 4PL model
mod7 <- sirt::rasch.mml2( dat, est.a=1:I, est.c=1:I, est.d=1:I,
            min.d=.95, max.c=.25)
        # set minimal d and maximal c parameter to .95 and .25
summary(mod7)

# 4PL model with prior distributions
mod7b <- sirt::rasch.mml2( dat, est.a=1:I, est.c=1:I, est.d=1:I, prior.a=c(1,2),
            prior.c=c(5,17), prior.d=c(20,2) )
summary(mod7b)

# constrained 4PL model
# equal slope, guessing and slipping parameters
mod8 <- sirt::rasch.mml2( dat,est.c=rep(1,I), est.d=rep(1,I) )
summary(mod8)

# estimation of an item response model with an
# uniform theta distribution
theta.k <- seq( 0.01, .99, len=20 )
trait.weights <- rep( 1/length(theta.k), length(theta.k) )
mod9 <- sirt::rasch.mml2( dat, theta.k=theta.k, trait.weights=trait.weights,
              normal.trait=FALSE, est.a=1:12  )
summary(mod9)

#############################################################################
# EXAMPLE 2: Longitudinal data
#############################################################################

data(data.long)
dat <- data.long[,-1]

# define Q loading matrix
Qmatrix <- matrix( 0, 12, 2 )
Qmatrix[1:6,1] <- 1 # T1 items
Qmatrix[7:12,2] <- 1    # T2 items

# define restrictions on item difficulties
est.b <- c(1,2,3,4,5,6,   3,4,5,6,7,8)
mu.fixed <- cbind(1,0)
    # set first mean to 0 for identification reasons

# Model 1: 2-dimensional Rasch model
mod1 <- sirt::rasch.mml2( dat, Qmatrix=Qmatrix, miterstep=4,
            est.b=est.b,  mu.fixed=mu.fixed, mmliter=30 )
summary(mod1)
plot(mod1)
##     Plot function is only applicable for unidimensional models
}

#############################################################################
# EXAMPLE 3: One group, estimation of alpha parameter in the generalized logistic model
#############################################################################

# simulate theta values
set.seed(786)
N <- 1000                  # number of persons
theta <- stats::rnorm( N, sd=1.5 ) # N persons with SD 1.5
b <- seq( -2, 2, len=15)

# simulate data
dat <- sirt::sim.raschtype( theta=theta, b=b, alpha1=0, alpha2=-0.3 )

#  estimating alpha parameters
mod1 <- sirt::rasch.mml2( dat, est.alpha=TRUE, mmliter=30 )
summary(mod1)
plot(mod1)

if (FALSE) {
# fixed alpha parameters
mod1b <- sirt::rasch.mml2( dat, est.alpha=FALSE, alpha1=0, alpha2=-.3 )
summary(mod1b)

# estimation with equal alpha parameters
mod1c <- sirt::rasch.mml2( dat, est.alpha=TRUE, equal.alpha=TRUE )
summary(mod1c)

# Ramsay QM
mod2a <- sirt::rasch.mml2( dat, irtmodel="ramsay.qm" )
summary(mod2a)
}

# Ramsay QM with estimated K parameters
mod2b <- sirt::rasch.mml2( dat, irtmodel="ramsay.qm", est.K=1:15, mmliter=30)
summary(mod2b)
plot(mod2b)

if (FALSE) {
# nonparametric estimation of monotone item response curves
mod3a <- sirt::rasch.mml2( dat, irtmodel="npirt", mmliter=100,
            theta.k=seq( -3, 3, len=10) ) # evaluations at 10 theta grid points
# nonparametric ICC of first 4 items
round( t(mod3a$pjk)[1:4,], 3 )
summary(mod3a)
plot(mod3a)

# nonparametric IRT estimation without monotonicity assumption
mod3b <- sirt::rasch.mml2( dat, irtmodel="npirt", mmliter=10,
            theta.k=seq( -3, 3, len=10), npirt.monotone=FALSE)
plot(mod3b)

# B-Spline estimation of ICCs
library(splines)
mod3c <- sirt::rasch.mml2( dat, irtmodel="npirt",
             npformula="y~bs(theta,df=3)", theta.k=seq(-3,3,len=15) )
summary(mod3c)
round( t(mod3c$pjk)[1:6,], 3 )
plot(mod3c)

# estimation of quadratic item response functions: ~ theta + I( theta^2)
mod3d <- sirt::rasch.mml2( dat, irtmodel="npirt",
             npformula="y~theta + I(theta^2)" )
summary(mod3d)
plot(mod3d)

# estimation of a stepwise ICC function
# ICCs are constant on the theta domains: [-Inf,-1], [-1,1], [1,Inf]
mod3e <- sirt::rasch.mml2( dat, irtmodel="npirt",
             npformula="y~I(theta>-1 )+I(theta>1)" )
summary(mod3e)
plot(mod3e, xlim=c(-2.5,2.5) )

# 2PL model
mod4 <- sirt::rasch.mml2( dat,  est.a=1:15)
summary(mod4)

#############################################################################
# EXAMPLE 4: Two groups, estimation of generalized logistic model
#############################################################################

# simulate generalized logistic Rasch model in two groups
set.seed(8765)
N1 <- 1000     # N1=1000 persons in group 1
N2 <- 500      # N2=500 persons in group 2
dat1 <- sirt::sim.raschtype( theta=stats::rnorm( N1, sd=1.5 ), b=b,
            alpha1=-0.3, alpha2=0)
dat2 <- sirt::sim.raschtype( theta=stats::rnorm( N2, mean=-.5, sd=.75),
            b=b, alpha1=-0.3, alpha2=0)
dat1 <- rbind( dat1, dat2 )
group <- c( rep(1,N1), rep(2,N2))

mod1 <-  sirt::rasch.mml2( dat1, parm.conv=.0001, group=group, est.alpha=TRUE )
summary(mod1)

#############################################################################
# EXAMPLE 5: Multidimensional model
#############################################################################

#***
# (1) simulate data
set.seed(785)
library(mvtnorm)
N <- 500
theta <- mvtnorm::rmvnorm( N,mean=c(0,0), sigma=matrix( c(1.45,.5,.5,1.7), 2, 2 ))
I <- 10
# 10 items load on the first dimension
p1 <- stats::plogis( outer( theta[,1], seq( -2, 2, len=I ), "-" ) )
resp1 <- 1 * ( p1 > matrix( stats::runif( N*I ), nrow=N, ncol=I ) )
# 10 items load on the second dimension
p1 <- stats::plogis( outer( theta[,2], seq( -2, 2, len=I ), "-" ) )
resp2 <- 1 * ( p1 > matrix( stats::runif( N*I ), nrow=N, ncol=I ) )
#Combine the two sets of items into one response matrix
resp <- cbind(resp1,resp2)
colnames(resp) <- paste("I", 1:(2*I), sep="")
dat <- resp

# define Q-matrix
Qmatrix <- matrix( 0, 2*I, 2 )
Qmatrix[1:I,1] <- 1
Qmatrix[1:I+I,2] <- 1

#***
# (2) estimation of models
# 2-dimensional Rasch model
mod1 <- sirt::rasch.mml2( dat, Qmatrix=Qmatrix )
summary(mod1)

# 2-dimensional 2PL model
mod2 <- sirt::rasch.mml2( dat, Qmatrix=Qmatrix, est.a=1:(2*I) )
summary(mod2)

# estimation with some fixed variances and covariances
# set variance of 1st dimension to 1 and
#  covariance to zero
variance.fixed <- matrix( cbind(c(1,1), c(1,2), c(1,0)),
             byrow=FALSE, ncol=3 )
mod3 <- sirt::rasch.mml2( dat, Qmatrix=Qmatrix, variance.fixed=variance.fixed )
summary(mod3)

# constraints on item difficulties
#  useful for example in longitudinal linking
est.b <- c( 1:I, 1:I )
    # equal indices correspond to equally estimated item parameters
mu.fixed <- cbind( 1, 0 )
mod4 <- sirt::rasch.mml2( dat, Qmatrix=Qmatrix, est.b=est.b, mu.fixed=mu.fixed )
summary(mod4)

#############################################################################
# EXAMPLE 6: Two booklets with same items but with item context effects.
# Therefore, item slopes and item difficulties are assumed to be shifted in the
# second design group.
#############################################################################

#***
# simulate data
set.seed(987)
I <- 10     # number of items
# define person design groups 1 and 2
n1 <- 700
n2 <- 1500
# item difficulties group 1
b1 <- seq(-1.5,1.5,length=I)
# item slopes group 1
a1 <- rep(1, I)
# simulate data group 1
dat1 <- sirt::sim.raschtype( stats::rnorm(n1), b=b1, fixed.a=a1 )
colnames(dat1) <- paste0("I", 1:I, "des1" )
# group 2
b2 <- b1 - .15
a2 <- 1.1*a1
# Item parameters are slightly transformed in the second group
# compared to the first group. This indicates possible item context effects.

# simulate data group 2
dat2 <- sirt::sim.raschtype( stats::rnorm(n2), b=b2, fixed.a=a2 )
colnames(dat2) <- paste0("I", 1:I, "des2" )
# define joint dataset
dat <- matrix( NA, nrow=n1+n2, ncol=2*I)
colnames(dat) <- c( colnames(dat1), colnames(dat2) )
dat[ 1:n1, 1:I ] <- dat1
dat[ n1 + 1:n2, I + 1:I ] <- dat2
# define group identifier
group <- c( rep(1,n1), rep(2,n2) )

#***
# Model 1: Rasch model two groups
itemindex <- rep( 1:I, 2 )
mod1 <- sirt::rasch.mml2( dat, group=group, est.b=itemindex )
summary(mod1)

#***
# Model 2: two item slope groups and designmatrix for intercepts
designmatrix <- matrix( 0, 2*I, I+1)
designmatrix[ ( 1:I )+ I,1:I] <- designmatrix[1:I,1:I] <- diag(I)
designmatrix[ ( 1:I )+ I,I+1] <- 1
mod2 <- sirt::rasch.mml2( dat, est.a=rep(1:2,each=I), designmatrix=designmatrix )
summary(mod2)

#############################################################################
# EXAMPLE 7: PIRLS dataset with missing responses
#############################################################################

data(data.pirlsmissing)
items <- grep( "R31", colnames(data.pirlsmissing), value=TRUE )
I <- length(items)
dat <- data.pirlsmissing

#****
# Model 1: recode missing responses as missing (missing are ignorable)

# data recoding
dat1 <- dat
dat1[ dat1==9 ] <- NA
# estimate Rasch model
mod1 <- sirt::rasch.mml2( dat1[,items], weights=dat$studwgt, group=dat$country )
summary(mod1)
##   Mean=0 0.341 -0.134 0.219
##   SD=1.142 1.166 1.197 0.959

#****
# Model 2: recode missing responses as wrong

# data recoding
dat2 <- dat
dat2[ dat2==9 ] <- 0
# estimate Rasch model
mod2 <- sirt::rasch.mml2( dat2[,items], weights=dat$studwgt, group=dat$country )
summary(mod2)
  ##   Mean=0 0.413 -0.172 0.446
  ##   SD=1.199 1.263 1.32 0.996

#****
# Model 3: recode missing responses as rho * P_i( theta ) and
#          apply pseudo-log-likelihood estimation
# Missing item responses are predicted by the model implied probability
# P_i( theta ) where theta is the ability estimate when ignoring missings (Model 1)
# and rho is an adjustment parameter. rho=0 is equivalent to Model 2 (treating
# missing as wrong) and rho=1 is equivalent to Model 1 (treating missing as ignorable).

# data recoding
dat3 <- dat
# simulate theta estimate from posterior distribution
theta <- stats::rnorm( nrow(dat3), mean=mod1$person$EAP, sd=mod1$person$SE.EAP )
rho <- .3   # define a rho parameter value of .3
for (ii in items){
    ind <- which( dat[,ii]==9 )
    dat3[ind,ii] <- rho*stats::plogis( theta[ind] - mod1$item$b[ which( items==ii ) ] )
                }

# estimate Rasch model
mod3 <- sirt::rasch.mml2( dat3[,items], weights=dat$studwgt, group=dat$country )
summary(mod3)
  ##   Mean=0 0.392 -0.153 0.38
  ##   SD=1.154 1.209 1.246 0.973

#****
# Model 4: simulate missing responses as rho * P_i( theta )
# The definition is the same as in Model 3. But it is now assumed
# that the missing responses are 'latent responses'.
set.seed(789)

# data recoding
dat4 <- dat
# simulate theta estimate from posterior distribution
theta <- stats::rnorm( nrow(dat4), mean=mod1$person$EAP, sd=mod1$person$SE.EAP )
rho <- .3   # define a rho parameter value of .3
for (ii in items){
    ind <- which( dat[,ii]==9 )
    p3 <- rho*stats::plogis( theta[ind] - mod1$item$b[ which( items==ii ) ] )
    dat4[ ind, ii ] <- 1*( stats::runif( length(ind), 0, 1 ) < p3)
                }

# estimate Rasch model
mod4 <- sirt::rasch.mml2( dat4[,items], weights=dat$studwgt, group=dat$country )
summary(mod4)
  ##   Mean=0 0.396 -0.156 0.382
  ##   SD=1.16 1.216 1.253 0.979

#****
# Model 5: recode missing responses for multiple choice items with four alternatives
#          to 1/4 and apply pseudo-log-likelihood estimation.
#          Missings for constructed response items are treated as incorrect.

# data recoding
dat5 <- dat
items_mc <- items[ substring( items, 7,7)=="M" ]
items_cr <- items[ substring( items, 7,7)=="C" ]
for (ii in items_mc){
    ind <- which( dat[,ii]==9 )
    dat5[ind,ii] <- 1/4
                }
for (ii in items_cr){
    ind <- which( dat[,ii]==9 )
    dat5[ind,ii] <- 0
                }

# estimate Rasch model
mod5 <- sirt::rasch.mml2( dat5[,items], weights=dat$studwgt, group=dat$country )
summary(mod5)
  ##   Mean=0 0.411 -0.165 0.435
  ##   SD=1.19 1.245 1.293 0.995

#*** For the following analyses, we ignore sample weights and the
#    country grouping.
data(data.pirlsmissing)
items <- grep( "R31", colnames(data.pirlsmissing), value=TRUE )
dat <- data.pirlsmissing
dat1 <- dat
dat1[ dat1==9 ] <- 0

#*** Model 6: estimate item difficulties assuming incorrect missing data treatment
mod6 <- sirt::rasch.mml2( dat1[,items], mmliter=50 )
summary(mod6)

#*** Model 7: reestimate model with constrained item difficulties
I <- length(items)
constraints <- cbind( 1:I, mod6$item$b )
mod7 <- sirt::rasch.mml2( dat1[,items], constraints=constraints)
summary(mod7)

#*** Model 8: score all missings responses as missing items
dat2 <- dat[,items]
dat2[ dat2==9 ] <- NA
mod8 <- sirt::rasch.mml2( dat2, constraints=constraints, mu.fixed=NULL )
summary(mod8)

#*** Model 9: estimate missing data model 'missing1' assuming a missingness
#       parameter delta.miss of zero
dat2 <-  dat[,items]    # note that missing item responses must be defined by 9
mod9 <- sirt::rasch.mml2( dat2, constraints=constraints, irtmodel="missing1",
            theta.k=seq(-5,5,len=10), delta.miss=0, mitermax=4, mu.fixed=NULL )
summary(mod9)

#*** Model 10: estimate missing data model with a large negative missing delta parameter
#=> This model is equivalent to treating missing responses as wrong
mod10 <- sirt::rasch.mml2( dat2, constraints=constraints, irtmodel="missing1",
             theta.k=seq(-5, 5, len=10), delta.miss=-10, mitermax=4, mmliter=200,
             mu.fixed=NULL )
summary(mod10)

#*** Model 11: choose a missingness delta parameter of -1
mod11 <- sirt::rasch.mml2( dat2, constraints=constraints, irtmodel="missing1",
             theta.k=seq(-5, 5, len=10), delta.miss=-1, mitermax=4,
             mmliter=200, mu.fixed=NULL )
summary(mod11)

#*** Model 12: estimate joint delta parameter
mod12 <- sirt::rasch.mml2( dat2, irtmodel="missing1", mu.fixed=cbind( c(1,2), 0 ),
             theta.k=seq(-8, 8, len=10), delta.miss=0, mitermax=4,
             mmliter=30, est.delta=rep(1,I)  )
summary(mod12)

#*** Model 13: estimate delta parameter in item groups defined by item format
est.delta <- 1 + 1 * ( substring( colnames(dat2),7,7 )=="M" )
mod13 <- sirt::rasch.mml2( dat2, irtmodel="missing1", mu.fixed=cbind( c(1,2), 0 ),
             theta.k=seq(-8, 8, len=10), delta.miss=0, mitermax=4,
             mmliter=30, est.delta=est.delta  )
summary(mod13)

#*** Model 14: estimate item specific delta parameter
mod14 <- sirt::rasch.mml2( dat2, irtmodel="missing1", mu.fixed=cbind( c(1,2), 0 ),
             theta.k=seq(-8, 8, len=10), delta.miss=0, mitermax=4,
             mmliter=30, est.delta=1:I  )
summary(mod14)

#############################################################################
# EXAMPLE 8: Comparison of different models for polytomous data
#############################################################################

data(data.Students, package="CDM")
head(data.Students)
dat <- data.Students[, paste0("act",1:5) ]
I <- ncol(dat)

#**************************************************
#*** Model 1: Partial Credit Model (PCM)

#*** Model 1a: PCM in TAM
mod1a <- TAM::tam.mml( dat )
summary(mod1a)

#*** Model 1b: PCM in sirt
mod1b <- sirt::rm.facets( dat )
summary(mod1b)

#*** Model 1c: PCM in mirt
mod1c <- mirt::mirt( dat, 1, itemtype=rep("Rasch",I), verbose=TRUE )
print(mod1c)

#**************************************************
#*** Model 2: Sequential Model (SM): Equal Loadings

#*** Model 2a: SM in sirt
dat1 <- CDM::sequential.items(dat)
resp <- dat1$dat.expand
iteminfo <- dat1$iteminfo
# fit model
mod2a <- sirt::rasch.mml2( resp )
summary(mod2a)

#**************************************************
#*** Model 3: Sequential Model (SM): Different Loadings

#*** Model 3a: SM in sirt
mod3a <- sirt::rasch.mml2( resp, est.a=iteminfo$itemindex )
summary(mod3a)

#**************************************************
#*** Model 4: Generalized partial credit model (GPCM)

#*** Model 4a: GPCM in TAM
mod4a <- TAM::tam.mml.2pl( dat, irtmodel="GPCM")
summary(mod4a)

#**************************************************
#*** Model 5: Graded response model (GRM)

#*** Model 5a: GRM in mirt
mod5a <- mirt::mirt( dat, 1, itemtype=rep("graded",I), verbose=TRUE)
print(mod5a)

# model comparison
logLik(mod1a);logLik(mod1b);mod1c@logLik  # PCM
logLik(mod2a)   # SM (Rasch)
logLik(mod3a)   # SM (GPCM)
logLik(mod4a)   # GPCM
mod5a@logLik    # GRM
}