Hierarchical Rater Model Based on Signal Detection Theory (HRM-SDT)
This function estimates a version of the hierarchical rater model (HRM) based on signal detection theory (HRM-SDT; DeCarlo, 2005; DeCarlo, Kim & Johnson, 2011; Robitzsch & Steinfeld, 2018). The model is estimated by means of an EM algorithm adapted from multilevel latent class analysis (Vermunt, 2008).
Usage
rm.sdt(dat, pid, rater, Qmatrix=NULL, theta.k=seq(-9, 9, len=30),
est.a.item=FALSE, est.c.rater="n", est.d.rater="n", est.mean=FALSE, est.sigma=TRUE,
skillspace="normal", tau.item.fixed=NULL, a.item.fixed=NULL,
d.min=0.5, d.max=100, d.start=3, c.start=NULL, tau.start=NULL, sd.start=1,
d.prior=c(3,100), c.prior=c(3,100), tau.prior=c(0,1000), a.prior=c(1,100),
link_item="GPCM", max.increment=1, numdiff.parm=0.00001, maxdevchange=0.1,
globconv=.001, maxiter=1000, msteps=4, mstepconv=0.001, optimizer="nlminb" )
# S3 method for rm.sdt
summary(object, file=NULL, ...)
# S3 method for rm.sdt
plot(x, ask=TRUE, ...)
# S3 method for rm.sdt
anova(object,...)
# S3 method for rm.sdt
logLik(object,...)
# S3 method for rm.sdt
IRT.factor.scores(object, type="EAP", ...)
# S3 method for rm.sdt
IRT.irfprob(object,...)
# S3 method for rm.sdt
IRT.likelihood(object,...)
# S3 method for rm.sdt
IRT.posterior(object,...)
# S3 method for rm.sdt
IRT.modelfit(object,...)
# S3 method for IRT.modelfit.rm.sdt
summary(object,...)
Arguments
- dat
Original data frame. Ratings on variables must be in rows, i.e. every row corresponds to a person-rater combination.
- pid
Person identifier.
- rater
Rater identifier.
- Qmatrix
An optional Q-matrix. If this matrix is not provided, then by default the ordinary scoring of categories (from 0 to the maximum score of \(K\)) is used.
- theta.k
A grid of theta values for the ability distribution.
- est.a.item
Should item parameters \(a_i\) be estimated?
- est.c.rater
Type of estimation for item-rater parameters \(c_{ir}\) in the signal detection model. Options are 'n' (no estimation), 'e' (set all parameters equal to each other), 'i' (itemwise estimation), 'r' (raterwise estimation) and 'a' (all parameters are estimated independently from each other).
- est.d.rater
Type of estimation of \(d\) parameters. Options are the same as in est.c.rater.
- est.mean
Optional logical indicating whether the mean of the trait distribution should be estimated.
- est.sigma
Optional logical indicating whether the standard deviation of the trait distribution should be estimated.
- skillspace
Specified \(\theta\) distribution type. It can be "normal" or "discrete". In the latter case, all probabilities of the distribution are separately estimated.
- tau.item.fixed
Optional matrix with three columns specifying fixed \(\tau\) parameters. The first two columns denote item and category indices, the third the fixed value. See Example 3.
- a.item.fixed
Optional matrix with two columns specifying fixed \(a\) parameters. First column: Item index. Second column: Fixed \(a\) parameter.
- d.min
Minimal \(d\) parameter to be estimated
- d.max
Maximal \(d\) parameter to be estimated
- d.start
Starting value(s) of \(d\) parameters
- c.start
Starting values of \(c\) parameters
- tau.start
Starting values of \(\tau\) parameters
- sd.start
Starting value for trait standard deviation
- d.prior
Normal prior \(N(M,S^2)\) for \(d\) parameters
- c.prior
Normal prior for \(c\) parameters. The prior for parameter \(c_{irk}\) is defined as \(M \cdot ( k - 0.5) \) where \(M\) is c.prior[1].
- tau.prior
Normal prior for \(\tau\) parameters
- a.prior
Normal prior for \(a\) parameters
- link_item
Type of item response function for latent responses. Can be "GPCM" for the generalized partial credit model or "GRM" for the graded response model.
- max.increment
Maximum increment of item parameters during estimation
- numdiff.parm
Numerical differentiation step width
- maxdevchange
Maximum relative deviance change as a convergence criterion
- globconv
Maximum parameter change
- maxiter
Maximum number of iterations
- msteps
Maximum number of iterations during an M step
- mstepconv
Convergence criterion in an M step
- optimizer
Choice of optimization function in M-step for item parameters. Options are "nlminb" for stats::nlminb and "optim" for stats::optim.
- object
Object of class rm.sdt
- file
Optional file name in which summary should be written.
- x
Object of class rm.sdt
- ask
Optional logical indicating whether the user should be prompted before each new plot is drawn.
- type
Factor score estimation method. Up to now, only type="EAP" is supported.
- ...
Further arguments to be passed
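The row layout described for dat can be illustrated with a small made-up data frame. This is only a sketch: the column names idstud, rater, k1 and k2 and all rating values are hypothetical, and the commented call merely mirrors the argument pattern used in the Examples section.

```r
# Hypothetical long-format ratings: 3 persons, each rated by 2 raters on 2 items
dat <- data.frame(
  idstud = rep(1:3, each = 2),             # person identifier (passed as pid)
  rater  = rep(c("R1", "R2"), times = 3),  # rater identifier (passed as rater)
  k1     = c(0, 1, 2, 2, 1, 1),            # made-up ratings on item 1
  k2     = c(1, 1, 3, 2, 0, 1)             # made-up ratings on item 2
)

# Check that each row is a distinct person-rater combination
n_combos <- nrow(unique(dat[, c("idstud", "rater")]))
stopifnot(n_combos == nrow(dat))

# Shape of the call (not run here; requires the sirt package):
# mod <- sirt::rm.sdt(dat[, c("k1", "k2")], pid = dat$idstud, rater = dat$rater)
```

The rating columns are passed as the first argument, while pid and rater are vectors aligned with the rows of that data frame.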
Details
The specification of the model follows DeCarlo et al. (2011).
The second level models the ideal rating (latent response) \(\eta=0, \ldots, K\) of person \(p\) on item \(i\). The option link_item='GPCM' follows the generalized partial credit model
$$ P( \eta_{pi}=\eta | \theta_p ) \propto \exp( a_{i} q_{i \eta } \theta_p - \tau_{i \eta } ) $$
The option link_item='GRM' employs the graded response model
$$ P( \eta_{pi}=\eta | \theta_p )= \Psi( \tau_{i,\eta + 1} - a_i \theta_p ) - \Psi( \tau_{i,\eta} - a_i \theta_p ) $$
At the first level, the ratings \(X_{pir}\) for person \(p\) on item \(i\) and rater \(r\) are modeled as a signal detection model
$$ P( X_{pir} \le k | \eta_{pi} )= G( c_{irk} - d_{ir} \eta_{pi} )$$
where \(G\) is the logistic distribution function and the categories are \(k=1,\ldots, K+1\). Because \(G\) is symmetric, the rater model can be equivalently written as
$$ P( X_{pir} > k | \eta_{pi} )= G( d_{ir} \eta_{pi} - c_{irk})$$
The thresholds \(c_{irk}\) can be further restricted to \(c_{irk}=c_{k}\) (est.c.rater='e'), \(c_{irk}=c_{ik}\) (est.c.rater='i') or \(c_{irk}=c_{rk}\) (est.c.rater='r'). The same holds for rater precision parameters \(d_{ir}\).
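The two levels can be traced numerically in base R. The following is a minimal sketch for one item and one rater, not the package's internal code: all parameter values are hypothetical, \(\tau_{i0}\) is set to 0 as an assumption, and the helper names gpcm_prob and sdt_prob are introduced here for illustration only.

```r
# Hypothetical parameter values for a single item i and rater r (not estimates)
K     <- 3                      # latent categories eta = 0, ..., K
theta <- 0.5                    # person ability theta_p
a_i   <- 1.2                    # item slope a_i
q_i   <- 0:K                    # default scoring q_{i,eta} = eta
tau_i <- c(0, -0.5, 0.2, 1.0)   # tau_{i,eta}; tau_{i,0} = 0 is an assumption

# Level 2 (GPCM): P(eta | theta) proportional to exp(a_i * q_{i,eta} * theta - tau_{i,eta})
gpcm_prob <- function(theta, a, q, tau) {
  z <- a * q * theta - tau
  z <- z - max(z)               # guards against overflow; probabilities are unchanged
  exp(z) / sum(exp(z))
}
p_eta <- gpcm_prob(theta, a_i, q_i, tau_i)

# Level 1 (SDT): P(X <= k | eta) = G(c_k - d * eta), with G the logistic cdf
d_ir <- 3                       # hypothetical rater precision
c_ir <- d_ir * (1:K - 0.5)      # hypothetical criteria, midway between latent categories
sdt_prob <- function(eta, d, crit) {
  cum <- c(stats::plogis(crit - d * eta), 1)  # cumulative probabilities, last one is 1
  diff(c(0, cum))                             # probabilities of the K + 1 rating categories
}
# Rows: latent response eta = 0, ..., K; columns: observed rating categories
p_x_given_eta <- t(sapply(0:K, sdt_prob, d = d_ir, crit = c_ir))

# Rating distribution given theta: sum over the latent responses
p_x <- as.vector(p_eta %*% p_x_given_eta)
```

With a large value of d_ir the rows of p_x_given_eta concentrate on the diagonal, so the observed rating almost always equals the latent response; this is the idea behind using d.start=100 in the models without rater effects in Example 1.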
Value
A list with the following entries:
- deviance
Deviance
- ic
Information criteria and number of parameters
- item
Data frame with item parameters. The columns N and M denote the number of observed ratings and the observed mean of all ratings, respectively. In addition to item parameters \(\tau_{ik}\) and \(a_i\), the mean for the latent response (latM) is computed as \(E( \eta_i )=\sum_p \sum_k P( \theta_p ) q_{ik} P( \eta_i=k | \theta_p ) \), which provides an item parameter at the original metric of ratings. The latent standard deviation (latSD) is computed in the same manner.
- rater
Data frame with rater parameters. Transformed \(c\) parameters (c_x.trans) are computed as \(c_{irk} / d_{ir}\).
- person
Data frame with person parameters: EAP and corresponding standard errors
- EAP.rel
EAP reliability
- mu
Mean of the trait distribution
- sigma
Standard deviation of the trait distribution
- tau.item
Item parameters \(\tau_{ik}\)
- se.tau.item
Standard error of item parameters \(\tau_{ik}\)
- a.item
Item slopes \(a_i\)
- se.a.item
Standard error of item slopes \(a_i\)
- c.rater
Rater parameters \(c_{irk}\)
- se.c.rater
Standard error of rater severity parameter \(c_{irk}\)
- d.rater
Rater slope parameter \(d_{ir}\)
- se.d.rater
Standard error of rater slope parameter \(d_{ir}\)
- f.yi.qk
Individual likelihood
- f.qk.yi
Individual posterior distribution
- probs
Item probabilities at grid theta.k. Note that these probabilities are calculated on the pseudo items \(i \times r\), i.e. the interaction of item and rater.
- prob.item
Probabilities \(P( \eta_i=\eta | \theta )\) of latent item responses evaluated at theta grid \(\theta_p\).
- n.ik
Expected counts
- pi.k
Estimated trait distribution \(P(\theta_p)\).
- maxK
Maximum number of categories
- procdata
Processed data
- iter
Number of iterations
- ...
Further values
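The latent mean reported as latM is an expectation over both the theta grid and the latent categories. The sketch below reproduces that kind of calculation in base R under stated assumptions: a standard normal trait distribution discretized on the default grid, hypothetical GPCM item parameters, and a latent standard deviation taken as the square root of \(E(\eta_i^2) - E(\eta_i)^2\), which is one plausible reading of "in the same manner". It is not taken from the package source.

```r
theta.k <- seq(-9, 9, len = 30)              # default grid from Usage
pi.k <- stats::dnorm(theta.k, mean = 0, sd = 1)
pi.k <- pi.k / sum(pi.k)                     # discretized trait distribution P(theta_p)

# Hypothetical item parameters (not estimates)
a_i   <- 1
q_i   <- 0:3
tau_i <- c(0, -0.5, 0.2, 1.0)

# P(eta_i = k | theta_p) on the grid in GPCM form; rows are grid points
z <- outer(theta.k, a_i * q_i) -
  matrix(tau_i, nrow = length(theta.k), ncol = length(q_i), byrow = TRUE)
prob <- exp(z - apply(z, 1, max))            # subtract row maxima for numerical stability
prob <- prob / rowSums(prob)

# Latent mean and standard deviation of eta_i in the metric of the ratings
latM  <- sum(pi.k * (prob %*% q_i))
latSD <- sqrt(sum(pi.k * (prob %*% q_i^2)) - latM^2)
```

Because q_i holds the category scores, latM always lies between the smallest and largest score, which is what makes it interpretable on the original rating scale.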
References
DeCarlo, L. T. (2005). A model of rater behavior in essay grading based on signal detection theory. Journal of Educational Measurement, 42, 53-76.
DeCarlo, L. T. (2010). Studies of a latent-class signal-detection model for constructed response scoring II: Incomplete and hierarchical designs. ETS Research Report ETS RR-10-08. Princeton NJ: ETS.
DeCarlo, L. T., Kim, Y., & Johnson, M. S. (2011). A hierarchical rater model for constructed responses, with a signal detection rater model. Journal of Educational Measurement, 48, 333-356.
Robitzsch, A., & Steinfeld, J. (2018). Item response models for human ratings: Overview, estimation methods, and implementation in R. Psychological Test and Assessment Modeling, 60(1), 101-139.
Vermunt, J. K. (2008). Latent class and finite mixture models for multilevel data sets. Statistical Methods in Medical Research, 17, 33-51.
See also
The facets rater model can be estimated with rm.facets.
Examples
#############################################################################
# EXAMPLE 1: Hierarchical rater model (HRM-SDT) data.ratings1
#############################################################################
data(data.ratings1)
dat <- data.ratings1
if (FALSE) {
# Model 1: Partial Credit Model: no rater effects
mod1 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
pid=dat$idstud, est.c.rater="n", d.start=100, est.d.rater="n" )
summary(mod1)
# Model 2: Generalized Partial Credit Model: no rater effects
mod2 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
pid=dat$idstud, est.c.rater="n", est.d.rater="n",
est.a.item=TRUE, d.start=100)
summary(mod2)
# Model 3: Equal effects in SDT
mod3 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
pid=dat$idstud, est.c.rater="e", est.d.rater="e")
summary(mod3)
# Model 4: Rater effects in SDT
mod4 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
pid=dat$idstud, est.c.rater="r", est.d.rater="r")
summary(mod4)
#############################################################################
# EXAMPLE 2: HRM-SDT data.ratings3
#############################################################################
data(data.ratings3)
dat <- data.ratings3
dat <- dat[ dat$rater < 814, ]
psych::describe(dat)
# Model 1: item- and rater-specific effects
mod1 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4)) ], rater=dat$rater,
pid=dat$idstud, est.c.rater="a", est.d.rater="a" )
summary(mod1)
plot(mod1)
# Model 2: Differing number of categories per variable
mod2 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4,6)) ], rater=dat$rater,
pid=dat$idstud, est.c.rater="a", est.d.rater="a")
summary(mod2)
plot(mod2)
#############################################################################
# EXAMPLE 3: Hierarchical rater model with discrete skill spaces
#############################################################################
data(data.ratings3)
dat <- data.ratings3
dat <- dat[ dat$rater < 814, ]
psych::describe(dat)
# Model 1: Discrete theta skill space with values of 0,1,2 and 3
mod1 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4)) ], theta.k=0:3, rater=dat$rater,
pid=dat$idstud, est.c.rater="a", est.d.rater="a", skillspace="discrete" )
summary(mod1)
plot(mod1)
# Model 2: Modelling of one item by using a discrete skill space and
# fixed item parameters
# fixed tau and a parameters
tau.item.fixed <- cbind( 1, 1:3, 100*cumsum( c( 0.5, 1.5, 2.5)) )
a.item.fixed <- cbind( 1, 100 )
# fit HRM-SDT
mod2 <- sirt::rm.sdt( dat[, "crit2", drop=FALSE], theta.k=0:3, rater=dat$rater,
tau.item.fixed=tau.item.fixed,a.item.fixed=a.item.fixed, pid=dat$idstud,
est.c.rater="a", est.d.rater="a", skillspace="discrete" )
summary(mod2)
plot(mod2)
}