NOHARM Model in R
noharm.sirt.Rd
The function is an R implementation of the normal ogive harmonic analysis robust method (the NOHARM model; McDonald, 1997). Exploratory and confirmatory multidimensional item response models for dichotomous data using the probit link function can be estimated. Lower asymptotes (guessing parameters) and upper asymptotes (one minus slipping parameters) can be provided as fixed values.
Usage
noharm.sirt(dat, pm=NULL, N=NULL, weights=NULL, Fval=NULL, Fpatt=NULL, Pval=NULL,
Ppatt=NULL, Psival=NULL, Psipatt=NULL, dimensions=NULL, lower=0, upper=1, wgtm=NULL,
pos.loading=FALSE, pos.variance=FALSE, pos.residcorr=FALSE, maxiter=1000, conv=1e-6,
optimizer="nlminb", par_lower=NULL, reliability=FALSE, ...)
# S3 method for noharm.sirt
summary(object, file=NULL, ...)
Arguments
- dat
Matrix of dichotomous item responses. This matrix may contain missing data (indicated by
NA
) but missingness is assumed to be missing completely at random (MCAR). Alternatively, a product-moment matrixpm
can be used as input.- pm
Optional product-moment matrix
- N
Sample size if
pm
is provided- weights
Optional vector of student weights.
- Fval
Initial or fixed values of the loading matrix \(\bold{F}\).
- Fpatt
Pattern matrix of the loading matrix \(\bold{F}\). If elements should be estimated, then an entry of 1 must be included in the pattern matrix. Parameters which should be estimated with equality constraints must be indicated by same integers but values largers than 1.
- Pval
Initial or fixed values for the covariance matrix \(\bold{P}\).
- Ppatt
Pattern matrix for the covariance matrix \(\bold{P}\).
- Psival
Initial or fixed values for the matrix of residual correlations \(\bold{\Psi}\).
- Psipatt
Pattern matrix for the matrix of residual correlations \(\bold{\Psi}\).
- dimensions
Number of dimensions if an exploratory factor analysis should be estimated.
- lower
Fixed vector (or numeric) of lower asymptotes \(c_i\).
- upper
Fixed vector (or numeric) of upper asymptotes \(d_i\).
- wgtm
Matrix with positive entries which indicates by a positive entry which item pairs should be used for estimation.
- pos.loading
An optional logical indicating whether all entries in the loading matrix \(\bold{F}\) should be positive
- pos.variance
An optional logical indicating whether all variances (i.e. diagonal entries in \(\bold{P}\)) should be positive
- pos.residcorr
An optional logical indicating whether all entries in the matrix of residual correlations \(\bold{\Psi}\) should be positive
- par_lower
Optional vector of lower parameter bounds
- maxiter
Maximum number of iterations
- conv
Convergence criterion for parameters
- optimizer
Optimization function to be used. Can be
"nlminb"
forstats::nlminb
or"optim"
forstats::optim
.- reliability
Logical indicating whether reliability should be computed.
- ...
Further arguments to be passed.
- object
Object of class
noharm.sirt
- file
String indicating a file name for summary.
Details
The NOHARM item response model follows the response equation $$P( X_{pi}=1 | \bold{\theta}_p )=c_i + ( d_i - c_i ) \Phi( f_{i0} + f_{i1} \theta_{p1} + ... + f_{iD} \theta_{pD} ) $$ for item responses \(X_{pi}\) of person \(p\) on item \(i\), \(\bold{F}=(f_{id})\) is a loading matrix and \(\bold{P}\) the covariance matrix of \(\bold{\theta}_p\). The lower asymptotes \(c_i\) and upper asymptotes \(d_i\) must be provided as fixed values. The response equation can be equivalently written by introducing a latent continuous item response \(X_{pi}^\ast\) $$ X_{pi}^\ast=f_{i0} + f_{i1} \theta_{p1} + ... + f_{iD} \theta_{pD} + e_{pi} $$ with a standard normally distributed residual \(e_{pi}\). These residuals have a correlation matrix \(\bold{\Psi}\) with ones in the diagonal. In this R implementation of the NOHARM model, correlations between residuals are allowed.
The estimation relies on a Hermite series approximation of the normal ogive item response functions. In more detail, a series expansion $$\Phi(x)=b_0 + b_1 H_1(x) + b_2 H_2(x) + b_3 H_3(x)$$ is used (McDonald, 1982a). This enables to express cross products \(p_{ij}=P(X_i=1, X_j=1)\) as a function of unknown model parameters $$\hat{p}_{ij}=b_{0i} b_{0j} + \sum_{m=1}^3 b_{mi} b_{mj} \left( \frac{\bold{f}_i \bold{P} \bold{f}_j }{\sqrt{ (1+\bold{f}_i \bold{P} \bold{f}_i) (1+\bold{f}_j \bold{P} \bold{f}_j)}} \right) ^m $$ where \(b_{0i}=p_{i}=P(X_i=1)=c_i + (d_i - c_i) \Phi(\tau_i)\), \(b_{1i}=(d_i-c_i)\phi(\tau_i)\), \(b_{2i}=(d_i-c_i)\tau_i \phi(\tau_i) / \sqrt{2}\), and \(b_{3i}=(d_i-c_i)(\tau_i^2 - 1)\phi(\tau_i) / \sqrt{6}\).
The least squares criterion \(\sum_{i<j} ( p_{ij} - \hat{p}_{ij})^2\) is used for estimating unknown model parameters (McDonald, 1982a, 1982b, 1997).
For derivations of standard errors and fit statistics see Maydeu-Olivares (2001) and Swaminathan and Rogers (2016).
For the statistical properties of the NOHARM approach see Knol and Berger (1991), Finch (2011) or Svetina and Levy (2016).
Value
A list. The most important entries are
- tanaka
Tanaka fit statistic
- rmsr
RMSR fit statistic
- N.itempair
Sample size per item pair
- pm
Product moment matrix
- wgtm
Matrix of weights for each item pair
- sumwgtm
Sum of lower triangle matrix
wgtm
- lower
Lower asymptotes
- upper
Upper asymptotes
- residuals
Residual matrix from approximation of the
pm
matrix- final.constants
Final constants
- factor.cor
Covariance matrix
- thresholds
Threshold parameters
- uniquenesses
Uniquenesses
- loadings
Matrix of standardized factor loadings (delta parametrization)
- loadings.theta
Matrix of factor loadings \(\bold{F}\) (theta parametrization)
- residcorr
Matrix of residual correlations
- Nobs
Number of observations
- Nitems
Number of items
- Fpatt
Pattern loading matrix for \(\bold{F}\)
- Ppatt
Pattern loading matrix for \(\bold{P}\)
- Psipatt
Pattern loading matrix for \(\bold{\Psi}\)
- dat
Used dataset
- dimensions
Number of dimensions
- iter
Number of iterations
- Nestpars
Number of estimated parameters
- chisquare
Statistic \(\chi^2\)
- df
Degrees of freedom
- chisquare_df
Ratio \(\chi^2 / df\)
- rmsea
RMSEA statistic
- p.chisquare
Significance for \(\chi^2\) statistic
- omega.rel
Reliability of the sum score according to Green and Yang (2009)
References
Finch, H. (2011). Multidimensional item response theory parameter estimation with nonsimple structure items. Applied Psychological Measurement, 35(1), 67-82. doi:10.1177/0146621610367787
Fraser, C., & McDonald, R. P. (1988). NOHARM: Least squares item factor analysis. Multivariate Behavioral Research, 23, 267-269. doi:10.1207/s15327906mbr2302_9
Fraser, C., & McDonald, R. P. (2012). NOHARM 4 Manual.
http://noharm.niagararesearch.ca/nh4man/nhman.html.
Knol, D. L., & Berger, M. P. (1991). Empirical comparison between factor analysis and multidimensional item response models. Multivariate Behavioral Research, 26(3), 457-477. doi:10.1207/s15327906mbr2603_5
Maydeu-Olivares, A. (2001). Multidimensional item response theory modeling of binary data: Large sample properties of NOHARM estimates. Journal of Educational and Behavioral Statistics, 26(1), 51-71. doi:10.3102/10769986026001051
McDonald, R. P. (1982a). Linear versus nonlinear models in item response theory. Applied Psychological Measurement, 6(4), 379-396. doi:10.1177/014662168200600402
McDonald, R. P. (1982b). Unidimensional and multidimensional models for item response theory. I.R.T., C.A.T. conference, Minneapolis, 1982, Proceedings.
McDonald, R. P. (1997). Normal-ogive multidimensional model. In W. van der Linden & R. K. Hambleton (1997): Handbook of modern item response theory (pp. 257-269). New York: Springer. doi:10.1007/978-1-4757-2691-6
Svetina, D., & Levy, R. (2016). Dimensionality in compensatory MIRT when complex structure exists: Evaluation of DETECT and NOHARM. The Journal of Experimental Education, 84(2), 398-420. doi:10.1080/00220973.2015.1048845
Swaminathan, H., & Rogers, H. J. (2016). Normal-ogive multidimensional models. In W. J. van der Linden (Ed.). Handbook of item response theory. Volume One: Models (pp. 167-187). Boca Raton: CRC Press. doi:10.1201/9781315374512
See also
EAP person parameter estimates can be obtained by R2noharm.EAP
.
Model fit can be assessed by modelfit.sirt
.
See R2noharm
for running the NOHARM software from within R.
See Fraser and McDonald (1988, 2012) for an implementation of the NOHARM model which is available as freeware (http://noharm.niagararesearch.ca/; the link seems to be broken in the meanwhile).
Examples
#############################################################################
# EXAMPLE 1: Two-dimensional IRT model with 10 items
#############################################################################
#**** data simulation
set.seed(9776)
N <- 3400 # sample size
# define difficulties
f0 <- c( .5, .25, -.25, -.5, 0, -.5, -.25, .25, .5, 0 )
I <- length(f0)
# define loadings
f1 <- matrix( 0, I, 2 )
f1[ 1:5,1] <- c(.8,.7,.6,.5, .5)
f1[ 6:10,2] <- c(.8,.7,.6,.5, .5 )
# covariance matrix
Pval <- matrix( c(1,.5,.5,1), 2, 2 )
# simulate theta
library(mvtnorm)
theta <- mvtnorm::rmvnorm(N, mean=c(0,0), sigma=Pval )
# simulate item responses
dat <- matrix( NA, N, I )
for (ii in 1:I){ # ii <- 1
dat[,ii] <- 1*( stats::pnorm(f0[ii]+theta[,1]*f1[ii,1]+theta[,2]*f1[ii,2])>
stats::runif(N) )
}
colnames(dat) <- paste0("I", 1:I)
#**** Model 1: Two-dimensional CFA with estimated item loadings
# define pattern matrices
Pval <- .3+0*Pval
Ppatt <- 1*(Pval>0)
diag(Ppatt) <- 0
diag(Pval) <- 1
Fval <- .7 * ( f1>0)
Fpatt <- 1 * ( Fval > 0 )
# estimate model
mod1 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt, Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod1)
# EAP ability estimates
pmod1 <- sirt::R2noharm.EAP(mod1, theta.k=seq(-4,4,len=10) )
# model fit
summary( sirt::modelfit.sirt(mod1) )
if (FALSE) {
#*** compare results with NOHARM software
noharm.path <- "c:/NOHARM" # specify path for noharm software
mod1a <- sirt::R2noharm( dat=dat, model.type="CFA", F.pattern=Fpatt, F.init=Fval,
P.pattern=Ppatt, P.init=Pval, writename="r2noharm_example",
noharm.path=noharm.path, dec="," )
summary(mod1a)
#**** Model 1c: put some equality constraints
Fpatt[ c(1,4),1] <- 3
Fpatt[ cbind( c(3,7), c(1,2)) ] <- 4
mod1c <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt, Fpatt=Fpatt, Fval=Fval, Pval=Pval)
summary(mod1c)
#**** Model 2: Two-dimensional CFA with correlated residuals
# define pattern matrix for residual correlation
Psipatt <- 0*diag(I)
Psipatt[1,2] <- 1
Psival <- 0*Psipatt
# estimate model
mod2 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval,
Psival=Psival, Psipatt=Psipatt )
summary(mod2)
#**** Model 3: Two-dimensional Rasch model
# pattern matrices
Fval <- matrix(0,10,2)
Fval[1:5,1] <- Fval[6:10,2] <- 1
Fpatt <- 0*Fval
Ppatt <- Pval <- matrix(1,2,2)
Pval[1,2] <- Pval[2,1] <- 0
# estimate model
mod3 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod3)
# model fit
summary( sirt::modelfit.sirt( mod3 ))
#** compare fit with NOHARM
noharm.path <- "c:/NOHARM"
P.pattern <- Ppatt ; P.init <- Pval
F.pattern <- Fpatt ; F.init <- Fval
mod3b <- sirt::R2noharm( dat=dat, model.type="CFA",
F.pattern=F.pattern, F.init=F.init, P.pattern=P.pattern,
P.init=P.init, writename="example_sim_2dim_rasch",
noharm.path=noharm.path, dec="," )
summary(mod3b)
#############################################################################
# EXAMPLE 2: data.read
#############################################################################
data(data.read)
dat <- data.read
I <- ncol(dat)
#**** Model 1: Unidimensional Rasch model
Fpatt <- matrix( 0, I, 1 )
Fval <- 1 + 0*Fpatt
Ppatt <- Pval <- matrix(1,1,1)
# estimate model
mod1 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod1)
plot(mod1) # semPaths plot
#**** Model 2: Rasch model in which item pairs within a testlet are excluded
wgtm <- matrix( 1, I, I )
wgtm[1:4,1:4] <- wgtm[5:8,5:8] <- wgtm[ 9:12, 9:12] <- 0
# estimation
mod2 <- sirt::noharm.sirt(dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval, wgtm=wgtm)
summary(mod2)
#**** Model 3: Rasch model with correlated residuals
Psipatt <- Psival <- 0*diag(I)
Psipatt[1:4,1:4] <- Psipatt[5:8,5:8] <- Psipatt[ 9:12, 9:12] <- 1
diag(Psipatt) <- 0
Psival <- .6*(Psipatt>0)
# estimation
mod3 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval,
Psival=Psival, Psipatt=Psipatt )
summary(mod3)
# allow only positive residual correlations
mod3b <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt, Fpatt=Fpatt, Fval=Fval, Pval=Pval,
Psival=Psival, Psipatt=Psipatt, pos.residcorr=TRUE)
summary(mod3b)
#* constrain residual correlations
Psipatt[1:4,1:4] <- 2
Psipatt[5:8,5:8] <- 3
Psipatt[ 9:12, 9:12] <- 4
mod3c <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt, Fpatt=Fpatt, Fval=Fval, Pval=Pval,
Psival=Psival, Psipatt=Psipatt, pos.residcorr=TRUE)
summary(mod3c)
#**** Model 4: Rasch testlet model
Fval <- Fpatt <- matrix( 0, I, 4 )
Fval[,1] <- Fval[1:4,2] <- Fval[5:8,3] <- Fval[9:12,4 ] <- 1
Ppatt <- Pval <- diag(4)
colnames(Ppatt) <- c("g", "A", "B","C")
Pval <- .5*Pval
# estimation
mod4 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod4)
# allow only positive variance entries
mod4b <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval,
pos.variance=TRUE )
summary(mod4b)
#**** Model 5: Bifactor model
Fval <- matrix( 0, I, 4 )
Fval[,1] <- Fval[1:4,2] <- Fval[5:8,3] <- Fval[9:12,4 ] <- .6
Fpatt <- 1 * ( Fval > 0 )
Pval <- diag(4)
Ppatt <- 0*Pval
colnames(Ppatt) <- c("g", "A", "B","C")
# estimation
mod5 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod5)
# allow only positive loadings
mod5b <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval,
pos.loading=TRUE )
summary(mod5b)
summary( sirt::modelfit.sirt(mod5b))
#**** Model 6: 3-dimensional Rasch model
Fval <- matrix( 0, I, 3 )
Fval[1:4,1] <- Fval[5:8,2] <- Fval[9:12,3 ] <- 1
Fpatt <- 0*Fval
Pval <- .6*diag(3)
diag(Pval) <- 1
Ppatt <- 1+0*Pval
colnames(Ppatt) <- c("A", "B","C")
# estimation
mod6 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod6)
summary( sirt::modelfit.sirt(mod6) ) # model fit
#**** Model 7: 3-dimensional 2PL model
Fval <- matrix( 0, I, 3 )
Fval[1:4,1] <- Fval[5:8,2] <- Fval[9:12,3 ] <- 1
Fpatt <- Fval
Pval <- .6*diag(3)
diag(Pval) <- 1
Ppatt <- 1+0*Pval
diag(Ppatt) <- 0
colnames(Ppatt) <- c("A", "B","C")
# estimation
mod7 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod7)
summary( sirt::modelfit.sirt(mod7) )
#**** Model 8: Exploratory factor analysis with 3 dimensions
# estimation
mod8 <- sirt::noharm.sirt( dat=dat, dimensions=3 )
summary(mod8)
#############################################################################
# EXAMPLE 3: Product-moment matrix input, McDonald (1997)
#############################################################################
# data from Table 1 of McDonald (1997, p. 266)
pm0 <- "
0.828
0.567 0.658
0.664 0.560 0.772
0.532 0.428 0.501 0.606
0.718 0.567 0.672 0.526 0.843
"
pm <- miceadds::string_to_matrix(x=pm0, as_numeric=TRUE, extend=TRUE)
I <- nrow(pm)
rownames(pm) <- colnames(pm) <- paste0("I", 1:I)
#- Model 1: Unidimensional model
Fval <- matrix(.7, nrow=I, ncol=1)
Fpatt <- 1+0*Fval
Pval <- matrix(1, nrow=1,ncol=1)
Ppatt <- 0*Pval
mod1 <- sirt::noharm.sirt(pm=pm, N=1000, Fval=Fval, Fpatt=Fpatt, Pval=Pval, Ppatt=Ppatt)
summary(mod1)
#- Model 2: Twodimensional exploratory model
mod2 <- sirt::noharm.sirt(pm=pm, N=1000, dimensions=2)
summary(mod2)
#- Model 3: Unidimensional model with correlated residuals
Psival <- matrix(0, nrow=I, ncol=I)
Psipatt <- 0*Psival
Psipatt[5,1] <- 1
mod3 <- sirt::noharm.sirt(pm=pm, N=1000, Fval=Fval, Fpatt=Fpatt, Pval=Pval, Ppatt=Ppatt,
Psival=Psival, Psipatt=Psipatt)
summary(mod3)
}