TIMSS 2007 (Grade 4) dataset with 25 dichotomized mathematics items, as used in Lee, Park and Taylan (2011), Park and Lee (2014), and Park, Xing and Lee (2018). The sample comprises 698 Austrian students.

data(data.timss07.G4.lee)
data(data.timss07.G4.py)
data(data.timss07.G4.Qdomains)

Format

  • The dataset data.timss07.G4.lee is a list containing the dichotomous item responses together with student ID, booklet and gender (data), the Q-matrix (q.matrix) and descriptions of the skills (skillinfo) used in Lee et al. (2011).

    The format is:

    List of 3
    $ data :'data.frame':
    ..$ idstud : int [1:698] 10110 10111 20105 20106 30203 30204 40106 40107 60111 60112 ...
    ..$ idbook : int [1:698] 4 5 4 5 4 5 4 5 4 5 ...
    ..$ girl : int [1:698] 0 0 1 1 0 1 0 1 1 1 ...
    ..$ M041052 : num [1:698] 1 NA 1 NA 0 NA 1 NA 1 NA ...
    ..$ M041056 : num [1:698] 1 NA 0 NA 0 NA 0 NA 1 NA ...
    ..$ M041069 : num [1:698] 0 NA 0 NA 0 NA 0 NA 1 NA ...
    ..$ M041076 : num [1:698] 1 NA 0 NA 1 NA 1 NA 0 NA ...
    ..$ M041281 : num [1:698] 1 NA 0 NA 1 NA 1 NA 0 NA ...
    ..$ M041164 : num [1:698] 1 NA 1 NA 0 NA 1 NA 1 NA ...
    ..$ M041146 : num [1:698] 0 NA 0 NA 1 NA 1 NA 0 NA ...
    ..$ M041152 : num [1:698] 1 NA 1 NA 1 NA 0 NA 1 NA ...
    ..$ M041258A: num [1:698] 0 NA 1 NA 1 NA 0 NA 1 NA ...
    ..$ M041258B: num [1:698] 1 NA 0 NA 1 NA 0 NA 1 NA ...
    ..$ M041131 : num [1:698] 0 NA 0 NA 1 NA 1 NA 1 NA ...
    ..$ M041275 : num [1:698] 1 NA 0 NA 0 NA 1 NA 1 NA ...
    ..$ M041186 : num [1:698] 1 NA 0 NA 1 NA 1 NA 0 NA ...
    ..$ M041336 : num [1:698] 1 NA 1 NA 0 NA 1 NA 0 NA ...
    ..$ M031303 : num [1:698] 1 1 0 1 0 1 1 1 0 0 ...
    ..$ M031309 : num [1:698] 1 0 1 1 1 1 1 1 0 0 ...
    ..$ M031245 : num [1:698] 0 0 0 0 0 0 0 0 0 0 ...
    ..$ M031242A: num [1:698] 1 1 0 1 1 1 1 1 0 0 ...
    ..$ M031242B: num [1:698] 0 1 0 1 1 1 1 1 1 0 ...
    ..$ M031242C: num [1:698] 1 1 0 1 1 1 1 1 1 0 ...
    ..$ M031247 : num [1:698] 0 0 0 0 0 0 0 0 0 0 ...
    ..$ M031219 : num [1:698] 1 1 1 0 1 1 1 1 1 0 ...
    ..$ M031173 : num [1:698] 1 1 0 0 0 1 1 1 1 0 ...
    ..$ M031085 : num [1:698] 1 0 0 1 1 1 0 0 0 1 ...
    ..$ M031172 : num [1:698] 1 0 0 1 1 1 1 1 1 0 ...
    $ q.matrix : int [1:25, 1:15] 1 0 0 0 0 0 0 1 0 0 ...
    ..- attr(*, "dimnames")=List of 2
    .. ..$ : chr [1:25] "M041052" "M041056" "M041069" "M041076" ...
    .. ..$ : chr [1:15] "NWN01" "NWN02" "NWN03" "NWN04" ...
    $ skillinfo:'data.frame':
    ..$ skillindex : int [1:15] 1 2 3 4 5 6 7 8 9 10 ...
    ..$ skill : Factor w/ 15 levels "DOR15","DRI13",..: 12 13 14 15 8 9 10 11 4 6 ...
    ..$ content : Factor w/ 3 levels "D","G","N": 3 3 3 3 3 3 3 3 2 2 ...
    ..$ content_label : Factor w/ 3 levels "Data Display",..: 3 3 3 3 3 3 3 3 2 2 ...
    ..$ subcontent : Factor w/ 9 levels "FD","LA","LM",..: 9 9 9 9 1 1 4 6 2 8 ...
    ..$ subcontent_label: Factor w/ 9 levels "Fractions and Decimals",..: 9 9 9 9 1 1 4 6 2 8 ...

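  • The dataset data.timss07.G4.py covers the same items as data.timss07.G4.lee but employs a simplified Q-matrix with 7 skills, which was used in Park and Lee (2014) and Park et al. (2018). It is a list with the Q-matrix (q.matrix) and labels for the domains (domains) and skills (skills). It contains no item responses; these are taken from data.timss07.G4.lee$data.

    The format is: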

    List of 3
    $ q.matrix:'data.frame': 25 obs. of 7 variables:
    ..$ N1: num [1:25] 1 0 1 1 1 0 0 1 0 0 ...
    ..$ N2: num [1:25] 0 1 1 1 0 0 0 0 0 0 ...
    ..$ N3: num [1:25] 0 0 0 0 1 0 0 0 0 0 ...
    ..$ G4: num [1:25] 0 0 0 0 0 0 1 0 0 1 ...
    ..$ G5: num [1:25] 0 0 0 0 0 1 1 1 1 1 ...
    ..$ G6: num [1:25] 0 0 0 0 0 1 1 0 0 0 ...
    ..$ D7: num [1:25] 0 0 0 0 0 0 0 0 0 0 ...
    $ domains : Named chr [1:3] "Number" "Geometric Shapes and Measures" "Data Display"
    ..- attr(*, "names")=chr [1:3] "N" "G" "D"
    $ skills : Named chr [1:7] "Whole Numbers" ...
    ..- attr(*, "names")=chr [1:7] "N1" "N2" "N3" "G4" ...

  • The Q-matrix data.timss07.G4.Qdomains simplifies data.timss07.G4.py$q.matrix to the 3 content domains (N, G, D) and has a simple structure, i.e., each item is assigned to a single domain.

    num [1:25, 1:3] 1 1 1 1 1 0 0 1 0 0 ...
    - attr(*, "dimnames")=List of 2
    ..$ : chr [1:25] "M041052" "M041056" "M041069" "M041076" ...
    ..$ : chr [1:3] "N" "G" "D"
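
The simple-structure property mentioned above can be checked directly on a Q-matrix. The following is a minimal sketch in base R; the helper has_simple_structure() and the toy matrices are hypothetical illustrations and not part of the CDM package.

```r
# Hypothetical helper (not part of CDM): returns TRUE if the Q-matrix is
# binary and every item (row) loads on exactly one attribute (column).
has_simple_structure <- function(Q) {
  Q <- as.matrix(Q)
  all(Q %in% c(0, 1)) && all(rowSums(Q) == 1)
}

# Toy Q-matrix with 4 made-up items and the 3 domain labels N, G, D
Q_toy <- matrix(c(1, 0, 0,
                  0, 1, 0,
                  0, 0, 1,
                  1, 0, 0),
                ncol = 3, byrow = TRUE,
                dimnames = list(paste0("item", 1:4), c("N", "G", "D")))

# Same matrix, but item1 now additionally requires domain G
Q_complex <- Q_toy
Q_complex["item1", "G"] <- 1

has_simple_structure(Q_toy)      # TRUE
has_simple_structure(Q_complex)  # FALSE
```

With the CDM package installed, the same helper could be applied to data.timss07.G4.Qdomains after loading it with data().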

Source

TIMSS 2007 study, 4th Grade, Austrian sample on booklets 4 and 5
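
Because the sample combines two booklets, some responses may be missing by design: in the first rows of the str() output in the Format section, students with idbook 5 have NA on the M041 items. The sketch below shows one way to tabulate the share of missing responses per item and booklet. The helper missing_by_booklet() and the toy data frame are hypothetical; only the column names follow the dataset.

```r
# Hypothetical helper (not part of CDM): share of missing responses
# per item (columns), split by booklet (rows).
missing_by_booklet <- function(dat, items, booklet = "idbook") {
  miss <- is.na(dat[, items, drop = FALSE]) * 1    # 1 = missing, 0 = observed
  counts <- rowsum(miss, group = dat[[booklet]])   # missing counts per booklet
  counts / as.vector(table(dat[[booklet]]))        # divide by booklet sizes
}

# Toy data with made-up responses; column names mimic data.timss07.G4.lee$data
toy <- data.frame(
  idbook  = c(4, 5, 4, 5),
  M041052 = c(1, NA, 0, NA),
  M031303 = c(1, 1, 0, 1)
)

res <- missing_by_booklet(toy, items = c("M041052", "M031303"))
print(res)  # in this toy data, booklet 5 has a missing share of 1 for M041052
```

With the CDM package installed, the helper could be applied to data.timss07.G4.lee$data, using the item columns whose names start with "M0".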

See also

Yamaguchi and Okada (2018) compare several countries based on these 25 items.

References

Lee, Y. S., Park, Y. S., & Taylan, D. (2011). A cognitive diagnostic modeling of attribute mastery in Massachusetts, Minnesota, and the US national sample using the TIMSS 2007. International Journal of Testing, 11, 144-177.

Park, Y. S., & Lee, Y. S. (2014). An extension of the DINA model using covariates: Examining factors affecting response probability and latent classification. Applied Psychological Measurement, 38(5), 376-390.

Park, Y. S., Xing, K., & Lee, Y. S. (2018). Explanatory cognitive diagnostic models: Incorporating latent and observed predictors. Applied Psychological Measurement, 42(5), 376-392.

Yamaguchi, K., & Okada, K. (2018). Comparison among cognitive diagnostic models for the TIMSS 2007 fourth grade mathematics assessment. PLoS ONE, 13(2), e0188691.

Examples

if (FALSE) {
#############################################################################
# EXAMPLE 1: DINA model Lee et al. (2011) - 15 skills
#############################################################################

data(data.timss07.G4.lee, package="CDM")
dat <- data.timss07.G4.lee$data
q.matrix <- data.timss07.G4.lee$q.matrix
# extract the item columns (all item names start with "M0")
items <- grep( "M0", colnames(dat), value=TRUE )

#*** Model 1: estimate DINA model
mod1 <- CDM::din( dat[,items], q.matrix )
summary(mod1)

#############################################################################
# EXAMPLE 2: DINA models Park and Lee (2014) - 7 skills and 3 domains
#############################################################################

data(data.timss07.G4.lee, package="CDM")
data(data.timss07.G4.py, package="CDM")
data(data.timss07.G4.Qdomains, package="CDM")

dat <- data.timss07.G4.lee$data
q.matrix <- data.timss07.G4.py$q.matrix
items <- rownames(q.matrix)

#*** Model 1: estimate DINA model
mod1 <- CDM::din( dat[,items], q.matrix )
summary(mod1)

#*** Model 2: estimate DINA model with Q-matrix defined by domains
Q <- data.timss07.G4.Qdomains
mod2 <- CDM::din( dat[,items], q.matrix=Q )
summary(mod2)
}