Survey of classical test theory and more recent developments in item analysis and test construction.

 

Fall, 2018
Time, Place: 9:00-11:30 Wednesdays, 347 Davie
David Thissen

 

Paper textbooks:

DeVellis, R.F. (2017). Scale development: Theory and applications (4th ed.). Thousand Oaks, CA: Sage Publications, Inc.
Raykov, T., & Marcoulides, G.A. (2011). Introduction to psychometric theory. New York, NY: Routledge.

From the web:

Revelle, W. (under development). An introduction to psychometric theory with applications in R.
Additional materials for specific dates are listed below.

 

Week 1: Course Introduction, Measurement, Data, and Scaling
August 22

DeVellis Ch 1 Overview

Raykov & Marcoulides Ch 1 Measurement, Measuring Instruments, and Psychometric Theory

 

Week 2: Scores Scales & Regression
August 29

Raykov & Marcoulides Ch 2 Basic Statistical Concepts and Relationships

Kolen, M.J. (2006). Scaling and norming. In R.L. Brennan (Ed.), Educational measurement (4th ed., pp. 155-186). Westport, CT: Praeger.

 

Week 3: Factor Analysis
September 5

DeVellis Ch 6 Factor Analysis

Raykov & Marcoulides Ch 3 An Introduction to Factor Analysis

 

Week 4: Confirmatory Factor Analysis
September 12

Raykov & Marcoulides Ch 4 Introduction to Latent Variable Modeling and Confirmatory Factor Analysis

 

No class
September 19

NC TA meeting

 

Week 5: Classical Test Theory; Reliability
September 26

DeVellis Ch 2 Understanding the Latent Variable
DeVellis Ch 3 Reliability

Raykov & Marcoulides Ch 5 Classical Test Theory (pp. 115-136)
Raykov & Marcoulides Ch 6 Reliability
Raykov & Marcoulides Ch 7 Procedures for Estimating Reliability

Haertel, E.H. (2006). Reliability. In R.L. Brennan (Ed.), Educational measurement (4th ed., pp. 65-110). Westport, CT: Praeger.
Feldt, L.S., & Brennan, R.L. (1989). Reliability. In R.L. Linn (Ed.), Educational measurement (3rd ed., pp. 105-146). New York: American Council on Education/Macmillan.

 

No class
October 3

ELPA21 / NVS meetings

 

Week 6: Validity
October 10

DeVellis Ch 4 Validity

Raykov & Marcoulides Ch 8 Validity

Cronbach, L.J. (1988). Five perspectives on validity argument. In H. Wainer & H.I. Braun (Eds.), Test validity (pp. 3-17). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Angoff, W.H. (1988). Validity: An evolving concept. In H. Wainer & H.I. Braun (Eds.), Test validity (pp. 19-32). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Messick, S. (1988). The once and future issues of validity: Assessing the meaning and consequences of measurement. In H. Wainer & H.I. Braun (Eds.), Test validity (pp. 33-45). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Messick, S. (1989). Validity. In R.L. Linn (Ed.), Educational measurement (3rd ed., pp. 13-103). New York: American Council on Education/Macmillan.
Kane, M.T. (2006). Validation. In R.L. Brennan (Ed.), Educational measurement (4th ed., pp. 18-64). Westport, CT: Praeger.

 

Week 7: Item Response Theory I: Dichotomous and Polytomous Models
October 17

DeVellis Ch 7 Overview of IRT

Raykov & Marcoulides Ch 10 Introduction to Item Response Theory
Raykov & Marcoulides Ch 11 Fundamentals and Models of Item Response Theory

Thissen, D., & Steinberg, L. (2009). Item response theory. In R. Millsap & A. Maydeu-Olivares (Eds.), The Sage handbook of quantitative methods in psychology (pp. 148-177). London: Sage Publications.
Steinberg, L., & Thissen, D. (2013). Item response theory. In J. Comer & P. Kendall (Eds.), The Oxford handbook of research strategies for clinical psychology (pp. 336-373). New York, NY: Oxford University Press.

 

Week 8: Item Response Theory II: Multidimensional Models and DIF
October 24

Liu, Y., Magnus, B., Quinn, H., & Thissen, D. (2018). Multidimensional item response theory. In P. Irwing, T. Booth, & D. Hughes (Eds.), The Wiley handbook of psychometric testing: A multidisciplinary reference on survey, scale and test development, Volume II (pp. 445-493). London: John Wiley & Sons.
Edwards, M.C., & Edelen, M.O. (2009). Special topics in item response theory. In R. Millsap & A. Maydeu-Olivares (Eds.), The Sage handbook of quantitative methods in psychology. London: Sage Publications.
Steinberg, L., & Thissen, D. (2006). Using effect sizes for research reporting: Examples using item response theory to analyze differential item functioning. Psychological Methods, 11, 402-415.

 

Week 9: Test Construction, Reliability Redux
October 31

DeVellis Ch 5 Guidelines in Scale Development

Reeve, B.B., Hays, R.D., Bjorner, J.B., Cook, K.F., Crane, P.K., Teresi, J.A., Thissen, D., Revicki, D.A., Weiss, D.J., Hambleton, R.K., Liu, H., Gershon, R., Reise, S.P., & Cella, D. (2007). Psychometric evaluation and calibration of health-related quality of life item banks: Plans for the Patient-Reported Outcomes Measurement Information System (PROMIS). Medical Care, 45, S22-31.
Irwin, D.E., Stucky, B.D., Thissen, D., DeWitt, E.M., Lai, J.S., Yeatts, K., Varni, J.W., & DeWalt, D.A. (2010). Sampling plan and patient characteristics of the PROMIS pediatrics large-scale survey. Quality of Life Research, 19, 585-594.
Irwin, D.E., Stucky, B.D., Langer, M.M., Thissen, D., DeWitt, E.M., Lai, J.S., Varni, J.W., Yeatts, K., & DeWalt, D.A. (2010). An item response analysis of the pediatric PROMIS anxiety and depressive symptoms scales. Quality of Life Research, 19, 595-607.

 

Week 10: Special topics: Computerized Adaptive Testing
November 7

Thissen, D., Reeve, B.B., Bjorner, J.B., & Chang, C.-H. (2007). Methodological issues for building item banks and computerized adaptive scales. Quality of Life Research, 16, 109-116.
Bjorner, J.B., Chang, C.-H., Thissen, D., & Reeve, B.B. (2007). Developing tailored instruments: Item banking and computerized adaptive assessment. Quality of Life Research, 16, 95-108.
Edwards, M.C., Flora, D.B., & Thissen, D. (2012). Multi-stage computerized adaptive testing with uniform item exposure. Applied Measurement in Education, 25, 118-141.

 

Week 11: Special topics: Linking
November 14

Williams, V.S.L., Pommerich, M., & Thissen, D. (1998). A comparison of developmental scales based on Thurstone methods and item response theory. Journal of Educational Measurement, 35, 93-107.
Thissen, D. (2012). Validity issues involved in cross-grade statements about NAEP results. Washington, DC: American Institutes for Research, NAEP Validity Studies Panel.
Williams, V.S.L., Rosa, K.R., McLeod, L.D., Thissen, D., & Sanford, E. (1998). Projecting to the NAEP scale: Results from the North Carolina End-of-Grade testing program. Journal of Educational Measurement, 35, 277-296.
Thissen, D. (2007). Linking assessments based on aggregate reporting: Background and issues. In N.J. Dorans, M. Pommerich, & P.W. Holland (Eds.), Linking and aligning scores and scales (pp. 287-312). New York, NY: Springer.
Linn, R.L., McLaughlin, D., & Thissen, D. (2009). Utility and validity of NAEP linking efforts. Washington, DC: American Institutes for Research, NAEP Validity Studies Panel.

 

No class
November 21

Thanksgiving break

 

Week 12: Class presentations
November 28

 

Week 13: Class presentations
December 5

 

Required activities:
  1. The first is showing up. For better or for worse, the class meeting is part of the class. Even if it seems boring at the time, it may inspire some thought later. And there may be interaction.
  2. Readings/Questions. Do whatever is best for you; you are all different. Yes, there is (potentially) a great deal of reading. But there will be no tests (it is a class on testing, not of testing). Read smart. Read quickly, to comprehend as much as you need to for your purposes. Read to remember the ideas and where to find the details, not to memorize details. This is how professional life is. However, you must do enough of the readings each week to formulate a question inspired by them. This question may be one of clarification, or one that points out some lack of clarity or completeness in the readings. (You may formulate more than one question, but you are only required to submit one each week.)
    1. Your question(s) must be emailed to me (dthissen@email.unc.edu) by noon ET on the Tuesday before each Wednesday class.
    2. I may respond to (some of) the questions in class, but …
    3. Your questions will be rated for degree of insight. There will be points: 5, 4, 3, 2, and 1 for the highest-ranked question, the second highest, and so on down to the fifth. Other questions will receive no points. Questions that are answered elsewhere in the readings will receive no points. This points business is to motivate those of you who are inspired by competition.
    4. To facilitate announcement of the points without the unintended consequence of embarrassment, each student will select a “call sign” (for fans of the old movie “Top Gun”) or “character name / username” (for online gamers); points will be listed with your usernames. While you may choose to reveal your username identities, it would probably be best if you do not. You specify your username in the email missive with your first question(s).
  3. A paper (about 10-20 pages, double-spaced) on any topic related to the course material. You are strongly encouraged to plan your paper such that it is relevant to your own research activities. You need to have your final paper topic approved. Papers are due on the last day of class. Your paper may be your answer to one of your questions in part (2), above, or, indeed, someone else’s question!
  4. An in-class presentation (probably around 15-20 minutes total, including questions, so short) summarizing your paper: Your question and its answer. This is modeled after a conference presentation in length. Short is difficult to do. It usually takes a month to create a short presentation, so plan ahead.