



Volume 18, Issue 3, 2011, Pages 319-341

A validity-based approach to quality control and assurance of automated scoring

Author keywords

Automated; Marking; Scoring


EID: 79961058822     PISSN: 0969594X     EISSN: 1465329X     Source Type: Journal    
DOI: 10.1080/0969594X.2011.555329     Document Type: Article
Times cited: 38

References (94)
  • 2
    • 0040416315
    • American Institute of Architects, A. Pressman, and Smith Maran Architects, 11th ed. Hoboken, NJ: John Wiley & Sons
    • American Institute of Architects, A. Pressman, and Smith Maran Architects. 2007. Architectural graphic standards. 11th ed. Hoboken, NJ: John Wiley & Sons.
    • (2007) Architectural Graphic Standards
  • 3
    • 84993748560
    • Immediate feedback and opportunity to revise answers to open-ended questions
    • Attali, Y., and D. Powers. 2010. Immediate feedback and opportunity to revise answers to open-ended questions. Educational and Psychological Measurement 70, no. 1: 22-35.
    • (2010) Educational and Psychological Measurement , vol.70 , Issue.1 , pp. 22-35
    • Attali, Y.1    Powers, D.2
  • 4
    • 0005572097
    • Automation of test scoring, reporting and analysis
    • 2nd ed., ed. R.L. Thorndike, Washington, DC: American Council on Education
    • Baker, F.B. 1971. Automation of test scoring, reporting and analysis. In Educational measurement, 2nd ed., ed. R.L. Thorndike, 202-236. Washington, DC: American Council on Education.
    • (1971) Educational Measurement , pp. 202-236
    • Baker, F.B.1
  • 5
    • 0001330641
    • A methodology for scoring open-ended architectural design problems
    • Bejar, I.I. 1991. A methodology for scoring open-ended architectural design problems. Journal of Applied Psychology 76, no. 4: 522-532.
    • (1991) Journal of Applied Psychology , vol.76 , Issue.4 , pp. 522-532
    • Bejar, I.I.1
  • 6
    • 85142558141
    • Generative testing: From conception to implementation
    • ed. S.H. Irvine and P.C. Kyllonen, Mahwah, NJ: Erlbaum
    • Bejar, I.I. 2002. Generative testing: From conception to implementation. In Item generation for test development, ed. S.H. Irvine and P.C. Kyllonen, 199-218. Mahwah, NJ: Erlbaum.
    • (2002) Item Generation For Test Development , pp. 199-218
    • Bejar, I.I.1
  • 8
    • 79960492496
    • Updating the duplex design for test-based accountability in the twenty-first century
    • Bejar, I.I., and E.A. Graf. 2010. Updating the duplex design for test-based accountability in the twenty-first century. Measurement: Interdisciplinary Research & Perspective 8, no. 2: 110-129.
    • (2010) Measurement: Interdisciplinary Research & Perspective , vol.8 , Issue.2 , pp. 110-129
    • Bejar, I.I.1    Graf, E.A.2
  • 11
    • 0030555167
    • The accuracy of expert-system diagnoses of mathematical problem solutions
    • Bennett, R.E., and M.M. Sebrechts. 1996. The accuracy of expert-system diagnoses of mathematical problem solutions. Applied Measurement in Education 9, no. 2: 133-150.
    • (1996) Applied Measurement In Education , vol.9 , Issue.2 , pp. 133-150
    • Bennett, R.E.1    Sebrechts, M.M.2
  • 12
    • 0031165297
    • Evaluating an automatically scorable, open-ended response type for measuring mathematical reasoning in computer-adaptive tests
    • Bennett, R.E., M. Steffen, M.K. Singley, M. Morley, and D. Jacquemin. 1997. Evaluating an automatically scorable, open-ended response type for measuring mathematical reasoning in computer-adaptive tests. Journal of Educational Measurement 34, no. 2: 162-176.
    • (1997) Journal of Educational Measurement , vol.34 , Issue.2 , pp. 162-176
    • Bennett, R.E.1    Steffen, M.2    Singley, M.K.3    Morley, M.4    Jacquemin, D.5
  • 15
    • 79961044954
    • Rule-based methods for automated scoring: Applications in a licensing context
    • ed. D.M. Williamson, R.J. Mislevy, and I.I. Bejar, Mahwah, NJ: Lawrence Erlbaum
    • Braun, H., I.I. Bejar, and D.M. Williamson. 2006. Rule-based methods for automated scoring: Applications in a licensing context. In Automated scoring of complex tasks in computer-based testing, ed. D.M. Williamson, R.J. Mislevy, and I.I. Bejar, 83-122. Mahwah, NJ: Lawrence Erlbaum.
    • (2006) Automated Scoring of Complex Tasks In Computer-based Testing , pp. 83-122
    • Braun, H.1    Bejar, I.I.2    Williamson, D.M.3
  • 19
    • 0032663837
    • Fairness issues in a computer-based architectural licensure examination
    • Bridgeman, B., I.I. Bejar, and D. Friedman. 1999. Fairness issues in a computer-based architectural licensure examination. Computers in Human Behavior 15, nos. 3/4: 419-440.
    • (1999) Computers In Human Behavior , vol.15 , Issue.3-4 , pp. 419-440
    • Bridgeman, B.1    Bejar, I.I.2    Friedman, D.3
  • 22
  • 23
    • 77953132417
    • A comparison of human and computer marking of short free-text student responses
    • Butcher, P.G., and S.E. Jordan. 2010. A comparison of human and computer marking of short free-text student responses. Computers & Education 55, no. 2: 489-499.
    • (2010) Computers & Education , vol.55 , Issue.2 , pp. 489-499
    • Butcher, P.G.1    Jordan, S.E.2
  • 25
    • 0037271619
    • Measuring accessibility for people with a disability
    • Church, R.L., and J.R. Marston. 2003. Measuring accessibility for people with a disability. Geographical Analysis 35, no. 1: 83-96.
    • (2003) Geographical Analysis , vol.35 , Issue.1 , pp. 83-96
    • Church, R.L.1    Marston, J.R.2
  • 27
    • 0036960581
    • Validity issues for performance-based tests scored with computer-automated scoring systems
    • Clauser, B.E., M.T. Kane, and D.B. Swanson. 2002. Validity issues for performance-based tests scored with computer-automated scoring systems. Applied Measurement in Education 15, no. 4: 413-432.
    • (2002) Applied Measurement In Education , vol.15 , Issue.4 , pp. 413-432
    • Clauser, B.E.1    Kane, M.T.2    Swanson, D.B.3
  • 29
    • 0002512639
    • Essay examinations
    • 2nd ed., ed. R.L. Thorndike, Washington, DC: American Council on Education
    • Coffman, W.E. 1971. Essay examinations. In Educational measurement, 2nd ed., ed. R.L. Thorndike, 271-302. Washington, DC: American Council on Education.
    • (1971) Educational Measurement , pp. 271-302
    • Coffman, W.E.1
  • 30
    • 58149438731
    • Construct validity in psychological tests
    • Cronbach, L.J., and P.E. Meehl. 1955. Construct validity in psychological tests. Psychological Bulletin 52, no. 4: 281-302.
    • (1955) Psychological Bulletin , vol.52 , Issue.4 , pp. 281-302
    • Cronbach, L.J.1    Meehl, P.E.2
  • 31
    • 33745872970
    • Studying aesthetics in photographic images using a computational approach
    • Proceedings of the European Conference on Computer Vision, Part III, ed. A. Leonardis, B. Horst, and A. Pinz, Graz, Austria: Springer-Verlag
    • Datta, R., D.J. Joshi, J. Li, and J.W. Wang. 2006. Studying aesthetics in photographic images using a computational approach. In Lecture Notes in Computer Science, vol. 3953, Proceedings of the European Conference on Computer Vision, Part III, ed. A. Leonardis, B. Horst, and A. Pinz, 288-301. Graz, Austria: Springer-Verlag.
    • (2006) Lecture Notes In Computer Science , vol.3953 , pp. 288-301
    • Datta, R.1    Joshi, D.J.2    Li, J.3    Wang, J.W.4
  • 32
    • 73149115409
    • Strategies for evidence identification through linguistic assessment of textual responses
    • ed. D.M. Williamson, R.J. Mislevy, and I.I. Bejar, Mahwah, NJ: Lawrence Erlbaum Associates
    • Deane, P. 2006. Strategies for evidence identification through linguistic assessment of textual responses. In Automated scoring of complex tasks in computer-based testing, ed. D.M. Williamson, R.J. Mislevy, and I.I. Bejar, 313-371. Mahwah, NJ: Lawrence Erlbaum Associates.
    • (2006) Automated Scoring of Complex Tasks In Computer-based Testing , pp. 313-371
    • Deane, P.1
  • 34
    • 58149367997
    • Construct validity: Construct representation versus nomothetic span
    • Embretson, S.E. 1983. Construct validity: Construct representation versus nomothetic span. Psychological Bulletin 93, no. 1: 179-197.
    • (1983) Psychological Bulletin , vol.93 , Issue.1 , pp. 179-197
    • Embretson, S.E.1
  • 35
    • 67650581742
    • An overview of spoken language technology for education
    • Eskenazi, M. 2009. An overview of spoken language technology for education. Speech Communication 51, no. 10: 832-844.
    • (2009) Speech Communication , vol.51 , Issue.10 , pp. 832-844
    • Eskenazi, M.1
  • 37
    • 79961053839
    • Paper presented at the National Council of Measurement in Education, April, in New Orleans, LA
    • Fife, J.H. 2011. Integrating item generation and automated scoring. Paper presented at the National Council of Measurement in Education, April, in New Orleans, LA.
    • (2011) Integrating Item Generation and Automated Scoring
    • Fife, J.H.1
  • 38
    • 0034140838
    • Combination of machine scores for automatic grading of pronunciation quality
    • Franco, H., L. Neumeyer, V. Digalakis, and O. Ronen. 2000. Combination of machine scores for automatic grading of pronunciation quality. Speech Communication 30, nos. 2/3: 121-130.
    • (2000) Speech Communication , vol.30 , Issue.2-3 , pp. 121-130
    • Franco, H.1    Neumeyer, L.2    Digalakis, V.3    Ronen, O.4
  • 39
    • 5044244160
    • Interface design in computer-based language testing
    • Fulcher, G. 2003. Interface design in computer-based language testing. Language Testing 20, no. 4: 384-408.
    • (2003) Language Testing , vol.20 , Issue.4 , pp. 384-408
    • Fulcher, G.1
  • 40
    • 79961078175
    • Paper presented at the International Association of Educational Assessment, September, in Singapore
    • Haggie, D. 2008. The strategic use of marking technologies to support innovation and diversity in assessment. Paper presented at the International Association of Educational Assessment, September, in Singapore. http://www.iaea2008.cambridgeassessment.org.uk/ca/digitalAssets/180436_Haggie.pdf.
    • (2008) The Strategic Use of Marking Technologies to Support Innovation and Diversity In Assessment
    • Haggie, D.1
  • 41
    • 36148976476
    • Setting performance standards
    • 4th ed., ed. R.L. Brennan, Westport, CT: Praeger
    • Hambleton, R.K., and M. Pitoniak. 2006. Setting performance standards. In Educational measurement, 4th ed., ed. R.L. Brennan, 433-470. Westport, CT: Praeger.
    • (2006) Educational Measurement , pp. 433-470
    • Hambleton, R.K.1    Pitoniak, M.2
  • 44
    • 0002799653
    • On Bayesian analysis of multirater ordinal data: An application to automated essay grading
    • Johnson, V.E. 1996. On Bayesian analysis of multirater ordinal data: An application to automated essay grading. Journal of the American Statistical Association 91, no. 433: 42-51.
    • (1996) Journal of the American Statistical Association , vol.91 , Issue.433 , pp. 42-51
    • Johnson, V.E.1
  • 47
    • 33846423101
    • Validation
    • 4th ed., ed. R.L. Brennan, Westport, CT: Praeger
    • Kane, M.T. 2006. Validation. In Educational measurement, 4th ed., ed. R.L. Brennan, 17-64. Westport, CT: Praeger.
    • (2006) Educational Measurement , pp. 17-64
    • Kane, M.T.1
  • 49
    • 0003093853
    • New testing methodologies for the Architect Registration Examination
    • Kenney, J.F. 1997. New testing methodologies for the Architect Registration Examination. CLEAR Exam Review 8, no. 2: 23-28.
    • (1997) CLEAR Exam Review , vol.8 , Issue.2 , pp. 23-28
    • Kenney, J.F.1
  • 50
    • 76349113647
    • Performance assessment
    • 4th ed., ed. R.L. Brennan, Westport, CT: Praeger
    • Lane, E.S., and C.A. Stone. 2006. Performance assessment. In Educational measurement, 4th ed., ed. R.L. Brennan, 387-431. Westport, CT: Praeger.
    • (2006) Educational Measurement , pp. 387-431
    • Lane, E.S.1    Stone, C.A.2
  • 51
    • 33646866698
    • C-rater: Scoring of short-answer questions
    • Leacock, C., and M. Chodorow. 2003. C-rater: Scoring of short-answer questions. Computers and the Humanities 37, no. 4: 389-405.
    • (2003) Computers and The Humanities , vol.37 , Issue.4 , pp. 389-405
    • Leacock, C.1    Chodorow, M.2
  • 52
    • 70349257140
    • A regression-based procedure for automated scoring of a complex medical performance assessment
    • ed. D. Williamson, R.J. Mislevy, and I.I. Bejar, Mahwah, NJ: Lawrence Erlbaum Associates
    • Margolis, M.J., and B.E. Clauser. 2006. A regression-based procedure for automated scoring of a complex medical performance assessment. In Automated scoring of complex tasks in computer-based testing, ed. D. Williamson, R.J. Mislevy, and I.I. Bejar, 123-167. Mahwah, NJ: Lawrence Erlbaum Associates.
    • (2006) Automated Scoring of Complex Tasks In Computer-based Testing , pp. 123-167
    • Margolis, M.J.1    Clauser, B.E.2
  • 54
    • 0002353120
    • Validity
    • 3rd ed., ed. R.L. Linn, New York: American Council on Education
    • Messick, S. 1989. Validity. In Educational measurement, 3rd ed., ed. R.L. Linn, 13-103. New York: American Council on Education.
    • (1989) Educational Measurement , pp. 13-103
    • Messick, S.1
  • 55
    • 33846454283
    • Cognitive psychology and educational assessment
    • 4th ed., ed. R.L. Brennan, Westport, CT: Praeger
    • Mislevy, R.J. 2006. Cognitive psychology and educational assessment. In Educational measurement, 4th ed., ed. R.L. Brennan, 257-306. Westport, CT: Praeger.
    • (2006) Educational Measurement , pp. 257-306
    • Mislevy, R.J.1
  • 56
  • 57
    • 40749132977
    • Concepts, terminology, and basic models of evidence-centered design
    • ed. D. Williamson, R.J. Mislevy, and I.I. Bejar, Mahwah, NJ: Lawrence Erlbaum
    • Mislevy, R.J., L. Steinberg, R.G. Almond, and J.F. Lucas. 2006. Concepts, terminology, and basic models of evidence-centered design. In Automated scoring of complex tasks in computer-based testing, ed. D. Williamson, R.J. Mislevy, and I.I. Bejar, 49-82. Mahwah, NJ: Lawrence Erlbaum.
    • (2006) Automated Scoring of Complex Tasks In Computer-based Testing , pp. 49-82
    • Mislevy, R.J.1    Steinberg, L.2    Almond, R.G.3    Lucas, J.F.4
  • 58
    • 85142569132
    • On the roles of task model variables in assessment design
    • ed. S.H. Irvine and P.C. Kyllonen, Mahwah, NJ: Lawrence Erlbaum Associates
    • Mislevy, R.J., L.S. Steinberg, and R.G. Almond. 2002. On the roles of task model variables in assessment design. In Item generation for test development, ed. S.H. Irvine and P.C. Kyllonen, 97-128. Mahwah, NJ: Lawrence Erlbaum Associates.
    • (2002) Item Generation For Test Development , pp. 97-128
    • Mislevy, R.J.1    Steinberg, L.S.2    Almond, R.G.3
  • 61
    • 78650765024
    • Paper presented at the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2009), March, in Athens, Greece
    • Mohler, M., and R. Mihalcea. 2009. Text-to-text semantic similarity for automatic short answer grading. Paper presented at the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2009), March, in Athens, Greece.
    • (2009) Text-to-text Semantic Similarity For Automatic Short Answer Grading
    • Mohler, M.1    Mihalcea, R.2
  • 62
    • 77954299851
    • Princeton, NJ: Educational Testing Service, accessed February 13, 2009
    • Monaghan, W., and B. Bridgeman. 2005. E-rater as a quality control of human scores. Princeton, NJ: Educational Testing Service. http://www.ets.org/Media/Research/pdf/RD_Connections2.pdf (accessed February 13, 2009).
    • (2005) E-rater As a Quality Control of Human Scores
    • Monaghan, W.1    Bridgeman, B.2
  • 63
    • 79961059663
    • National Board of Medical Examiners, Philadelphia, PA: NBME
    • National Board of Medical Examiners. 2009. Web-based testing on wireless networks. Philadelphia, PA: NBME. http://www.nbme.org/PDF/CAS/WirelessCAS.pdf.
    • (2009) Web-based Testing On Wireless Networks
  • 64
    • 0009272924
    • An approach to automated scoring of architectural designs
    • ed. U. Flemming and S. van Wyk, Pittsburgh, PA: North-Holland
    • Oltman, P.K., I.I. Bejar, and S.H. Kim. 1993. An approach to automated scoring of architectural designs. In CAAD Futures 93, ed. U. Flemming and S. van Wyk, 215-224. Pittsburgh, PA: North-Holland.
    • (1993) CAAD Futures , vol.93 , pp. 215-224
    • Oltman, P.K.1    Bejar, I.I.2    Kim, S.H.3
  • 65
    • 0001596906
    • The imminence of grading essays by computer
    • Page, E.B. 1966. The imminence of grading essays by computer. Phi Delta Kappan 47: 238-243.
    • (1966) Phi Delta Kappan , vol.47 , pp. 238-243
    • Page, E.B.1
  • 68
    • 79961042079
    • Paper presented at the second workshop on Building Educational Applications Using NLP, June, in Ann Arbor, MI
    • Pulman, S.G., and J.Z. Sukkarieh. 2005. Automatic short answer marking. Paper presented at the second workshop on Building Educational Applications Using NLP, June, in Ann Arbor, MI. http://www.clg.ox.ac.uk/pulman/pdfpapers/acl05.pdf.
    • (2005) Automatic Short Answer Marking
    • Pulman, S.G.1    Sukkarieh, J.Z.2
  • 73
    • 79961062908
    • Research Report No. RR-96-18. Princeton, NJ: Educational Testing Service
    • Reid-Green, K.S. 1996b. Insolation and shadow. Research Report No. RR-96-18. Princeton, NJ: Educational Testing Service.
    • (1996) Insolation and Shadow
    • Reid-Green, K.S.1
  • 77
    • 33745443899
    • Analysis and comparison of automated scoring approaches: Addressing evidence-based assessment principles
    • ed. D. Williamson, R.J. Mislevy, and I.I. Bejar, Mahwah, NJ: Lawrence Erlbaum
    • Scalise, K., and M. Wilson. 2006. Analysis and comparison of automated scoring approaches: Addressing evidence-based assessment principles. In Automated scoring of complex tasks in computer-based testing, ed. D. Williamson, R.J. Mislevy, and I.I. Bejar, 373-401. Mahwah, NJ: Lawrence Erlbaum.
    • (2006) Automated Scoring of Complex Tasks In Computer-based Testing , pp. 373-401
    • Scalise, K.1    Wilson, M.2
  • 80
    • 70350518126
    • Paper presented at the 22nd International Conference for the Florida Artificial Intelligence Research Society, FLAIRS 2009, May, in Florida, USA
    • Sukkarieh, J.Z., and J. Blackmore. 2009. C-rater: Automatic content scoring of short constructed responses. Paper presented at the 22nd International Conference for the Florida Artificial Intelligence Research Society, FLAIRS 2009, May, in Florida, USA.
    • (2009) C-rater: Automatic Content Scoring of Short Constructed Responses
    • Sukkarieh, J.Z.1    Blackmore, J.2
  • 81
  • 83
    • 58449115371
    • Using a wiki to evaluate individual contribution to a collaborative learning project
    • Trentin, G. 2009. Using a wiki to evaluate individual contribution to a collaborative learning project. Journal of Computer Assisted Learning 25, no. 1: 43-55.
    • (2009) Journal of Computer Assisted Learning , vol.25 , Issue.1 , pp. 43-55
    • Trentin, G.1
  • 85
    • 1442275185
    • Learning when training data are costly: The effect of class distribution on tree induction
    • Weiss, G.M., and F. Provost. 2003. Learning when training data are costly: The effect of class distribution on tree induction. Journal of Artificial Intelligence Research 19: 315-354.
    • (2003) Journal of Artificial Intelligence Research , vol.19 , pp. 315-354
    • Weiss, G.M.1    Provost, F.2
  • 86
    • 37649014994
    • An application of Bayesian networks in automated scoring of computerized simulation tasks
    • ed. D. Williamson, R.J. Mislevy, and I.I. Bejar, Mahwah, NJ: Lawrence Erlbaum
    • Williamson, D.M., R.G. Almond, R.J. Mislevy, and R. Levy. 2006. An application of Bayesian networks in automated scoring of computerized simulation tasks. In Automated scoring of complex tasks in computer-based testing, ed. D. Williamson, R.J. Mislevy, and I.I. Bejar, 201-258. Mahwah, NJ: Lawrence Erlbaum.
    • (2006) Automated Scoring of Complex Tasks In Computer-based Testing , pp. 201-258
    • Williamson, D.M.1    Almond, R.G.2    Mislevy, R.J.3    Levy, R.4
  • 87
    • 8744280233
    • Automated tools for subject matter expert evaluation of automated scoring
    • Williamson, D.M., I.I. Bejar, and A. Sax. 2004. Automated tools for subject matter expert evaluation of automated scoring. Applied Measurement in Education 17, no. 4: 323-357.
    • (2004) Applied Measurement In Education , vol.17 , Issue.4 , pp. 323-357
    • Williamson, D.M.1    Bejar, I.I.2    Sax, A.3
  • 94
    • 67650710841
    • Automatic scoring of nonnative spontaneous speech in tests of spoken English
    • Zechner, K., D. Higgins, X. Xi, and D. Williamson. 2009. Automatic scoring of nonnative spontaneous speech in tests of spoken English. Speech Communication 51, no. 10: 883-895.
    • (2009) Speech Communication , vol.51 , Issue.10 , pp. 883-895
    • Zechner, K.1    Higgins, D.2    Xi, X.3    Williamson, D.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.