Volume , Issue , 2003, Pages 21-36

Issues in the Reliability and Validity of Automated Scoring of Constructed Responses

EID: 85142566672     PISSN: None     EISSN: None     Source Type: Book    
DOI: 10.4324/9781410606860-10     Document Type: Chapter
Times cited : (27)

References (57)
  • 1
    • 32844464722
    • (CSE Tech. Rep. No. 543). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST)
    • Almond, R., Steinberg, L., and Mislevy, R. (2001). A sample assessment using the four process framework (CSE Tech. Rep. No. 543). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST).
    • (2001) A sample assessment using the four process framework
    • Almond, R.1    Steinberg, L.2    Mislevy, R.3
  • 2
    • 0003600480
    • Washington, DC: American Educational Research Association
    • American Educational Research Association, American Psychological Association, and National Council for Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
    • (1999) Standards for educational and psychological testing
  • 3
    • 21344488404
    • Learning based assessments of history understanding
    • Baker, E.L. (1995). Learning based assessments of history understanding. Educational Psychologist, 29, 97-106.
    • (1995) Educational Psychologist , vol.29 , pp. 97-106
    • Baker, E.L.1
  • 4
    • 0002491078
    • Cognitive assessment of history for large-scale testing
    • M.C.Wittrock and E.L.Baker (Eds.), Englewood Cliffs, NJ: Prentice-Hall
    • Baker, E.L., Freeman, M., and Clayton, S. (1991). Cognitive assessment of history for large-scale testing. In M.C.Wittrock and E.L.Baker (Eds.), Testing and cognition (pp. 131-153). Englewood Cliffs, NJ: Prentice-Hall.
    • (1991) Testing and cognition , pp. 131-153
    • Baker, E.L.1    Freeman, M.2    Clayton, S.3
  • 6
    • 0013292154
    • Dimensionality and generalizability of domain-independent performance assessments
    • Baker, E.L., Linn, R.L., Abedi, J., and Niemi, D. (1995). Dimensionality and generalizability of domain-independent performance assessments. Journal of Educational Research, 89, 197-205.
    • (1995) Journal of Educational Research , vol.89 , pp. 197-205
    • Baker, E.L.1    Linn, R.L.2    Abedi, J.3    Niemi, D.4
  • 7
    • 0032647514
    • Computer-based assessment of problem solving
    • Baker, E.L., and Mayer, R.E. (1999). Computer-based assessment of problem solving. Computers in Human Behavior, 15, 269-282.
    • (1999) Computers in Human Behavior , vol.15 , pp. 269-282
    • Baker, E.L.1    Mayer, R.E.2
  • 9
    • 0000029547
    • Policy and validity prospects for performance-based assessment
    • Baker, E.L., O’Neil, H.F., Jr., and Linn, R.L. (1993). Policy and validity prospects for performance-based assessment. American Psychologist, 48, 1210-1218.
    • (1993) American Psychologist , vol.48 , pp. 1210-1218
    • Baker, E.L.1    O’Neil, H.F.2    Linn, R.L.3
  • 11
    • 0011589669
    • Toward intelligent assessment: An integration of constructed-response testing, artificial intelligence, and model-based measurement
    • N. Frederiksen, R.J.Mislevy, and I.I.Bejar (Eds.), Hillsdale, NJ: Lawrence Erlbaum Associates, Inc
    • Bennett, R.E. (1993b). Toward intelligent assessment: An integration of constructed-response testing, artificial intelligence, and model-based measurement. In N. Frederiksen, R.J.Mislevy, and I.I.Bejar (Eds.), Test theory for a new generation of tests (pp. 99-123). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
    • (1993) Test theory for a new generation of tests , pp. 99-123
    • Bennett, R.E.1
  • 14
    • 0010437782
    • April, Paper presented at the annual meeting of the National Council on Measurement in Education, Seattle, WA
    • Burstein, J. (2001, April). Automated essay evaluation with natural language processing. Paper presented at the annual meeting of the National Council on Measurement in Education, Seattle, WA.
    • (2001) Automated essay evaluation with natural language processing
    • Burstein, J.1
  • 15
    • 21644443051
    • Automated essay scoring for nonnative English speakers
    • June, Joint symposium of the Association of Computational Linguistics and the International Association of Language Learning Technologies, College Park, MD
    • Burstein, J.C., and Chodorow, M. (1999, June). Automated essay scoring for nonnative English speakers. In Computer-mediated language assessment and evaluation of natural language processing. Joint symposium of the Association of Computational Linguistics and the International Association of Language Learning Technologies, College Park, MD.
    • (1999) Computer-mediated language assessment and evaluation of natural language processing
    • Burstein, J.C.1    Chodorow, M.2
  • 18
    • 38949086342
    • Using lexical semantic techniques to classify free-responses
    • E.Viegas (Ed.), New York: Kluwer
    • Burstein, J., Wolff, S., and Lu, C. (1999). Using lexical semantic techniques to classify free-responses. In E.Viegas (Ed.), Breadth and depth of semantic lexicons (pp. 227-246). New York: Kluwer.
    • (1999) Breadth and depth of semantic lexicons , pp. 227-246
    • Burstein, J.1    Wolff, S.2    Lu, C.3
  • 21
    • 0035510581
    • The impact of a simulation-based learning design project on student learning
    • Chung, G.K.W.K., Harmon, T.C., and Baker, E.L. (2001). The impact of a simulation-based learning design project on student learning. IEEE Transactions on Education, 44, 390-398.
    • (2001) IEEE Transactions on Education , vol.44 , pp. 390-398
    • Chung, G.K.W.K.1    Harmon, T.C.2    Baker, E.L.3
  • 22
    • 0034337121
    • Recurrent issues and recent advances in scoring performance assessments
    • Clauser, B.E. (2000). Recurrent issues and recent advances in scoring performance assessments. Applied Psychological Measurement, 24, 310-324.
    • (2000) Applied Psychological Measurement , vol.24 , pp. 310-324
    • Clauser, B.E.1
  • 23
    • 0031287726
    • Development of automated scoring algorithms for complex performance assessments: A comparison of two approaches
    • Clauser, B.E., Margolis, M.J., Clyman, S.G., and Ross, L.P. (1997). Development of automated scoring algorithms for complex performance assessments: A comparison of two approaches. Journal of Educational Measurement, 34, 141-161.
    • (1997) Journal of Educational Measurement , vol.34 , pp. 141-161
    • Clauser, B.E.1    Margolis, M.J.2    Clyman, S.G.3    Ross, L.P.4
  • 25
    • 0041526020
    • A comparison of the generalizability of scores produced by expert raters and automated scoring systems
    • Clauser, B.E., Swanson, D.B., and Clyman, S.G. (1999). A comparison of the generalizability of scores produced by expert raters and automated scoring systems. Applied Measurement in Education, 12, 281-299.
    • (1999) Applied Measurement in Education , vol.12 , pp. 281-299
    • Clauser, B.E.1    Swanson, D.B.2    Clyman, S.G.3
  • 27
    • 0032343060
    • A cognitive design system approach to generating valid tests: Application to abstract reasoning
    • Embretson, S.E. (1998). A cognitive design system approach to generating valid tests: Application to abstract reasoning. Psychological Methods, 3, 380-396.
    • (1998) Psychological Methods , vol.3 , pp. 380-396
    • Embretson, S.E.1
  • 28
    • 0000580760
    • Construct validation of an approach to modeling cognitive structure of U.S. history knowledge
    • Herl, H.E., Niemi, D., and Baker, E.L. (1996). Construct validation of an approach to modeling cognitive structure of U.S. history knowledge. Journal of Educational Research, 89, 206-218.
    • (1996) Journal of Educational Research , vol.89 , pp. 206-218
    • Herl, H.E.1    Niemi, D.2    Baker, E.L.3
  • 29
    • 0032662017
    • Reliability and validity of a computer-based knowledge mapping system to measure content understanding
    • Herl, H.E., O’Neil, H.F., Jr., Chung, G.K.W.K., and Schacter, J. (1999). Reliability and validity of a computer-based knowledge mapping system to measure content understanding. Computers in Human Behavior, 15, 315-334.
    • (1999) Computers in Human Behavior , vol.15 , pp. 315-334
    • Herl, H.E.1    O’Neil, H.F.2    Chung, G.K.W.K.3    Schacter, J.4
  • 31
    • 1042287607
    • An integrated judgment procedure for setting standards on complex, large-scale assessments
    • G.J.Cizek (Ed.), Mahwah, NJ: Lawrence Erlbaum Associates, Inc
    • Jaeger, R.M., and Craig, N. (2001). An integrated judgment procedure for setting standards on complex, large-scale assessments. In G.J.Cizek (Ed.), Setting performance standards: Concepts, methods, and perspectives (pp. 313-338). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
    • (2001) Setting performance standards: Concepts, methods, and perspectives , pp. 313-338
    • Jaeger, R.M.1    Craig, N.2
  • 34
    • 0010512474
    • (CSE Tech. Rep. No. 544). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST)
    • Klein, D.C.D., Yarnall, L., and Glaubke, C. (2001). Using technology to assess students’ Web expertise (CSE Tech. Rep. No. 544). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST).
    • (2001) Using technology to assess students’ Web expertise
    • Klein, D.C.D.1    Yarnall, L.2    Glaubke, C.3
  • 35
    • 80053431219
    • An introduction to latent semantic analysis
    • Landauer, T.K., Foltz, P.W., and Laham, D. (1998). An introduction to latent semantic analysis. Discourse Processes, 25, 259-284.
    • (1998) Discourse Processes , vol.25 , pp. 259-284
    • Landauer, T.K.1    Foltz, P.W.2    Laham, D.3
  • 37
    • 78349285263
    • Complex, performance-based assessment: Expectations and validation criteria
    • Linn, R.L., Baker, E.L., and Dunbar, S.B. (1991). Complex, performance-based assessment: Expectations and validation criteria. Educational Researcher, 20(8), 15-21.
    • (1991) Educational Researcher , vol.20 , Issue.8 , pp. 15-21
    • Linn, R.L.1    Baker, E.L.2    Dunbar, S.B.3
  • 38
    • 0002353120
    • Validity
    • R.L.Linn (Ed.), 3rd ed., New York: Macmillan
    • Messick, S. (1989). Validity. In R.L.Linn (Ed.), Educational measurement (3rd ed., pp. 13-103). New York: Macmillan.
    • (1989) Educational measurement , pp. 13-103
    • Messick, S.1
  • 39
    • 0032671419
    • A cognitive task analysis with implications for designing simulation-based performance assessment
    • Mislevy, R.J., Steinberg, L.S., Breyer, F.J., Almond, R.G., and Johnson, L. (1999). A cognitive task analysis with implications for designing simulation-based performance assessment. Computers in Human Behavior, 15, 335-374.
    • (1999) Computers in Human Behavior , vol.15 , pp. 335-374
    • Mislevy, R.J.1    Steinberg, L.S.2    Breyer, F.J.3    Almond, R.G.4    Johnson, L.5
  • 40
    • 3042531827
    • (CSE Tech. Rep. No. 538). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST)
    • Mislevy, R., Steinberg, L., Almond, R., Breyer, F.J., and Johnson, L. (2001). Making sense of data from complex assessments (CSE Tech. Rep. No. 538). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST).
    • (2001) Making sense of data from complex assessments
    • Mislevy, R.1    Steinberg, L.2    Almond, R.3    Breyer, F.J.4    Johnson, L.5
  • 41
    • 0033483781
    • Prophesying the reliability of cognitively complex assessments
    • Nichols, P.D., and Kuehl, B.J. (1999). Prophesying the reliability of cognitively complex assessments. Applied Measurement in Education, 12, 73-94.
    • (1999) Applied Measurement in Education , vol.12 , pp. 73-94
    • Nichols, P.D.1    Kuehl, B.J.2
  • 44
    • 85071573282
    • Issues in intelligent computer-assisted instruction: Evaluation and measurement
    • T.B.Gutkin and S.L.Wise (Eds.), Hillsdale, NJ: Lawrence Erlbaum Associates, Inc
    • O’Neil, H.F., Jr., and Baker, E.L. (1991). Issues in intelligent computer-assisted instruction: Evaluation and measurement. In T.B.Gutkin and S.L.Wise (Eds.), The computer and the decision-making process (pp. 199-224). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
    • (1991) The computer and the decision-making process , pp. 199-224
    • O’Neil, H.F.1    Baker, E.L.2
  • 46
    • 0001596906
    • The imminence of grading essays by computer
    • Page, E.B. (1966). The imminence of grading essays by computer. Phi Delta Kappan, 47, 238-243.
    • (1966) Phi Delta Kappan , vol.47 , pp. 238-243
    • Page, E.B.1
  • 47
    • 21344490742
    • New computer grading of student prose, using modern concepts and software
    • Page, E.B. (1994). New computer grading of student prose, using modern concepts and software. Journal of Experimental Education, 62, 127-142.
    • (1994) Journal of Experimental Education , vol.62 , pp. 127-142
    • Page, E.B.1
  • 48
    • 0001378653
    • The computer moves into essay grading: Updating the ancient test
    • Page, E.B., and Petersen, N.S. (1995). The computer moves into essay grading: Updating the ancient test. Phi Delta Kappan, 76, 561-565.
    • (1995) Phi Delta Kappan , vol.76 , pp. 561-565
    • Page, E.B.1    Petersen, N.S.2
  • 50
    • 0007206030
    • (CSE Tech. Rep. No. 509). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST)
    • Rogosa, D. (1999a). Accuracy of individual scores expressed in percentile ranks: Classical test theory calculations. (CSE Tech. Rep. No. 509). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST).
    • (1999) Accuracy of individual scores expressed in percentile ranks: Classical test theory calculations
    • Rogosa, D.1
  • 54
    • 0032647794
    • Computer-based performance assessments: A solution to the narrow measurement and reporting of problem solving
    • Schacter, J., Herl, H.E., Chung, G.K.W.K., Dennis, R.A., and O’Neil, H.F., Jr. (1999). Computer-based performance assessments: A solution to the narrow measurement and reporting of problem solving. Computers in Human Behavior, 15, 403-418.
    • (1999) Computers in Human Behavior , vol.15 , pp. 403-418
    • Schacter, J.1    Herl, H.E.2    Chung, G.K.W.K.3    Dennis, R.A.4    O’Neil, H.F.5


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.