



Volume 25, Issue 2, 2008, Pages 155-185

Rater types in writing performance assessments: A classification approach to rater variability

Author keywords

Classification; Large scale performance assessment; Rater cognition; Rater monitoring; Rater variability



EID: 55249090887     PISSN: 02655322     EISSN: 14770946     Source Type: Journal    
DOI: 10.1177/0265532207086780     Document Type: Article
Times cited : (190)

References (70)
  • 1
    • 84970313125
    • Investigating variability in tasks and rater judgements in a performance test of foreign language speaking
    • Bachman, L.F., Lynch, B.K. & Mason, M. (1995). Investigating variability in tasks and rater judgements in a performance test of foreign language speaking. Language Testing, 12, 238-57.
    • (1995) Language Testing , vol.12 , pp. 238-257
    • Bachman, L.F.1    Lynch, B.K.2    Mason, M.3
  • 3
    • 33646346235
    • The impact of training on rater variability
    • Barrett, S. (2001). The impact of training on rater variability. International Education Journal, 2, 49-58.
    • (2001) International Education Journal , vol.2 , pp. 49-58
    • Barrett, S.1
  • 4
    • 84981610947
    • Do English and ESL faculties rate writing samples differently?
    • Brown, J.D. (1991). Do English and ESL faculties rate writing samples differently? TESOL Quarterly, 25, 587-603.
    • (1991) TESOL Quarterly , vol.25 , pp. 587-603
    • Brown, J.D.1
  • 5
    • 6444229110
    • Recurrence properties in two-mode hierarchical clustering
    • Castillo, W. & Trejos, J. (2000). Recurrence properties in two-mode hierarchical clustering. In R. Decker & W. Gaul, editors, Classification and information processing at the turn of the millennium (pp. 68-73). Berlin: Springer-Verlag.
    • (2000) Classification and Information Processing at the Turn of the Millennium , pp. 68-73
    • Castillo, W.1    Trejos, J.2
  • 6
    • 0034195156
    • The stability of rater severity in large-scale assessment programs
    • Congdon, P.J. & McQueen, J. (2000). The stability of rater severity in large-scale assessment programs. Journal of Educational Measurement, 37, 163-178.
    • (2000) Journal of Educational Measurement , vol.37 , pp. 163-178
    • Congdon, P.J.1    McQueen, J.2
  • 8
    • 84930559584
    • Expertise in evaluating second language compositions
    • Cumming, A. (1990). Expertise in evaluating second language compositions. Language Testing, 7, 31-51.
    • (1990) Language Testing , vol.7 , pp. 31-51
    • Cumming, A.1
  • 10
    • 84937382152
    • Decision making while rating ESL/EFL writing tasks: A descriptive framework
    • Cumming, A., Kantor, R. & Powers, D.E. (2002). Decision making while rating ESL/EFL writing tasks: A descriptive framework. Modern Language Journal, 86, 67-96.
    • (2002) Modern Language Journal , vol.86 , pp. 67-96
    • Cumming, A.1    Kantor, R.2    Powers, D.E.3
  • 11
    • 17244372792
    • A model of rater behavior in essay grading based on signal detection theory
    • DeCarlo, L.T. (2005). A model of rater behavior in essay grading based on signal detection theory. Journal of Educational Measurement, 42, 53-76.
    • (2005) Journal of Educational Measurement , vol.42 , pp. 53-76
    • DeCarlo, L.T.1
  • 12
    • 0043265942
    • Writing assessment: Raters' elaboration of the rating task
    • DeRemer, M.L. (1998). Writing assessment: Raters' elaboration of the rating task. Assessing Writing, 5, 7-29.
    • (1998) Assessing Writing , vol.5 , pp. 7-29
    • DeRemer, M.L.1
  • 14
    • 21744441418
    • Qualitätssicherung beim TestDaF: Konzepte, Methoden, Ergebnisse
    • Eckes, T. (2003). Qualitätssicherung beim TestDaF: Konzepte, Methoden, Ergebnisse [Assuring the quality of the TestDaF: Concepts, methods, results]. Fremdsprachen und Hochschule, 69, 43-68.
    • (2003) Fremdsprachen Und Hochschule , vol.69 , pp. 43-68
    • Eckes, T.1
  • 15
    • 2442481715
    • Beurteilerübereinstimmung und Beurteilerstrenge: Eine Multifacetten-Rasch-Analyse von Leistungsbeurteilungen im "Test Deutsch als Fremdsprache" (TestDaF)
    • Eckes, T. (2004). Beurteilerübereinstimmung und Beurteilerstrenge: Eine Multifacetten-Rasch-Analyse von Leistungsbeurteilungen im "Test Deutsch als Fremdsprache" (TestDaF) [Rater agreement and rater severity: A many-facet Rasch analysis of performance assessments in the "Test Deutsch als Fremdsprache" (TestDaF)]. Diagnostica, 50, 65-77.
    • (2004) Diagnostica , vol.50 , pp. 65-77
    • Eckes, T.1
  • 17
    • 33745756490
    • Examining rater effects in TestDaF writing and speaking performance assessments: A many-facet Rasch analysis
    • Eckes, T. (2005b). Examining rater effects in TestDaF writing and speaking performance assessments: A many-facet Rasch analysis. Language Assessment Quarterly, 2, 197-221.
    • (2005) Language Assessment Quarterly , vol.2 , pp. 197-221
    • Eckes, T.1
  • 18
    • 55249083587
    • Raters' perceptions of scoring criteria in writing and speaking performance assessments
    • Eckes, T. (2006). Raters' perceptions of scoring criteria in writing and speaking performance assessments. Paper presented at the 28th Language Testing Research Colloquium (LTRC), Melbourne, Australia.
    • (2006) 28th Language Testing Research Colloquium (LTRC)
    • Eckes, T.1
  • 19
    • 21744444510
    • Progress and problems in reforming public language examinations in Europe: Cameos from the Baltic States, Greece, Hungary, Poland, Slovenia, France, and Germany
    • Eckes, T., Ellis, M., Kalnberzina, V., Pižorn, K., Springer, C., Szollás, K., & Tsagari, C. (2005). Progress and problems in reforming public language examinations in Europe: Cameos from the Baltic States, Greece, Hungary, Poland, Slovenia, France, and Germany. Language Testing, 22, 355-377.
    • (2005) Language Testing , vol.22 , pp. 355-377
    • Eckes, T.1    Ellis, M.2    Kalnberzina, V.3    Pižorn, K.4    Springer, C.5    Szollás, K.6    Tsagari, C.7
  • 21
    • 21144464635
    • An error variance approach to two-mode hierarchical clustering
    • Eckes, T. & Orlik, P. (1993). An error variance approach to two-mode hierarchical clustering. Journal of Classification, 10, 51-74.
    • (1993) Journal of Classification , vol.10 , pp. 51-74
    • Eckes, T.1    Orlik, P.2
  • 22
    • 0002579272
    • The element of chance in competitive examinations
    • Edgeworth, F.Y. (1890). The element of chance in competitive examinations. Journal of the Royal Statistical Society, 53, 460-475, 644-663.
    • (1890) Journal of the Royal Statistical Society , vol.53 , pp. 460-475, 644-663
    • Edgeworth, F.Y.1
  • 24
    • 84988122960
    • Examining rater errors in the assessment of written composition with a many-faceted Rasch model
    • Engelhard, G., Jr. (1994). Examining rater errors in the assessment of written composition with a many-faceted Rasch model. Journal of Educational Measurement, 31, 93-112.
    • (1994) Journal of Educational Measurement , vol.31 , pp. 93-112
    • Engelhard Jr., G.1
  • 28
    • 0043048394
    • How characteristics of student essays influence teachers' evaluation
    • Freedman, S.W. (1979). How characteristics of student essays influence teachers' evaluation. Journal of Educational Psychology, 71, 328-338.
    • (1979) Journal of Educational Psychology , vol.71 , pp. 328-338
    • Freedman, S.W.1
  • 31
    • 85056009258
    • A structural theory for intergroup beliefs and action
    • Guttman, L. (1959). A structural theory for intergroup beliefs and action. American Sociological Review, 24, 318-328.
    • (1959) American Sociological Review , vol.24 , pp. 318-328
    • Guttman, L.1
  • 32
    • 33749074965
    • Writing teachers as assessors of writing
    • Hamp-Lyons, L. (2003). Writing teachers as assessors of writing. In B. Kroll, editor, Exploring the dynamics of second language writing (pp. 162-189). Cambridge: Cambridge University Press.
    • (2003) Exploring the Dynamics of Second Language Writing , pp. 162-189
    • Hamp-Lyons, L.1
  • 33
    • 84982740847
    • Native and nonnative speakers' pragmatic interpretations of English texts
    • Hinkel, E. (1994). Native and nonnative speakers' pragmatic interpretations of English texts. TESOL Quarterly, 28, 353-376.
    • (1994) TESOL Quarterly , vol.28 , pp. 353-376
    • Hinkel, E.1
  • 34
    • 0041029763
    • Magnitude and moderators of bias in observer ratings: A meta-analysis
    • Hoyt, W.T. & Kerns, M.-D. (1999). Magnitude and moderators of bias in observer ratings: A meta-analysis. Psychological Methods, 4, 403-424.
    • (1999) Psychological Methods , vol.4 , pp. 403-424
    • Hoyt, W.T.1    Kerns, M.-D.2
  • 35
    • 0041855715
    • The influence of holistic scoring procedures on reading and rating student essays
    • Huot, B.A. (1993). The influence of holistic scoring procedures on reading and rating student essays. In M. M. Williamson & B. A. Huot, editors, Validating holistic scoring for writing assessment: Theoretical and empirical foundations (pp. 206-236). Cresskill, NJ: Hampton Press.
    • (1993) Validating Holistic Scoring for Writing Assessment: Theoretical and Empirical Foundations , pp. 206-236
    • Huot, B.A.1
  • 36
    • 85047683691
    • Assessor training strategies and their effects on accuracy, interrater reliability, and discriminant validity
    • Lievens, F. (2001). Assessor training strategies and their effects on accuracy, interrater reliability, and discriminant validity. Journal of Applied Psychology, 86, 255-264.
    • (2001) Journal of Applied Psychology , vol.86 , pp. 255-264
    • Lievens, F.1
  • 37
    • 0442315468
    • What do infit and outfit, mean-square and standardized mean?
    • Linacre, J.M. (2002). What do infit and outfit, mean-square and standardized mean? Rasch Measurement Transactions, 16(2), 878.
    • (2002) Rasch Measurement Transactions , vol.16 , Issue.2 , pp. 878
    • Linacre, J.M.1
  • 38
    • 33746925494
    • Optimizing rating scale category effectiveness
    • Linacre, J.M. (2004). Optimizing rating scale category effectiveness. In E. V. Smith & R. M. Smith, editors, Introduction to Rasch measurement (pp. 258-278). Maple Grove, MN: JAM Press.
    • (2004) Introduction to Rasch Measurement , pp. 258-278
    • Linacre, J.M.1
  • 40
    • 84990328733
    • Assessment criteria in a large-scale writing test: What do they really mean to the raters?
    • Lumley, T. (2002). Assessment criteria in a large-scale writing test: What do they really mean to the raters? Language Testing, 19, 246-276.
    • (2002) Language Testing , vol.19 , pp. 246-276
    • Lumley, T.1
  • 43
    • 84965511141
    • Rater characteristics and rater bias: Implications for training
    • Lumley, T. & McNamara, T.F. (1995). Rater characteristics and rater bias: Implications for training. Language Testing, 12, 54-71.
    • (1995) Language Testing , vol.12 , pp. 54-71
    • Lumley, T.1    McNamara, T.F.2
  • 44
    • 84930560950
    • Item response theory and the validation of an ESP test for health professionals
    • McNamara, T.F. (1990). Item response theory and the validation of an ESP test for health professionals. Language Testing, 7, 52-75.
    • (1990) Language Testing , vol.7 , pp. 52-75
    • McNamara, T.F.1
  • 47
    • 21844484250
    • Additive two-mode clustering: The error-variance approach revisited
    • Mirkin, B., Arabie, P. & Hubert, L.J. (1995). Additive two-mode clustering: The error-variance approach revisited. Journal of Classification, 12, 243-263.
    • (1995) Journal of Classification , vol.12 , pp. 243-263
    • Mirkin, B.1    Arabie, P.2    Hubert, L.J.3
  • 49
    • 0346335427
    • Detecting and measuring rater effects using many-facet Rasch measurement: Part I
    • Myford, C.M. & Wolfe, E.W. (2003). Detecting and measuring rater effects using many-facet Rasch measurement: Part I. Journal of Applied Measurement, 4, 386-422.
    • (2003) Journal of Applied Measurement , vol.4 , pp. 386-422
    • Myford, C.M.1    Wolfe, E.W.2
  • 50
    • 1842843697
    • Detecting and measuring rater effects using many-facet Rasch measurement: Part II
    • Myford, C.M. & Wolfe, E.W. (2004). Detecting and measuring rater effects using many-facet Rasch measurement: Part II. Journal of Applied Measurement, 5, 189-227.
    • (2004) Journal of Applied Measurement , vol.5 , pp. 189-227
    • Myford, C.M.1    Wolfe, E.W.2
  • 53
    • 0002459312
    • The development of training programs to increase accuracy with different rating tasks
    • Pulakos, E.D. (1986). The development of training programs to increase accuracy with different rating tasks. Organizational Behavior and Human Decision Processes, 38, 76-91.
    • (1986) Organizational Behavior and Human Decision Processes , vol.38 , pp. 76-91
    • Pulakos, E.D.1
  • 54
    • 55249123647
    • Revisiting raters and ratings in oral language assessment
    • Reed, D.J. & Cohen, A.D. (2001). Revisiting raters and ratings in oral language assessment. In C. Elder et al., editors, Experimenting with uncertainty: Essays in honour of Alan Davies (pp. 82-96). Cambridge: Cambridge University Press.
    • (2001) Experimenting With Uncertainty: Essays in Honour of Alan Davies , pp. 82-96
    • Reed, D.J.1    Cohen, A.D.2
  • 55
    • 55249092130
    • Validation of holistic scoring for ESL writing assessment: How raters evaluate compositions
    • Sakyi, A.A. (2000). Validation of holistic scoring for ESL writing assessment: How raters evaluate compositions. In A. J. Kunnan, editor, Fairness and validation in language assessment (pp. 129-152). Cambridge: Cambridge University Press.
    • (2000) Fairness and Validation in Language Assessment , pp. 129-152
    • Sakyi, A.A.1
  • 56
    • 13744250460
    • Generalizability of writing scores: An application of structural equation modeling
    • Schoonen, R. (2005). Generalizability of writing scores: An application of structural equation modeling. Language Testing, 22, 1-30.
    • (2005) Language Testing , vol.22 , pp. 1-30
    • Schoonen, R.1
  • 57
    • 13744253126
    • The assessment of writing ability: Expert readers versus lay readers
    • Schoonen, R., Vergeer, M. & Eiting, M. (1997). The assessment of writing ability: Expert readers versus lay readers. Language Testing, 14, 157-184.
    • (1997) Language Testing , vol.14 , pp. 157-184
    • Schoonen, R.1    Vergeer, M.2    Eiting, M.3
  • 58
    • 84937284931
    • Performance assessment in language testing
    • Shohamy, E. (1995). Performance assessment in language testing. Annual Review of Applied Linguistics, 15, 188-211.
    • (1995) Annual Review of Applied Linguistics , vol.15 , pp. 188-211
    • Shohamy, E.1
  • 59
    • 41149180093
    • Rater judgments in the direct assessment of competency-based second language writing ability
    • Smith, D. (2000). Rater judgments in the direct assessment of competency-based second language writing ability. In G. Brindley, editor, Studies in immigrant English language assessment, Vol. 1, (pp. 159-189). Sydney: National Centre for English Language Teaching and Research, Macquarie University.
    • (2000) Studies in Immigrant English Language Assessment , vol.1 , pp. 159-189
    • Smith, D.1
  • 60
    • 33744925023
    • Fit analysis in latent trait measurement models
    • Smith, R.M. (2004). Fit analysis in latent trait measurement models. In E. V. Smith & R. M. Smith, editors, Introduction to Rasch measurement (pp. 73-92). Maple Grove, MN: JAM Press.
    • (2004) Introduction to Rasch Measurement , pp. 73-92
    • Smith, R.M.1
  • 61
    • 0000666371
    • The extension of factor analysis to three-dimensional matrices
    • Tucker, L.R. (1964). The extension of factor analysis to three-dimensional matrices. In N. Frederiksen & H. Gulliksen, editors, Contributions to mathematical psychology (pp. 109-127). New York: Holt, Rinehart and Winston.
    • (1964) Contributions to Mathematical Psychology , pp. 109-127
    • Tucker, L.R.1
  • 63
    • 0042857697
    • Holistic assessment: What goes on in the rater's mind?
    • Vaughan, C. (1991). Holistic assessment: What goes on in the rater's mind? In L. Hamp-Lyons, editor, Assessing second language writing in academic contexts (pp. 111-125). Norwood, NJ: Ablex.
    • (1991) Assessing Second Language Writing in Academic Contexts , pp. 111-125
    • Vaughan, C.1
  • 64
    • 84965371977
    • Effects of training on raters of ESL compositions
    • Weigle, S.C. (1994). Effects of training on raters of ESL compositions. Language Testing, 11, 197-223.
    • (1994) Language Testing , vol.11 , pp. 197-223
    • Weigle, S.C.1
  • 65
    • 0002422895
    • Using FACETS to model rater training effects
    • Weigle, S.C. (1998). Using FACETS to model rater training effects. Language Testing, 15, 263-287.
    • (1998) Language Testing , vol.15 , pp. 263-287
    • Weigle, S.C.1
  • 66
    • 0043206862
    • Investigating rater/prompt interactions in writing assessment: Quantitative and qualitative approaches
    • Weigle, S.C. (1999). Investigating rater/prompt interactions in writing assessment: Quantitative and qualitative approaches. Assessing Writing, 6, 145-178.
    • (1999) Assessing Writing , vol.6 , pp. 145-178
    • Weigle, S.C.1
  • 67
    • 0242339347
    • Assessing writing
    • Weigle, S.C. (2002). Assessing writing. Cambridge: Cambridge University Press.
    • (2002) Assessing Writing
    • Weigle, S.C.1
  • 69
    • 0041855688
    • The relationship between essay reading style and scoring proficiency in a psychometric scoring system
    • Wolfe, E.W. (1997). The relationship between essay reading style and scoring proficiency in a psychometric scoring system. Assessing Writing, 4, 83-106.
    • (1997) Assessing Writing , vol.4 , pp. 83-106
    • Wolfe, E.W.1
  • 70
    • 33749847596
    • Cognitive differences in proficient and nonproficient essay scorers
    • Wolfe, E.W., Kao, C.-W. & Ranney, M. (1998). Cognitive differences in proficient and nonproficient essay scorers. Written Communication, 15, 465-492.
    • (1998) Written Communication , vol.15 , pp. 465-492
    • Wolfe, E.W.1    Kao, C.-W.2    Ranney, M.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.