-
1
-
-
32544451630
-
Automated essay scoring with e-rater V.2
-
Attali, Y., & Burstein, J. (2006). Automated essay scoring with e-rater V.2. Journal of Technology, Learning, and Assessment, 4(3). Available from http://www.jtla.org
-
(2006)
Journal of Technology, Learning, and Assessment
, vol.4
, Issue.3
-
-
Attali, Y.1
Burstein, J.2
-
3
-
-
0001841268
-
Understanding score reliability: Experiments in calibrating essay readers
-
Braun, H. I. (1988). Understanding score reliability: Experiments in calibrating essay readers. Journal of Educational Statistics, 13, 1–18
-
(1988)
Journal of Educational Statistics
, vol.13
, pp. 1-18
-
-
Braun, H.I.1
-
4
-
-
4644223855
-
-
College Board 99–3; GRE Board Research 96-12R; ETS RR 99–3. New York, NY: College Entrance Examination Board
-
Breland, H. M., Bridgeman, B., & Fowles, M. E. (1999). Writing assessment in admission to higher education: Review and framework. College Board Report No. 99–3; GRE Board Research Report No. 96-12R; ETS RR 99–3. New York, NY: College Entrance Examination Board
-
(1999)
Writing Assessment in Admission to Higher Education: Review and Framework
-
-
Breland, H.M.1
Bridgeman, B.2
Fowles, M.E.3
-
6
-
-
0030023350
-
Dependence of weighted kappa coefficients on the number of categories
-
Brenner, H., & Kliebsch, U. (1996). Dependence of weighted kappa coefficients on the number of categories. Epidemiology, 7, 199–202
-
(1996)
Epidemiology
, vol.7
, pp. 199-202
-
-
Brenner, H.1
Kliebsch, U.2
-
7
-
-
84855958640
-
Comparison of human and machine scoring of essays: Differences by gender, ethnicity, and country
-
Bridgeman, B., Trapani, C., & Attali, Y. (2012). Comparison of human and machine scoring of essays: Differences by gender, ethnicity, and country. Applied Measurement in Education, 25, 27–40
-
(2012)
Applied Measurement in Education
, vol.25
, pp. 27-40
-
-
Bridgeman, B.1
Trapani, C.2
Attali, Y.3
-
8
-
-
84858850841
-
The question of validity of automated essay scores and differentially valued evidence
-
New Orleans, LA
-
Bridgeman, B., Trapani, C., & Williamson, D. (2011). The question of validity of automated essay scores and differentially valued evidence. Paper presented at the National Council on Measurement in Education, New Orleans, LA
-
(2011)
Paper Presented at the National Council on Measurement in Education
-
-
Bridgeman, B.1
Trapani, C.2
Williamson, D.3
-
9
-
-
58149412516
-
Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit
-
Cohen, J. A. (1968). Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin, 70, 213–220
-
(1968)
Psychological Bulletin
, vol.70
, pp. 213-220
-
-
Cohen, J.A.1
-
10
-
-
0034195156
-
The stability of rater severity in large-scale assessment programs
-
Congdon, P. J., & McQueen, J. (2000). The stability of rater severity in large-scale assessment programs. Journal of Educational Measurement, 37, 163–178
-
(2000)
Journal of Educational Measurement
, vol.37
, pp. 163-178
-
-
Congdon, P.J.1
McQueen, J.2
-
11
-
-
84977690570
-
Theory of generalizability: A liberation of reliability theory
-
Cronbach, L. J., Rajaratnam, N., & Gleser, G. C. (1963). Theory of generalizability: A liberation of reliability theory. The British Journal of Statistical Psychology, 16(2), 137–163
-
(1963)
The British Journal of Statistical Psychology
, vol.16
, Issue.2
, pp. 137-163
-
-
Cronbach, L.J.1
Rajaratnam, N.2
Gleser, G.C.3
-
12
-
-
80053241573
-
A hierarchical rater model for constructed responses, with a signal detection rater model
-
DeCarlo, L. T., Kim, Y. K., & Johnson, M. S. (2011). A hierarchical rater model for constructed responses, with a signal detection rater model. Journal of Educational Measurement, 48, 333–356
-
(2011)
Journal of Educational Measurement
, vol.48
, pp. 333-356
-
-
DeCarlo, L.T.1
Kim, Y.K.2
Johnson, M.S.3
-
13
-
-
33747363986
-
An overview of automated scoring of essays
-
Dikli, S. (2006). An overview of automated scoring of essays. The Journal of Technology, Learning, and Assessment, 5(1), http://ejournals.bc.edu/ojs/index.php/jtla/article/view/1640/1489
-
(2006)
The Journal of Technology, Learning, and Assessment
, vol.5
, Issue.1
-
-
Dikli, S.1
-
14
-
-
84988122960
-
Examining rater errors in the assessment of written composition with a many-faceted Rasch model
-
Engelhard, G., Jr. (1994). Examining rater errors in the assessment of written composition with a many-faceted Rasch model. Journal of Educational Measurement, 31(2), 93–112
-
(1994)
Journal of Educational Measurement
, vol.31
, Issue.2
, pp. 93-112
-
-
Engelhard, G.1
-
15
-
-
85062062645
-
-
GRE analytical writing scoring guide (2011), from http://www.ets.org/gre/revised_general/prepare/analytical_writing/issue/scoring_guide
-
(2011)
-
-
-
16
-
-
84055198279
-
Rater effects on essay scoring: A multilevel analysis of severity drift, central tendency, and rater experience
-
Leckie, G., & Baird, J. (2011). Rater effects on essay scoring: A multilevel analysis of severity drift, central tendency, and rater experience. Journal of Educational Measurement, 48, 399–418
-
(2011)
Journal of Educational Measurement
, vol.48
, pp. 399-418
-
-
Leckie, G.1
Baird, J.2
-
18
-
-
84990328733
-
Assessment criteria in a large-scale writing test: What do they really mean to the raters?
-
Lumley, T. (2002). Assessment criteria in a large-scale writing test: What do they really mean to the raters? Language Testing, 19, 246–276
-
(2002)
Language Testing
, vol.19
, pp. 246-276
-
-
Lumley, T.1
-
20
-
-
84976923244
-
A generalized Partial Credit Model: Applications of an EM algorithm
-
Muraki, E. (1992). A generalized Partial Credit Model: Applications of an EM algorithm. Applied Psychological Measurement, 16, 159–176
-
(1992)
Applied Psychological Measurement
, vol.16
, pp. 159-176
-
-
Muraki, E.1
-
21
-
-
1842843697
-
Detecting and measuring rater effects using many-facet Rasch measurement: Part II
-
Myford, C. M., & Wolfe, E. W. (2004). Detecting and measuring rater effects using many-facet Rasch measurement: Part II. Journal of Applied Measurement, 5, 189–227
-
(2004)
Journal of Applied Measurement
, vol.5
, pp. 189-227
-
-
Myford, C.M.1
Wolfe, E.W.2
-
22
-
-
71549124344
-
Monitoring rater performance over time: A framework for detecting differential accuracy and differential scale category use
-
Myford, C. M., & Wolfe, E. W. (2009). Monitoring rater performance over time: A framework for detecting differential accuracy and differential scale category use. Journal of Educational Measurement, 46, 371–389
-
(2009)
Journal of Educational Measurement
, vol.46
, pp. 371-389
-
-
Myford, C.M.1
Wolfe, E.W.2
-
23
-
-
68049121626
-
Factor structure of the TOEFL Internet-based Test (TOEFL iBT)
-
Sawaki, Y., Stricker, L., & Oranje, A. (2009). Factor structure of the TOEFL Internet-based Test (TOEFL iBT). Language Testing, 26, 5–30
-
(2009)
Language Testing
, vol.26
, pp. 5-30
-
-
Sawaki, Y.1
Stricker, L.2
Oranje, A.3
-
24
-
-
1842431905
-
A note on the interpretation of weighted kappa and its relations to other rater agreement statistics for metric scales
-
Schuster, C. (2004). A note on the interpretation of weighted kappa and its relations to other rater agreement statistics for metric scales. Educational and Psychological Measurement, 64, 243–253
-
(2004)
Educational and Psychological Measurement
, vol.64
, pp. 243-253
-
-
Schuster, C.1
-
25
-
-
85142554863
-
-
Introduction. In M. D. Shermis & J. Burstein (Eds.), Mahwah, NJ: Lawrence Erlbaum Associates, Inc
-
Shermis, M. D., & Burstein, J. (2003). Introduction. In M. D. Shermis & J. Burstein (Eds.), Automated essay scoring: A cross-disciplinary perspective. Mahwah, NJ: Lawrence Erlbaum Associates, Inc
-
(2003)
Automated Essay Scoring: A Cross-Disciplinary Perspective
-
-
Shermis, M.D.1
Burstein, J.2
-
27
-
-
33646364669
-
Identifying rater effects using latent trait models
-
Wolfe, E. W. (2004). Identifying rater effects using latent trait models. Psychology Science, 46, 35–51
-
(2004)
Psychology Science
, vol.46
, pp. 35-51
-
-
Wolfe, E.W.1
-
28
-
-
77950018013
-
Uncovering rater’s cognitive processing and focus using think-aloud protocols
-
Wolfe, E. W. (2005). Uncovering rater’s cognitive processing and focus using think-aloud protocols. Journal of Writing Assessment, 2, 37–56
-
(2005)
Journal of Writing Assessment
, vol.2
, pp. 37-56
-
-
Wolfe, E.W.1