1. American Society for Testing and Materials. (1976). ASTM manual on presentation of data and control chart analysis (Pub. No. STP15D). Philadelphia, PA: Author.
2. Attali, Y. (2007). Construct validity of e-rater in scoring TOEFL essays (Research Report No. RR-07-21). Princeton, NJ: Educational Testing Service.
3. Attali, Y. (2008). e-rater evaluation for TOEFL iBT independent essays. Unpublished manuscript.
4. Attali, Y., Bridgeman, B., & Trapani, C. (2010). Performance of a generic approach in automated scoring. Journal of Technology, Learning, and Assessment, 10(3), 1–16.
5. Attali, Y., & Burstein, J. (2006). Automated essay scoring with e-rater V.2. Journal of Technology, Learning, and Assessment, 4(3), 1–31.
6. Bejar, I. I. (2011). A validity-based approach to quality control and assurance of automated scoring. Assessment in Education: Principles, Policy and Practice, 18(3), 319–341.
7. Burstein, J., & Chodorow, M. (1999). Automated essay scoring for nonnative English speakers. In M. Broman Olsen (Ed.), Computer mediated language assessment and evaluation in natural language processing (pp. 68–75). Morristown, NJ: Association for Computational Linguistics.
8. Chodorow, M., & Burstein, J. (2004). Beyond essay length: Evaluating e-rater's performance on TOEFL essays (Research Report No. RR-04-04). Princeton, NJ: Educational Testing Service.
10. Davey, T. (2009, April). Principles for building and evaluating e-rater models. Paper presented at the annual meeting of the National Council on Measurement in Education, San Diego, CA.
11. DeCarlo, L. T. (2010). Studies of a latent class signal detection model for constructed-response scoring II: Incomplete and hierarchical designs (Research Report No. RR-10-08). Princeton, NJ: Educational Testing Service.
12. Donoghue, J. R., McClellan, C. A., & Gladkova, L. (2006). Using rater effects models in NAEP. Unpublished manuscript.
13. Engelhard, G., Jr. (1994). Examining rater errors in the assessment of written composition with a many-faceted Rasch model. Journal of Educational Measurement, 31, 93–112.
14. Engelhard, G., Jr. (2002). Monitoring raters in performance assessments. In G. Tindal & T. M. Haladyna (Eds.), Large-scale assessment programs for all examinees: Validity, technical adequacy, and implementation (pp. 261–287). Mahwah, NJ: Erlbaum.
15. Gao, R. (2009). Detect cheating using statistical control methods for computer based CLEP examinations with item exposure risks. Unpublished manuscript.
16. Haberman, S. (2011). Use of e-rater in scoring of the TOEFL iBT writing test (Research Report No. RR-11-25). Princeton, NJ: Educational Testing Service.
17. Haberman, S. (2012). Measure of agreement. Unpublished manuscript.
18. Haberman, S., & Sinharay, S. (2008). Sample-size requirements for automated essay scoring (Research Report No. RR-08-32). Princeton, NJ: Educational Testing Service.
19. Kingsbury, F. A. (1922). Analyzing ratings and training raters. Journal of Personnel Research, 1, 377–383.
20. Lane, S., & Stone, C. A. (2006). Performance assessment. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 387–431). Westport, CT: Praeger.
21. Lee, Y.-H., & von Davier, A. A. (2012). Using data mining and quality control techniques to monitor scaled scores. Manuscript submitted for publication.
23. Luecht, R. M. (2010, April). Some small sample statistical quality control procedures for constructed response scoring in language testing. Paper presented at the annual meeting of the National Council on Measurement in Education, Denver, CO.
25. Myford, C. M., & Wolfe, E. W. (2009). Monitoring rater performance over time: A framework for detecting differential accuracy and differential scale category use. Journal of Educational Measurement, 46, 371–389.
26. National Institute of Standards and Technology. (n.d.). NIST/SEMATECH e-handbook of statistical methods. Retrieved from http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc32.htm
27. Omar, M. H. (2010). Statistical process control charts for measuring and monitoring temporal consistency of ratings. Journal of Educational Measurement, 47(1), 18–35.
28. Patz, R. J., Junker, B. W., Johnson, M. J., & Mariano, L. T. (2002). The hierarchical rater model for rated test items and its application to large-scale educational assessment data. Journal of Educational and Behavioral Statistics, 27, 341–384.
29. Ramineni, C., Trapani, C., Williamson, D. M., Davey, T., & Bridgeman, B. (2012). Evaluation of e-rater for the GRE issue and argument prompts (Research Report No. RR-12-06). Princeton, NJ: Educational Testing Service.
30. Ramineni, C., Williamson, D., & Weng, V. (2011, April). Understanding mean score differences between e-rater and humans for demographic-based groups in GRE. Paper presented at the annual meeting of the National Council on Measurement in Education, New Orleans, LA.
32. Walker, M. (2005). Proposed rater statistics for TOEFL iBT constructed response. Unpublished manuscript.
33. Wang, Z. (2010). TOEFL Writing Prompt 2 (independent prompt) health check report: Human rater and e-rater. Unpublished manuscript.
34. Wang, Z., & von Davier, A. A. (2010). Proposed procedures to monitor the performance of the human & electronic ratings for all programs. Unpublished manuscript.
35. Wang, Z., & Yao, L. (2011, April). The effects of scoring designs and rater severity on students' ability estimation for constructed response items. Paper presented at the annual meeting of the National Council on Measurement in Education, New Orleans, LA.
36. Wang, Z., & Yao, L. (2012, April). Investigation of the effects of scoring designs and rater severity on students' ability estimation using different rater models. Paper presented at the annual meeting of the National Council on Measurement in Education, Vancouver, BC.
37. Way, W. D., Vickers, D., & Nichols, P. (2008, March). Effects of different training and scoring approaches on human constructed response scoring. Paper presented at the annual meeting of the National Council on Measurement in Education, New York, NY.
38. Western Electric Company. (1958). Statistical quality control (2nd ed.). New York, NY: Author.
39. Williamson, D. M., Xi, X., & Breyer, F. J. (2012). A framework for evaluation and use of automated scoring. Educational Measurement: Issues and Practice, 31(1), 2–13.
41. Wolfe, E. W., & Myford, C. M. (1997, April). Detecting order effects with a multi-faceted Rasch scale model. Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago, IL.