SCOPUS Information Retrieval Platform

Machine Translation

Volume 23, Issue 2-3, 2009, Pages 71-103

The NIST 2008 metrics for machine translation challenge-overview, methodology, metrics, and results

(4) Przybocki, Markᵃ Peterson, Kayᵃ Bronsart, Sébastienᵃ Sanders, Gregoryᵃ

ᵃ NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY (United States)

Author keywords

Automated MT metrics; Metric evaluation; MetricsMATR

Indexed keywords

CORRELATION STATISTICS; GENERAL CLASS; MACHINE TRANSLATIONS; METRIC EVALUATION; SYSTEM LEVELS; TEST DATA;

CLUSTER ANALYSIS; INFORMATION THEORY; REGRESSION ANALYSIS; SPEECH TRANSMISSION; TEST FACILITIES;

AUTOMATION;

EID: 77954761816 PISSN: 09226567 EISSN: None Source Type: Journal
DOI: 10.1007/s10590-009-9065-6 Document Type: Article

Times cited : (45)

References (31)

1
- 77954762781
- Extending the BLEU MT evaluation method with frequency weightings. Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL'04)
- Barcelona, Spain
- Babych B, Hartley A (2004) Extending the BLEU MT evaluation method with frequency weightings. Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL'04). Association for Computational Linguistics, Barcelona, Spain
- (2004) Association for Computational Linguistics
- Babych, B.¹ Hartley, A.²

2
- 77954758235
- (Meta-) evaluation of machine translation. Proceedings of the Second Workshop on Statistical Machine Translation
- Callison-Burch C, Fordyce C, Koehn P, Monz C, Schroeder J (2007) (Meta-) evaluation of machine translation. Proceedings of the Second Workshop on Statistical Machine Translation. Prague, Czech Republic, Association for Computational Linguistics
- (2007) Prague, Czech Republic, Association for Computational Linguistics
- Callison-Burch, C.¹ Fordyce, C.² Koehn, P.³ Monz, C.⁴ Schroeder, J.⁵

3
- 85018089028
- Further meta-evaluation of machine translation
- Association for Computational Linguistics, Columbus OH
- Callison-Burch C, Fordyce C, Koehn P, Monz C, Schroeder J (2008) Further meta-evaluation of machine translation. Proceedings of the Third Workshop on Statistical Machine Translation (WMT08). Association for Computational Linguistics, Columbus OH
- (2008) Proceedings of the Third Workshop on Statistical Machine Translation (WMT08)
- Callison-Burch, C.¹ Fordyce, C.² Koehn, P.³ Monz, C.⁴ Schroeder, J.⁵

4
- 84893361786
- Re-evaluating the role of BLEU in machine translation research
- Association for Computational Linguistics, Trento, Italy
- Callison-Burch C, Osborne M, Koehn P (2006) Re-evaluating the role of BLEU in machine translation research. Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Trento, Italy
- (2006) Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics
- Callison-Burch, C.¹ Osborne, M.² Koehn, P.³

5
- 84863365416
- 11,001 New features for statistical machine translation
- Association for Computational Linguistics, Boulder, CO
- Chiang D, Knight K, Wang W (2009) 11,001 New features for statistical machine translation. Proceedings of The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2009). Association for Computational Linguistics, Boulder, CO
- (2009) Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2009)
- Chiang, D.¹ Knight, K.² Wang, W.³

6
- 80053364712
- Decomposability of translation metrics for improved evaluation and efficient algorithms
- Association for Computational Linguistics, Honolulu, Hawaii
- Chiang D, DeNeefe S, Chan YS, Ng HT (2008) Decomposability of translation metrics for improved evaluation and efficient algorithms. Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Honolulu, Hawaii
- (2008) Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing
- Chiang, D.¹ DeNeefe, S.² Chan, Y.S.³ Ng, H.T.⁴

7
- 70349246101
- Evaluating message understanding systems: An analysis of the third message understanding conference (MUC-3)
- Chinchor N, Hirschman L, Lewis DD (1993) Evaluating message understanding systems: an analysis of the third message understanding conference (MUC-3). Computational Linguistics 19 (3)
- (1993) Computational Linguistics , vol.19 , Issue.3
- Chinchor, N.¹ Hirschman, L.² Lewis, D.D.³

8
- 84973587732
- A coefficient of agreement for nominal scales
- Cohen J (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas 37-46
- (1960) Educ Psychol. Meas. , pp. 37-46
- Cohen, J.¹

9
- 84857783669
- Normalization for automated metrics: English and Arabic speech translation
- Association for Machine Translation in the Americas, Ottawa, ON, Canada
- Condon S, Sanders GA, Parvaz D, Rubenstein A, Doran C, Aberdeen J, Oshika B (2009) Normalization for automated metrics: English and Arabic speech translation. Proceedings of MT Summit XII. Association for Machine Translation in the Americas, Ottawa, ON, Canada
- (2009) Proceedings of MT Summit , vol.12
- Condon, S.¹ Sanders, G.A.² Parvaz, D.³ Rubenstein, A.⁴ Doran, C.⁵ Aberdeen, J.⁶ Oshika, B.⁷

10
- 33748682610
- Correlating automated and human assessments of machine translation quality
- Association for Machine Translation in the Americas, New Orleans, LA
- Coughlin D (2003) Correlating automated and human assessments of machine translation quality. Proceedings of MT Summit IX. Association for Machine Translation in the Americas, New Orleans, LA
- (2003) Proceedings of MT Summit IX
- Coughlin, D.¹

11
- 15744403523
- Automatic evaluation of machine translation quality using n-gram co-occurrence statistics
- Morgan Kaufmann, San Diego, CA
- Doddington G (2002) Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. Proceedings of the Second International Conference on Human Language Technology Research, Morgan Kaufmann, San Diego, CA
- (2002) Proceedings of the Second International Conference on Human Language Technology Research
- Doddington, G.¹

12
- 15344349756
- Wordnet: An electronic lexical database
- Fellbaum C (1998) Wordnet: an electronic lexical database. Bradford Books
- (1998) Bradford Books
- Fellbaum, C.¹

13
- 3343019470
- Measuring nominal scale agreement among many raters
- Fleiss JL (1971) Measuring nominal scale agreement among many raters. Psychol Bull 76
- (1971) Psychol. Bull. , vol.76
- Fleiss, J.L.¹

14
- 33645066726
- Large sample standard errors of Kappa and weighted Kappa
- Fleiss JL, Cohen J, Everitt BS (1969) Large sample standard errors of Kappa and weighted Kappa. Psychol Bull 72
- (1969) Psychol. Bull. , vol.72
- Fleiss, J.L.¹ Cohen, J.² Everitt, B.S.³

15
- 0000667627
- Estimates of location based on rank tests
- Hodges JL, Lehmann EL (1963) Estimates of location based on rank tests. Ann Math Stat 34
- (1963) Ann. Math. Stat , vol.34
- Hodges, J.L.¹ Lehmann, E.L.²

16
- 85037543668
- ILR-based MT comprehension test with multi-level questions
- Association for Computational Linguistics, Rochester, NY
- Jones D, Herzog M, Ibrahim H, Jairam A, Shen W, Gibson E, Emonts M (2007) ILR-based MT comprehension test with multi-level questions. Proceedings of The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2007). Association for Computational Linguistics, Rochester, NY
- (2007) Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2007)
- Jones, D.¹ Herzog, M.² Ibrahim, H.³ Jairam, A.⁴ Shen, W.⁵ Gibson, E.⁶ Emonts, M.⁷

17
- 0002282074
- A new measure of rank correlation
- Kendall MG (1938) A new measure of rank correlation. Biometrika 30
- (1938) Biometrika 30
- Kendall, M.G.¹

18
- 85120046073
- METEOR: An automatic metric for MT evaluation with high levels of correlation with human judgments
- Prague
- Lavie A, Agarwal A (2007) METEOR: an automatic metric for MT evaluation with high levels of correlation with human judgments. Workshop on statistical machine translation at the 45th annual meeting of the association of computational linguistics (ACL-2007). Prague
- (2007) Workshop on Statistical Machine Translation at the 45th Annual Meeting of the Association of Computational Linguistics (ACL-2007)
- Lavie, A.¹ Agarwal, A.²

19
- 0004149270
- Applied linear statistical models
- Neter J, Kutner M, Nachtsheim C, Wasserman W (1996) Applied linear statistical models. McGraw-Hill/Irwin
- (1996) McGraw-Hill/Irwin
- Neter, J.¹ Kutner, M.² Nachtsheim, C.³ Wasserman, W.⁴

20
- 0004030721
- Computer intensive methods for testing hypotheses
- Wiley, New York
- Noreen EW (1989) Computer intensive methods for testing hypotheses. An introduction. Wiley, New York
- (1989) An Introduction
- Noreen, E.W.¹

21
- 84944098666
- Minimum error rate training in statistical machine translation
- Sapporo, Japan
- Och FJ (2003) Minimum error rate training in statistical machine translation. Association for Computational Linguistics, Sapporo, Japan
- (2003) Association for Computational Linguistics
- Och, F.J.¹

22
- 0141524308
- BLEU: A method for automatic evaluation of machine translation
- Yorktown Heights NY. IBM Research Division
- Papineni K, Roukos S, Ward T, Zhu W-J (2001) BLEU: a method for automatic evaluation of machine translation. Technical Report, Yorktown Heights NY. IBM Research Division
- (2001) Technical Report
- Papineni, K.¹ Roukos, S.² Ward, T.³ Zhu, W.-J.⁴

23
- 85133332738
- Overview of the IWSLT 2006 evaluation campaign
- Kyoto, Japan
- Paul M (2006) Overview of the IWSLT 2006 evaluation campaign. Proceedings of the international workshop on spoken language translation. Kyoto, Japan
- (2006) Proceedings of the International Workshop on Spoken Language Translation
- Paul, M.¹

24
- 0001454867
- On a criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can reasonably be supposed to have arisen in random sampling
- Pearson K (1900) On a criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can reasonably be supposed to have arisen in random sampling. Philos Mag 50
- (1900) Philos Mag. , vol.50
- Pearson, K.¹

25
- 3843099632
- Accessed 28 Oct. 2009
- Porter M (2009) Snowball: a language for stemming algorithms. 2001. http://snowball.tartarus.org/texts/introduction.html. Accessed 28 Oct. 2009
- (2001) Snowball: A Language for Stemming Algorithms
- Porter, M.¹

26
- 77954757541
- NIST metrics for machine translation challenge (MetricsMATR)
- National Institute of Standards and Technology, Accessed 28 Oct. 2009
- Przybocki M, Peterson K, Bronsart S (2008) NIST metrics for machine translation challenge (MetricsMATR). NIST Multimodal Information Group. National Institute of Standards and Technology. http://www.nist.gov/speech/tests/metricsmatr/2008/doc/mm08-evalplan-v1.1pdf. Accessed 28 Oct. 2009
- (2008) NIST Multimodal Information Group
- Przybocki, M.¹ Peterson, K.² Bronsart, S.³

27
- 77954760686
- On some pitfalls in automatic evaluation and significance testing for MT. ACL-05 workshop on intrinsic and extrinsic evaluation measures for MT and/or summarization
- Ann Arbor, MI
- Riezler S, Maxwell JT (2005) On some pitfalls in automatic evaluation and significance testing for MT. ACL-05 workshop on intrinsic and extrinsic evaluation measures for MT and/or summarization. Association for Computational Linguistics, Ann Arbor, MI
- (2005) Association for Computational Linguistics
- Riezler, S.¹ Maxwell, J.T.²

28
- 77954761840
- Odds of successful transfer of low-level concepts: A key metric for bidirectional speech-to-speech machine translation in DARPA's TRANSTAC program
- European Language Resources Association (ELRA), Marrakech, Morocco
- Sanders GA, Bronsart S, Condon S, Schlenoff C (2008) Odds of successful transfer of low-level concepts: a key metric for bidirectional speech-to-speech machine translation in DARPA's TRANSTAC program. Proceedings of the 6th international conference on language resources and evaluation (LREC'08). European Language Resources Association (ELRA), Marrakech, Morocco
- (2008) Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC'08)
- Sanders, G.A.¹ Bronsart, S.² Condon, S.³ Schlenoff, C.⁴

29
- 84857522507
- A study of translation edit rate with targeted human annotation
- Cambridge, MA
- Snover M, Dorr B, Schwartz R, Micciulla L, Makhoul J (2006) A study of translation edit rate with targeted human annotation. Proceedings of association for machine translation in the Americas. Cambridge, MA
- (2006) Proceedings of Association for Machine Translation in the Americas
- Snover, M.¹ Dorr, B.² Schwartz, R.³ Micciulla, L.⁴ Makhoul, J.⁵

30
- 0002965815
- The proof and measurement of association between two things
- Spearman CE (1904) The proof and measurement of association between two things. Am J Psychol 15
- (1904) Am. J. Psychol , vol.15
- Spearman, C.E.¹

31
- 0001884644
- Individual comparisons by ranking methods
- Wilcoxon F (1945) Individual comparisons by ranking methods. Biometrics 1
- (1945) Biometrics 1
- Wilcoxon, F.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.