SCOPUS Information Retrieval Platform

Volume , Issue , 2013, Pages 393-402

On the measurement of test collection reliability

Author keywords

Evaluation; Generalizability Theory; Reliability; Test Collection; TREC

Indexed keywords

CONFIDENCE INTERVAL; EVALUATION; GENERALIZABILITY THEORIES; ORDERS OF MAGNITUDE; RELIABILITY INDICATORS; STATISTICAL THEORY; TEST COLLECTION; TREC;

INFORMATION RETRIEVAL; RELIABILITY; RESEARCH; TESTING;

RELIABILITY THEORY;

EID: 84883095495 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/2484028.2484038 Document Type: Conference Paper

Times cited : (33)

References (24)

1
- 79952433615
- Million query track 2008 overview
- J. Allan, J. A. Aslam, B. Carterette, V. Pavlu, and E. Kanoulas. Million Query Track 2008 Overview. In Text REtrieval Conference, 2008.
- (2008) Text REtrieval Conference
- Allan, J.¹ Aslam, J.A.² Carterette, B.³ Pavlu, V.⁴ Kanoulas, E.⁵

2
- 79952433615
- Million query track 2007 overview
- J. Allan, B. Carterette, J. A. Aslam, V. Pavlu, B. Dachev, and E. Kanoulas. Million Query Track 2007 Overview. In Text REtrieval Conference, 2007.
- (2007) Text REtrieval Conference
- Allan, J.¹ Carterette, B.² Aslam, J.A.³ Pavlu, V.⁴ Dachev, B.⁵ Kanoulas, E.⁶

3
- 0042983740
- Confidence intervals for proportions of total variance in the two-way cross component of variance model
- C. Arteaga, S. Jeyaratnam, and G. A. Franklin. Confidence Intervals for Proportions of Total Variance in the Two-Way Cross Component of Variance Model. Communications in Statistics: Theory and Methods, 11(15):1643-1658, 1982.
- (1982) Communications in Statistics: Theory and Methods , vol.11 , Issue.15 , pp. 1643-1658
- Arteaga, C.¹ Jeyaratnam, S.² Franklin, G.A.³

4
- 0002500408
- Blind men and elephants: Six approaches to TREC data
- D. Banks, P. Over, and N.-F. Zhang. Blind Men and Elephants: Six Approaches to TREC data. Information Retrieval, 1(1-2):7-34, 1999.
- (1999) Information Retrieval , vol.1 , Issue.1-2 , pp. 7-34
- Banks, D.¹ Over, P.² Zhang, N.-F.³

5
- 40649085568
- Test theory for evaluating reliability of IR test collections
- D. Bodoff. Test Theory for Evaluating Reliability of IR Test Collections. Information Processing and Management, 44(3):1117-1145, 2008.
- (2008) Information Processing and Management , vol.44 , Issue.3 , pp. 1117-1145
- Bodoff, D.¹

6
- 36448947171
- Test theory for assessing IR test collections
- D. Bodoff and P. Li. Test Theory for Assessing IR Test Collections. In ACM SIGIR, pages 367-374, 2007.
- (2007) ACM SIGIR , pp. 367-374
- Bodoff, D.¹ Li, P.²

7
- 0004225983
- Generalizability theory
- R. L. Brennan. Generalizability Theory. Springer, 2001.
- (2001) Generalizability Theory
- Brennan, R.L.¹

8
- 0033650323
- Evaluating evaluation measure stability
- C. Buckley and E. M. Voorhees. Evaluating Evaluation Measure Stability. In ACM SIGIR, pages 33-34, 2000.
- (2000) ACM SIGIR , pp. 33-34
- Buckley, C.¹ Voorhees, E.M.²

9
- 57349133736
- Evaluation over thousands of queries
- B. Carterette, V. Pavlu, E. Kanoulas, J. A. Aslam, and J. Allan. Evaluation Over Thousands of Queries. In ACM SIGIR, pages 651-658, 2008.
- (2008) ACM SIGIR , pp. 651-658
- Carterette, B.¹ Pavlu, V.² Kanoulas, E.³ Aslam, J.A.⁴ Allan, J.⁵

10
- 84873440989
- If I had a million queries
- B. Carterette, V. Pavlu, E. Kanoulas, J. A. Aslam, and J. Allan. If I Had a Million Queries. In ECIR, pages 288-300, 2009.
- (2009) ECIR , pp. 288-300
- Carterette, B.¹ Pavlu, V.² Kanoulas, E.³ Aslam, J.A.⁴ Allan, J.⁵

11
- 0013801920
- The approximate sampling distribution of Kuder-Richardson reliability coefficient twenty
- L. S. Feldt. The Approximate Sampling Distribution of Kuder-Richardson Reliability Coefficient Twenty. Psychometrika, 30(3):357-370, 1965.
- (1965) Psychometrika , vol.30 , Issue.3 , pp. 357-370
- Feldt, L.S.¹

12
- 74549195035
- Empirical justification of the gain and discount function for nDCG
- E. Kanoulas and J. A. Aslam. Empirical Justification of the Gain and Discount Function for nDCG. In ACM CIKM, pages 611-620, 2009.
- (2009) ACM CIKM , pp. 611-620
- Kanoulas, E.¹ Aslam, J.A.²

13
- 33750327186
- Revisiting the effect of topic set size on retrieval error
- W.-H. Lin and A. Hauptmann. Revisiting the Effect of Topic Set Size on Retrieval Error. In ACM SIGIR, pages 637-638, 2005.
- (2005) ACM SIGIR , pp. 637-638
- Lin, W.-H.¹ Hauptmann, A.²

14
- 84866617782
- On per-topic variance in IR evaluation
- S. Robertson and E. Kanoulas. On Per-Topic Variance in IR Evaluation. In ACM SIGIR, pages 891-900, 2012.
- (2012) ACM SIGIR , pp. 891-900
- Robertson, S.¹ Kanoulas, E.²

15
- 33750437740
- On the reliability of information retrieval metrics based on graded relevance
- T. Sakai. On the Reliability of Information Retrieval Metrics Based on Graded Relevance. Information Processing and Management, 43(2):531-548, 2007.
- (2007) Information Processing and Management , vol.43 , Issue.2 , pp. 531-548
- Sakai, T.¹

16
- 77954220071
- Test collection based evaluation of information retrieval systems
- M. Sanderson. Test Collection Based Evaluation of Information Retrieval Systems. Foundations and Trends in Information Retrieval, 4(4):247-375, 2010.
- (2010) Foundations and Trends in Information Retrieval , vol.4 , Issue.4 , pp. 247-375
- Sanderson, M.¹

17
- 84885608872
- Information retrieval system evaluation: Effort, sensitivity, and reliability
- M. Sanderson and J. Zobel. Information Retrieval System Evaluation: Effort, Sensitivity, and Reliability. In ACM SIGIR, pages 162-169, 2005.
- (2005) ACM SIGIR , pp. 162-169
- Sanderson, M.¹ Zobel, J.²

18
- 0003580433
- Generalizability theory: A primer
- R. J. Shavelson and N. M. Webb. Generalizability Theory: A Primer. Sage Publications, 1991.
- (1991) Generalizability Theory: A Primer
- Shavelson, R.J.¹ Webb, N.M.²

19
- 84883083423
- A comparison of the optimality of statistical significance tests for information retrieval evaluation
- J. Urbano, M. Marrero, and D. Martín. A Comparison of the Optimality of Statistical Significance Tests for Information Retrieval Evaluation. In ACM SIGIR, 2013.
- (2013) ACM SIGIR
- Urbano, J.¹ Marrero, M.² Martín, D.³

20
- 0032264624
- Variations in relevance judgments and the measurement of retrieval effectiveness
- E. M. Voorhees. Variations in Relevance Judgments and the Measurement of Retrieval Effectiveness. In ACM SIGIR, pages 315-323, 1998.
- (1998) ACM SIGIR , pp. 315-323
- Voorhees, E.M.¹

21
- 72449211066
- Topic set size redux
- E. M. Voorhees. Topic Set Size Redux. In ACM SIGIR, pages 806-807, 2009.
- (2009) ACM SIGIR , pp. 806-807
- Voorhees, E.M.¹

22
- 0036993119
- The effect of topic set size on retrieval experiment error
- E. M. Voorhees and C. Buckley. The Effect of Topic Set Size on Retrieval Experiment Error. In ACM SIGIR, pages 316-323, 2002.
- (2002) ACM SIGIR , pp. 316-323
- Voorhees, E.M.¹ Buckley, C.²

23
- 57349152359
- A new rank correlation coefficient for information retrieval
- E. Yilmaz, J. A. Aslam, and S. Robertson. A New Rank Correlation Coefficient for Information Retrieval. In ACM SIGIR, pages 587-594, 2008.
- (2008) ACM SIGIR , pp. 587-594
- Yilmaz, E.¹ Aslam, J.A.² Robertson, S.³

24
- 0032272626
- How reliable are the results of large-scale information retrieval experiments?
- J. Zobel. How Reliable are the Results of Large-Scale Information Retrieval Experiments? In ACM SIGIR, pages 307-314, 1998.
- (1998) ACM SIGIR , pp. 307-314
- Zobel, J.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.