Volume , Issue , 2008, Pages 571-580

Statistical power in retrieval experimentation

Author keywords

Evaluation; Retrieval experiment; System measurement

Indexed keywords

EVALUATION; HYBRID METHODOLOGIES; RELEVANCE ASSESSMENTS; RETRIEVAL EXPERIMENT; SAMPLE SIZES; STATISTICAL POWER; SYSTEM MEASUREMENT; TEST SETS

EID: 70349250276     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1458082.1458158     Document Type: Conference Paper
Times cited: 60

References (22)
  • 1
    • 33750288965
    • J. A. Aslam, V. Pavlu, and E. Yilmaz. A statistical method for system evaluation using incomplete judgments. In S. Dumais, E. Efthimiadis, D. Hawking, and K. Järvelin, editors, Proc. 29th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pages 541-548, Seattle, USA, August 2006.
  • 2
    • 8644251996
    • Retrieval evaluation with incomplete information
    • M. Sanderson, K. Järvelin, J. Allan, and P. Bruza, editors, Sheffield, United Kingdom, August
    • C. Buckley and E. M. Voorhees. Retrieval evaluation with incomplete information. In M. Sanderson, K. Järvelin, J. Allan, and P. Bruza, editors, Proc. 27th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pages 25-32, Sheffield, United Kingdom, August 2004.
    • (2004) Proc. 27th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval , pp. 25-32
    • Buckley, C.1    Voorhees, E.M.2
  • 3
    • 36448969717
    • Robust test collections for retrieval evaluation
    • C. Clarke, N. Fuhr, N. Kando, W. Kraaij, and A. de Vries, editors, Amsterdam, the Netherlands, July
    • B. Carterette. Robust test collections for retrieval evaluation. In C. Clarke, N. Fuhr, N. Kando, W. Kraaij, and A. de Vries, editors, Proc. 30th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pages 55-62, Amsterdam, the Netherlands, July 2007.
    • (2007) Proc. 30th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval , pp. 55-62
    • Carterette, B.1
  • 4
    • 57349104848
    • Hypothesis testing with incomplete relevance judgments
    • M. J. Silva, A. A. F. Laender, R. Baeza-Yates, D. L. McGuinness, B. Olstad, Ø. H. Olsen, and A. O. Falcão, editors, Lisboa, Portugal
    • B. Carterette and M. D. Smucker. Hypothesis testing with incomplete relevance judgments. In M. J. Silva, A. A. F. Laender, R. Baeza-Yates, D. L. McGuinness, B. Olstad, Ø. H. Olsen, and A. O. Falcão, editors, Proc. 16th ACM Int. Conf. on Information and Knowledge Management, pages 643-652, Lisboa, Portugal, 2007.
    • (2007) Proc. 16th ACM Int. Conf. on Information and Knowledge Management , pp. 643-652
    • Carterette, B.1    Smucker, M.D.2
  • 7
    • 36448963050
    • Validity and power of t-test for comparing MAP and GMAP
    • C. Clarke, N. Fuhr, N. Kando, W. Kraaij, and A. de Vries, editors, Amsterdam, the Netherlands, July
    • G. V. Cormack and T. R. Lynam. Validity and power of t-test for comparing MAP and GMAP. In C. Clarke, N. Fuhr, N. Kando, W. Kraaij, and A. de Vries, editors, Proc. 30th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pages 753-754, Amsterdam, the Netherlands, July 2007.
    • (2007) Proc. 30th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval , pp. 753-754
    • Cormack, G.V.1    Lynam, T.R.2
  • 9
    • 0003818389
    • Harcourt Brace, Fort Worth, 4th edition
    • W. L. Hays. Statistics. Harcourt Brace, Fort Worth, 4th edition, 1991.
    • (1991) Statistics
    • Hays, W.L.1
  • 10
    • 0035644173
    • The abuse of power: The pervasive fallacy of power calculations for data analysis
    • February
    • J. M. Hoenig and D. M. Heisey. The abuse of power: The pervasive fallacy of power calculations for data analysis. The American Statistician, 55 (1):19-24, February 2001.
    • (2001) The American Statistician , vol.55 , Issue.1 , pp. 19-24
    • Hoenig, J.M.1    Heisey, D.M.2
  • 11
    • 66949147248
    • Rank-biased precision for measurement of retrieval effectiveness
    • To appear
    • A. Moffat and J. Zobel. Rank-biased precision for measurement of retrieval effectiveness. ACM Transactions on Information Systems, 2009. To appear.
    • (2009) ACM Transactions on Information Systems
    • Moffat, A.1    Zobel, J.2
  • 12
    • 36448950502
    • Strategic system comparisons via targeted relevance judgments
    • C. Clarke, N. Fuhr, N. Kando, W. Kraaij, and A. de Vries, editors, Amsterdam, the Netherlands, July
    • A. Moffat, W. Webber, and J. Zobel. Strategic system comparisons via targeted relevance judgments. In C. Clarke, N. Fuhr, N. Kando, W. Kraaij, and A. de Vries, editors, Proc. 30th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pages 375-382, Amsterdam, the Netherlands, July 2007.
    • (2007) Proc. 30th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval , pp. 375-382
    • Moffat, A.1    Webber, W.2    Zobel, J.3
  • 13
    • 33750340100
    • Evaluating evaluation metrics based on the bootstrap
    • S. Dumais, E. Efthimiadis, D. Hawking, and K. Järvelin, editors, Seattle, USA, August
    • T. Sakai. Evaluating evaluation metrics based on the bootstrap. In S. Dumais, E. Efthimiadis, D. Hawking, and K. Järvelin, editors, Proc. 29th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pages 525-532, Seattle, USA, August 2006.
    • (2006) Proc. 29th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval , pp. 525-532
    • Sakai, T.1
  • 14
    • 36448993626
    • Alternatives to Bpref
    • C. Clarke, N. Fuhr, N. Kando, W. Kraaij, and A. de Vries, editors, Amsterdam, the Netherlands, July
    • T. Sakai. Alternatives to Bpref. In C. Clarke, N. Fuhr, N. Kando, W. Kraaij, and A. de Vries, editors, Proc. 30th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pages 71-78, Amsterdam, the Netherlands, July 2007.
    • (2007) Proc. 30th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval , pp. 71-78
    • Sakai, T.1
  • 15
    • 84885608872
    • Information retrieval system evaluation: Effort, sensitivity, and reliability
    • G. Marchionini, A. Moffat, J. Tait, R. Baeza-Yates, and N. Ziviani, editors, Salvador, Brazil, August
    • M. Sanderson and J. Zobel. Information retrieval system evaluation: effort, sensitivity, and reliability. In G. Marchionini, A. Moffat, J. Tait, R. Baeza-Yates, and N. Ziviani, editors, Proc. 28th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pages 162-169, Salvador, Brazil, August 2005.
    • (2005) Proc. 28th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval , pp. 162-169
    • Sanderson, M.1    Zobel, J.2
  • 16
    • 0031193029
    • Statistical inference in retrieval effectiveness evaluation
    • J. Savoy. Statistical inference in retrieval effectiveness evaluation. Information Processing & Management, 33(4):495-512, 1997.
    • (1997) Information Processing & Management , vol.33 , Issue.4 , pp. 495-512
    • Savoy, J.1
  • 17
    • 63449088172
    • A comparison of statistical significance tests for information retrieval evaluation
    • M. J. Silva, A. A. F. Laender, R. Baeza-Yates, D. L. McGuinness, B. Olstad, Ø. H. Olsen, and A. O. Falcão, editors, Lisboa, Portugal
    • M. D. Smucker, J. Allan, and B. Carterette. A comparison of statistical significance tests for information retrieval evaluation. In M. J. Silva, A. A. F. Laender, R. Baeza-Yates, D. L. McGuinness, B. Olstad, Ø. H. Olsen, and A. O. Falcão, editors, Proc. 16th ACM Int. Conf. on Information and Knowledge Management, pages 623-632, Lisboa, Portugal, 2007.
    • (2007) Proc. 16th ACM Int. Conf. on Information and Knowledge Management , pp. 623-632
    • Smucker, M.D.1    Allan, J.2    Carterette, B.3
  • 19
    • 0036993119
    • The effect of topic set size on retrieval experiment error
    • K. Järvelin, M. Beaulieu, R. Baeza-Yates, and Sung Hyon Myaeng, editors, Tampere, Finland, August
    • E. M. Voorhees and C. Buckley. The effect of topic set size on retrieval experiment error. In K. Järvelin, M. Beaulieu, R. Baeza-Yates, and Sung Hyon Myaeng, editors, Proc. 25th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pages 316-323, Tampere, Finland, August 2002.
    • (2002) Proc. 25th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval , pp. 316-323
    • Voorhees, E.M.1    Buckley, C.2
  • 20
    • 57349160444
    • Score standardization for inter-collection comparison of retrieval systems
    • S.-H. Myaeng, D. W. Oard, F. Sebastiani, T.-S. Chua, and M.-K. Leong, editors, Singapore, Singapore, July
    • W. Webber, A. Moffat, and J. Zobel. Score standardization for inter-collection comparison of retrieval systems. In S.-H. Myaeng, D. W. Oard, F. Sebastiani, T.-S. Chua, and M.-K. Leong, editors, Proc. 31st Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pages 51-58, Singapore, Singapore, July 2008.
    • (2008) Proc. 31st Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval , pp. 51-58
    • Webber, W.1    Moffat, A.2    Zobel, J.3
  • 21
    • 34547632535
    • Estimating average precision with incomplete and imperfect judgments
    • Arlington, Virginia, USA, November
    • E. Yilmaz and J. A. Aslam. Estimating average precision with incomplete and imperfect judgments. In Proc. 15th ACM Int. Conf. on Information and Knowledge Management, pages 102-111, Arlington, Virginia, USA, November 2006.
    • (2006) Proc. 15th ACM Int. Conf. on Information and Knowledge Management , pp. 102-111
    • Yilmaz, E.1    Aslam, J.A.2
  • 22
    • 0032272626
    • How reliable are the results of large-scale information retrieval experiments?
    • W. B. Croft, A. Moffat, C. J. van Rijsbergen, R. Wilkinson, and J. Zobel, editors, Melbourne, Australia, August
    • J. Zobel. How reliable are the results of large-scale information retrieval experiments? In W. B. Croft, A. Moffat, C. J. van Rijsbergen, R. Wilkinson, and J. Zobel, editors, Proc. 21st Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pages 307-314, Melbourne, Australia, August 1998.
    • (1998) Proc. 21st Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval , pp. 307-314
    • Zobel, J.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.