Volume 43, Issue 2, 2007, Pages 531-548

On the reliability of information retrieval metrics based on graded relevance

Author keywords

Cumulative gain; Evaluation; Graded relevance; Q measure; Reliability

Indexed keywords

DATABASE SYSTEMS; Q FACTOR MEASUREMENT; SENSITIVITY ANALYSIS

EID: 33750437740     PISSN: 03064573     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.ipm.2006.07.020     Document Type: Article
Times cited: 87

References (20)
  • 1
    • 0033650323
    • Buckley, C., & Voorhees, E. M. (2000). Evaluating evaluation measure stability. In Proceedings of the 23rd annual international ACM SIGIR conference on research and development in information retrieval (SIGIR 2000) (pp. 33-40).
  • 2
    • 8644251996
    • Buckley, C., & Voorhees, E. M. (2004). Retrieval evaluation with incomplete information. In Proceedings of the 27th annual international ACM SIGIR conference on research and development in information retrieval (SIGIR 2004) (pp. 25-32).
  • 3
    • 33750429203
    • Chen, K.-H., Chen, H.-H., Kando, N., Kuriyama, K., Lee, S., Myaeng, S.-H., et al. (2003). Overview of CLIR task at the third NTCIR workshop. In Proceedings of the 3rd NTCIR workshop on research in information retrieval, automatic text summarization and question answering (NTCIR-3).
  • 5
    • 0027725490
    • Hull, D. (1993). Using statistical testing in the evaluation of retrieval experiments. In Proceedings of the 16th annual international ACM SIGIR conference on research and development in information retrieval (SIGIR '93) (pp. 329-338).
  • 7
    • 16244362331
    • Kekäläinen, J. (2005). Binary and graded relevance in IR evaluations - comparison of the effects on ranking of IR systems. Information Processing and Management, 41, 1019-1033.
  • 9
    • 1542377483
    • Sakai, T. (2003). Average gain ratio: a simple retrieval performance measure for evaluation with multiple relevance levels. In Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval (SIGIR 2003) (pp. 417-418).
  • 10
    • 33750477580
    • Sakai, T. (2004). Ranking the NTCIR systems based on multigrade relevance. In Proceedings of Asia information retrieval symposium 2004 (pp. 170-177).
  • 11
    • 33750445886
    • Sakai, T. (2005a). The effect of topic sampling on sensitivity comparisons of information retrieval metrics. In Proceedings of the 5th NTCIR workshop on research in information access technologies (NTCIR-5).
  • 12
    • 33646126694
    • Sakai, T. (2005b). The reliability of metrics based on graded relevance. In Proceedings of Asia information retrieval symposium 2005. Lecture notes in computer science: Vol. 3689 (pp. 1-16).
  • 13
    • 33750340100
    • Sakai, T. (2006a). Evaluating evaluation metrics based on the bootstrap. In Proceedings of the 29th annual international ACM SIGIR conference on research and development in information retrieval (SIGIR 2006).
  • 14
    • 33750307579
    • Sakai, T. (2006b). Give me just one highly relevant document: P-measure. In Proceedings of the 29th annual international ACM SIGIR conference on research and development in information retrieval (SIGIR 2006).
  • 15
    • 33750489348
    • Sakai, T. (2006c). On the task of finding one highly relevant document with high precision. Information Processing Society of Japan Transactions on Databases 47 SIG4 (TOD29), 13-27.
  • 16
    • 84885608872
    • Sanderson, M., & Zobel, J. (2005). Information retrieval system evaluation: effort, sensitivity, and reliability. In Proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval (SIGIR 2005) (pp. 162-169).
  • 17
    • 0036993119
    • Voorhees, E. M., & Buckley, C. (2002). The effect of topic set size on retrieval experiment error. In Proceedings of the 25th annual international ACM SIGIR conference on research and development in information retrieval (SIGIR 2002) (pp. 316-323).
  • 18
    • 0034795668
    • Voorhees, E. M. (2001). Evaluation by highly relevant documents. In Proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval (SIGIR 2001) (pp. 74-82).
  • 19
    • 33750487749
    • Voorhees, E. M. (2005). Overview of the TREC 2004 robust retrieval track. In Proceedings of the 13th text retrieval conference (TREC 2004).
  • 20
    • 33646155934
    • Vu, H.-T., & Gallinari, P. (2005). On effectiveness measures and relevance functions in ranking INEX systems. In Proceedings of Asia information retrieval symposium 2005. Lecture notes in computer science: Vol. 3689 (pp. 312-327).


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.