-
2
-
-
33750288965
-
A statistical method for system evaluation using incomplete judgments
-
J. A. Aslam, V. Pavlu, and E. Yilmaz. A statistical method for system evaluation using incomplete judgments. In Proc. SIGIR, 2006.
-
(2006)
Proc. SIGIR
-
-
Aslam, J.A.1
Pavlu, V.2
Yilmaz, E.3
-
3
-
-
57349152836
-
The CSIRO enterprise search test collection
-
December
-
P. Bailey, N. Craswell, I. Soboroff, and A. P. de Vries. The CSIRO enterprise search test collection. SIGIR Forum, 41(2), December 2007.
-
(2007)
SIGIR Forum
, vol.41
, Issue.2
-
-
Bailey, P.1
Craswell, N.2
Soboroff, I.3
de Vries, A.P.4
-
4
-
-
8644251996
-
Retrieval evaluation with incomplete information
-
C. Buckley and E. M. Voorhees. Retrieval evaluation with incomplete information. In Proc. SIGIR, 2004.
-
(2004)
Proc. SIGIR
-
-
Buckley, C.1
Voorhees, E.M.2
-
5
-
-
0013287054
-
Variations in relevance judgments and the evaluation of retrieval performance
-
Sep-Oct
-
R. Burgin. Variations in relevance judgments and the evaluation of retrieval performance. Information Processing & Management, 28(5):619-627, Sep-Oct 1992.
-
(1992)
Information Processing & Management
, vol.28
, Issue.5
, pp. 619-627
-
-
Burgin, R.1
-
6
-
-
84937275232
-
Assessing agreement on classification tasks: The kappa statistic
-
J. Carletta. Assessing agreement on classification tasks: The kappa statistic. Computational Linguistics, 22(2):249-254, 1996.
-
(1996)
Computational Linguistics
, vol.22
, Issue.2
, pp. 249-254
-
-
Carletta, J.1
-
7
-
-
0005655729
-
The effect of variations in relevance assessments in comparative experimental tests of index languages
-
Cranfield Institute of Technology
-
C. W. Cleverdon. The effect of variations in relevance assessments in comparative experimental tests of index languages. Technical Report ASLIB part 2, Cranfield Institute of Technology, 1970.
-
(1970)
Technical Report ASLIB part 2
-
-
Cleverdon, C.W.1
-
10
-
-
2142668188
-
The kappa statistic: A second look
-
B. Di Eugenio and M. Glass. The kappa statistic: a second look. Computational Linguistics, 30(1):95-101, 2004.
-
(2004)
Computational Linguistics
, vol.30
, Issue.1
, pp. 95-101
-
-
Di Eugenio, B.1
Glass, M.2
-
11
-
-
0001769424
-
Variations in relevance assessments and the measurement of retrieval effectiveness
-
S. P. Harter. Variations in relevance assessments and the measurement of retrieval effectiveness. JASIS, 47(1):37-49, 1996.
-
(1996)
JASIS
, vol.47
, Issue.1
, pp. 37-49
-
-
Harter, S.P.1
-
12
-
-
0033645041
-
IR evaluation methods for retrieving highly relevant documents
-
K. Järvelin and J. Kekäläinen. IR evaluation methods for retrieving highly relevant documents. In Proc. SIGIR, 2000.
-
(2000)
Proc. SIGIR
-
-
Järvelin, K.1
Kekäläinen, J.2
-
14
-
-
0009233105
-
Relevance assessments and retrieval system evaluation
-
M. E. Lesk and G. Salton. Relevance assessments and retrieval system evaluation. Information Storage and Retrieval, 4:343-359, 1969.
-
(1969)
Information Storage and Retrieval
, vol.4
, pp. 343-359
-
-
Lesk, M.E.1
Salton, G.2
-
16
-
-
85050172503
-
Statistical Techniques for the Study of Language and Language Behaviour
-
R. Rietveld and R. van Hout. Statistical Techniques for the Study of Language and Language Behaviour. Mouton de Gruyter, 1993.
-
(1993)
Mouton de Gruyter
-
-
Rietveld, R.1
van Hout, R.2
-
18
-
-
36448954593
-
A comparison of pooled and sampled relevance judgments
-
I. Soboroff. A comparison of pooled and sampled relevance judgments. In Proc. SIGIR, 2007.
-
(2007)
Proc. SIGIR
-
-
Soboroff, I.1
-
19
-
-
0036989640
-
Liberal relevance criteria of TREC: Counting on negligible documents?
-
E. Sormunen. Liberal relevance criteria of TREC: counting on negligible documents? In Proc. SIGIR, 2002.
-
(2002)
Proc. SIGIR
-
-
Sormunen, E.1
-
20
-
-
84876705138
-
IR Evaluation Using Multiple Assessors per Topic
-
A. Trotman and D. Jenkinson. IR Evaluation Using Multiple Assessors per Topic. In Proc. ADCS, 2007.
-
(2007)
Proc. ADCS
-
-
Trotman, A.1
Jenkinson, D.2
-
22
-
-
0032264624
-
Variations in relevance judgments and the measurement of retrieval effectiveness
-
E. M. Voorhees. Variations in relevance judgments and the measurement of retrieval effectiveness. In Proc. SIGIR, 1998.
-
(1998)
Proc. SIGIR
-
-
Voorhees, E.M.1
-
24
-
-
34547632535
-
Estimating average precision with incomplete and imperfect judgments
-
E. Yilmaz and J. A. Aslam. Estimating average precision with incomplete and imperfect judgments. In Proc. CIKM, 2006.
-
(2006)
Proc. CIKM
-
-
Yilmaz, E.1
Aslam, J.A.2
-
25
-
-
57349107098
-
A simple and efficient sampling method for estimating AP and NDCG
-
E. Yilmaz, E. Kanoulas, and J. Aslam. A simple and efficient sampling method for estimating AP and NDCG. In Proc. SIGIR, 2008.
-
(2008)
Proc. SIGIR
-
-
Yilmaz, E.1
Kanoulas, E.2
Aslam, J.3