



2013, Pages 989-998

Towards minimizing the annotation cost of certified text classification

Author keywords

E discovery; Evaluation; Supervised learning; Text categorization

Indexed keywords

ALLOCATION POLICIES; ANALYTIC APPROXIMATION; CONFIDENCE INTERVAL; E DISCOVERIES; EVALUATION; STATISTICAL VALIDITY; TEXT CATEGORIZATION; TEXT CLASSIFICATION

EID: 84889598884     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2505515.2505708     Document Type: Conference Paper
Times cited: 17

References (32)
  • 1
    • 84925604888
    • No unbiased estimator of the variance of k-fold cross-validation
    • September
    • Y. Bengio and Y. Grandvalet. No unbiased estimator of the variance of k-fold cross-validation. Journal of Machine Learning Research, 5:1089-1105, September 2004.
    • (2004) Journal of Machine Learning Research , vol.5 , pp. 1089-1105
    • Bengio, Y.1    Grandvalet, Y.2
  • 3
    • 13344280339
    • One-sided confidence intervals in discrete distributions
    • T. T. Cai. One-sided confidence intervals in discrete distributions. Journal of Statistical Planning and Inference, 131:63-88, 2005.
    • (2005) Journal of Statistical Planning and Inference , vol.131 , pp. 63-88
    • Cai, T.T.1
  • 6
  • 7
    • 84957069814
    • Text categorization with support vector machines: Learning with many relevant features
    • T. Joachims. Text categorization with support vector machines: Learning with many relevant features. In ECML, pages 137-142, 1998.
    • (1998) ECML , pp. 137-142
    • Joachims, T.1
  • 8
    • 31844446804
    • A support vector method for multivariate performance measures
    • T. Joachims. A support vector method for multivariate performance measures. In ICML, pages 377-384, 2005.
    • (2005) ICML , pp. 377-384
    • Joachims, T.1
  • 9
    • 33749563073
    • Training linear SVMs in linear time
    • T. Joachims. Training linear SVMs in linear time. In KDD, pages 217-226, 2006.
    • (2006) KDD , pp. 217-226
    • Joachims, T.1
  • 10
    • 68949154453
    • Sparse kernel SVMs via cutting-plane training
    • T. Joachims and C.-N. J. Yu. Sparse kernel SVMs via cutting-plane training. In ECML PKDD: Part I, 2009.
    • (2009) ECML PKDD: Part I
    • Joachims, T.1    Yu, C.-N.J.2
  • 12
    • 79951773917
    • eDiscovery Institute survey on predictive coding
    • A. Kershaw and J. Howie. eDiscovery Institute survey on predictive coding. Technical report, Electronic Discovery Institute (http://www.ediscoveryinstitute.org/pubs/PredictiveCodingSurvey.pdf), October 2010.
    • (2010) EDiscovery Institute Survey on Predictive Coding
    • Kershaw, A.1    Howie, J.2
  • 13
    • 80053225505
    • Combining train set and test set bounds
    • J. Langford. Combining train set and test set bounds. In ICML, pages 331-338, 2002.
    • (2002) ICML , pp. 331-338
    • Langford, J.1
  • 14
    • 21844462365
    • Tutorial on practical prediction theory for classification
    • J. Langford. Tutorial on practical prediction theory for classification. Journal of Machine Learning Research, 6(1):273-306, 2005.
    • (2005) Journal of Machine Learning Research , vol.6 , Issue.1 , pp. 273-306
    • Langford, J.1
  • 16
    • 85013879626
    • A sequential algorithm for training text classifiers
    • D. D. Lewis and W. A. Gale. A sequential algorithm for training text classifiers. In SIGIR, pages 3-12, 1994.
    • (1994) SIGIR , pp. 3-12
    • Lewis, D.D.1    Gale, W.A.2
  • 17
    • 84876811202
    • RCV1: A new benchmark collection for text categorization research
    • December
    • D. D. Lewis, Y. Yang, T. G. Rose, and F. Li. RCV1: A new benchmark collection for text categorization research. Journal of Machine Learning Research, 5:361-397, December 2004.
    • (2004) Journal of Machine Learning Research , vol.5 , pp. 361-397
    • Lewis, D.D.1    Yang, Y.2    Rose, T.G.3    Li, F.4
  • 19
    • 0042847140
    • Inference for the generalization error
    • C. Nadeau and Y. Bengio. Inference for the generalization error. Machine Learning, 52:239-281, 2003.
    • (2003) Machine Learning , vol.52 , pp. 239-281
    • Nadeau, C.1    Bengio, Y.2
  • 21
    • 84887928716
    • Where the money goes: Understanding litigant expenditures for producing electronic discovery
    • Santa Monica, CA
    • N. M. Pace and L. Zakaras. Where the money goes: Understanding litigant expenditures for producing electronic discovery. Technical report, RAND Institute for Civil Justice, Santa Monica, CA, 2012.
    • (2012) Technical Report, RAND Institute for Civil Justice
    • Pace, N.M.1    Zakaras, L.2
  • 22
    • 0002515248
    • Efficient progressive sampling
    • F. Provost, D. Jensen, and T. Oates. Efficient progressive sampling. In KDD, pages 23-32, 1999.
    • (1999) KDD , pp. 23-32
    • Provost, F.1    Jensen, D.2    Oates, T.3
  • 23
    • 45549117987
    • Term-weighting approaches in automatic text retrieval
    • G. Salton and C. Buckley. Term-weighting approaches in automatic text retrieval. Information Processing and Management, 24(5):513-523, 1988.
    • (1988) Information Processing and Management , vol.24 , Issue.5 , pp. 513-523
    • Salton, G.1    Buckley, C.2
  • 24
    • 0037245343
    • Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification
    • January
    • R. Simon, M. D. Radmacher, K. Dobbin, and L. M. McShane. Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification. Journal of the National Cancer Institute, 95(1):14-18, January 2003.
    • (2003) Journal of the National Cancer Institute , vol.95 , Issue.1 , pp. 14-18
    • Simon, R.1    Radmacher, M.D.2    Dobbin, K.3    McShane, L.M.4
  • 27
    • 0016082639
    • Bibliography on estimation of misclassification
    • July
    • G. T. Toussaint. Bibliography on estimation of misclassification. IEEE Transactions on Information Theory, IT-20(4):472-479, July 1974.
    • (1974) IEEE Transactions on Information Theory , vol.IT-20 , Issue.4 , pp. 472-479
    • Toussaint, G.T.1
  • 29
    • 84873884780
    • Approximate recall confidence intervals
    • W. Webber. Approximate recall confidence intervals. ACM Transactions on Information Systems, 31(1):2:1-33, 2013.
    • (2013) ACM Transactions on Information Systems , vol.31 , Issue.1 , pp. 2:1-33
    • Webber, W.1
  • 30
    • 84883057552
    • Sequential testing in classifier evaluation yields biased estimates of effectiveness
    • July
    • W. Webber, M. Bagdouri, D. D. Lewis, and D. W. Oard. Sequential testing in classifier evaluation yields biased estimates of effectiveness. In SIGIR, pages 933-936, July 2013.
    • (2013) SIGIR , pp. 933-936
    • Webber, W.1    Bagdouri, M.2    Lewis, D.D.3    Oard, D.W.4
  • 32
    • 85024373635
    • A re-examination of text categorization methods
    • Y. Yang and X. Liu. A re-examination of text categorization methods. In SIGIR, pages 42-49, 1999.
    • (1999) SIGIR , pp. 42-49
    • Yang, Y.1    Liu, X.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.