Volume 37, Issue 2, 2007, Pages 350-360

Stemming via distribution-based word segregation for classification and retrieval

Author keywords

Precision and recall; Prototype selection; Stemming; Text categorization

Indexed keywords

ALGORITHMS; CLASSIFICATION (OF INFORMATION); DATA STRUCTURES; INFORMATION RETRIEVAL; TEXT PROCESSING

EID: 34047191837     PISSN: 10834419     EISSN: None     Source Type: Journal    
DOI: 10.1109/TSMCB.2006.885307     Document Type: Article
Times cited: 12

References (39)
  • 1
    • 0000417994
    • "Developments in automatic text retrieval"
    • G. Salton, "Developments in automatic text retrieval," Science, vol. 253, no. 5023, pp. 974-980, Aug. 1991.
    • (1991) Science , vol.253 , Issue.5023 , pp. 974-980
    • Salton, G.1
  • 2
    • 0030413815
    • "Viewing stemming as recall enhancement"
    • W. Kraaij and R. Pohlmann, "Viewing stemming as recall enhancement," in Proc. 17th ACM SIGIR Conf., Zurich, Switzerland, Aug. 1996, pp. 40-48.
    • (1996) Proc. 17th ACM SIGIR Conf. , pp. 40-48
    • Kraaij, W.1    Pohlmann, R.2
  • 3
    • 33646350476
    • "Strength and similarity of affix removal stemming algorithms"
    • W. B. Frakes and C. J. Fox, "Strength and similarity of affix removal stemming algorithms," ACM SIGIR Forum, vol. 37, no. 1, pp. 26-30, 2003.
    • (2003) ACM SIGIR Forum , vol.37 , Issue.1 , pp. 26-30
    • Frakes, W.B.1    Fox, C.J.2
  • 6
    • 84948481845
    • "An algorithm for suffix stripping"
    • M. F. Porter, "An algorithm for suffix stripping," Program, vol. 14, no. 3, pp. 130-137, 1980.
    • (1980) Program , vol.14 , Issue.3 , pp. 130-137
    • Porter, M.F.1
  • 7
    • 0031599183
    • "Corpus-based stemming using cooccurrence of word variants"
    • J. Xu and W. B. Croft, "Corpus-based stemming using cooccurrence of word variants," ACM Trans. Inf. Syst., vol. 16, no. 1, pp. 61-81, 1998.
    • (1998) ACM Trans. Inf. Syst. , vol.16 , Issue.1 , pp. 61-81
    • Xu, J.1    Croft, W.B.2
  • 8
    • 0032264186
    • "Distributional clustering of words for text classification"
    • L. D. Baker and A. K. McCallum, "Distributional clustering of words for text classification," in Proc. 21st ACM SIGIR Conf., Melbourne, Australia, 1998, pp. 96-103.
    • (1998) Proc. 21st ACM SIGIR Conf. , pp. 96-103
    • Baker, L.D.1    McCallum, A.K.2
  • 9
    • 0016572913
    • "A vector space model for automatic indexing"
    • G. Salton, A. Wong, and C. S. Yang, "A vector space model for automatic indexing," Commun. ACM, vol. 18, no. 11, pp. 613-620, Nov. 1975.
    • (1975) Commun. ACM , vol.18 , Issue.11 , pp. 613-620
    • Salton, G.1    Wong, A.2    Yang, C.S.3
  • 10
    • 84989569822
    • "How effective is suffixing?"
    • D. Harman, "How effective is suffixing?" J. Amer. Soc. Inf. Sci., vol. 42, no. 1, pp. 7-15.
    • J. Amer. Soc. Inf. Sci. , vol.42 , Issue.1 , pp. 7-15
    • Harman, D.1
  • 11
    • 0001794236
    • "Development of a stemming algorithm"
    • J. B. Lovins, "Development of a stemming algorithm," Mech. Transl. Comput. Linguist., vol. 11, no. 1/2, pp. 22-31, 1968.
    • (1968) Mech. Transl. Comput. Linguist. , vol.11 , Issue.1-2 , pp. 22-31
    • Lovins, J.B.1
  • 12
    • 84925887156
    • "Suffix removal for word conflation"
    • J. L. Dawson, "Suffix removal for word conflation," Bull. Assoc. Lit. Linguist. Comput., vol. 2, no. 3, pp. 33-46, 1974.
    • (1974) Bull. Assoc. Lit. Linguist. Comput. , vol.2 , Issue.3 , pp. 33-46
    • Dawson, J.L.1
  • 13
    • 84976754255
    • "Another stemmer"
    • C. D. Paice, "Another stemmer," SIGIR Forum, vol. 24, no. 3, pp. 56-61, 1990.
    • (1990) SIGIR Forum , vol.24 , Issue.3 , pp. 56-61
    • Paice, C.D.1
  • 14
    • 0027766985
    • "Viewing morphology as an inference process"
    • R. Krovetz, "Viewing morphology as an inference process," in Proc. 16th ACM SIGIR Conf., Pittsburgh, PA, 1993, pp. 191-202.
    • (1993) Proc. 16th ACM SIGIR Conf. , pp. 191-202
    • Krovetz, R.1
  • 15
    • 0033650819
    • "Stemming and its effects on TFIDF ranking"
    • M. Kantrowitz, B. Mohit, and V. Mittal, "Stemming and its effects on TFIDF ranking," in Proc. 23rd Annu. SIGIR Conf., Athens, Greece, 2000, pp. 357-359.
    • (2000) Proc. 23rd Annu. SIGIR Conf. , pp. 357-359
    • Kantrowitz, M.1    Mohit, B.2    Mittal, V.3
  • 16
    • 34047136156
    • "Accurate stemming of Dutch for text classification"
    • T. Gustad and G. Bouma, "Accurate stemming of Dutch for text classification," Lang. Comput., vol. 45, no. 1, pp. 104-117, 2002.
    • (2002) Lang. Comput. , vol.45 , Issue.1 , pp. 104-117
    • Gustad, T.1    Bouma, G.2
  • 17
    • 84983165279
    • "A probabilistic model for stemmer generation"
    • M. Bacchin, N. Ferro, and M. Melucci, "A probabilistic model for stemmer generation," Inf. Process. Manage., vol. 41, no. 1, pp. 121-137, 2005.
    • (2005) Inf. Process. Manage. , vol.41 , Issue.1 , pp. 121-137
    • Bacchin, M.1    Ferro, N.2    Melucci, M.3
  • 19
    • 85143523004
    • "Automatic retrieval and clustering of similar words"
    • D. Lin, "Automatic retrieval and clustering of similar words," in Proc. 17th Int. Conf. Comput. Linguist., 1998, pp. 768-774.
    • (1998) Proc. 17th Int. Conf. Comput. Linguist. , pp. 768-774
    • Lin, D.1
  • 21
    • 85024373635
    • "A re-examination of text categorization methods"
    • Y. Yang, "A re-examination of text categorization methods," in Proc. 22nd ACM SIGIR Conf., Berkeley, CA, 1999, pp. 42-49.
    • (1999) Proc. 22nd ACM SIGIR Conf. , pp. 42-49
    • Yang, Y.1
  • 22
    • 0029206077
    • "Little words can make a big difference for text classification"
    • E. Riloff, "Little words can make a big difference for text classification," in Proc. 18th ACM SIGIR Conf., Seattle, WA, 1995, pp. 130-136.
    • (1995) Proc. 18th ACM SIGIR Conf. , pp. 130-136
    • Riloff, E.1
  • 23
    • 26944479877
    • "Comparing feature sets for learning text categorization"
    • M. Spitters, "Comparing feature sets for learning text categorization," in Proc. RIAO, Paris, France, 2000, pp. 1124-1135.
    • (2000) Proc. RIAO , pp. 1124-1135
    • Spitters, M.1
  • 24
    • 34047099888
    • "A comparison of techniques for classification and ad hoc retrieval of biomedical documents"
    • A. M. Cohen, J. Yang, and W. R. Hersh, "A comparison of techniques for classification and ad hoc retrieval of biomedical documents," in Proc. 14th Annu. Text REtrieval Conf., Gaithersburg, MD, 2005. [Online]. Available: http://trec.nist.gov/pubs/trec14/papers/ohsu-geo.pdf
    • (2005) Proc. 14th Annu. Text REtrieval Conf.
    • Cohen, A.M.1    Yang, J.2    Hersh, W.R.3
  • 29
    • 34047104554
    • The 20 News Groups (20NG) data set. [Online]. Available: http://kdd.ics.uci.edu/databases/20newsgroups/20newsgroups.html
    • The 20 News Groups (20NG) Data Set
  • 30
    • 34047179024
    • The WebKB data set. [Online]. Available: http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/
    • The WebKB Data Set
  • 31
    • 0002565067
    • "Overview of the Third Text Retrieval Conference"
    • D. Harman, "Overview of the Third Text Retrieval Conference," in Proc. 3rd TREC-3, 1995, pp. 1-20.
    • (1995) Proc. 3rd TREC-3 , pp. 1-20
    • Harman, D.1
  • 32
    • 0030216658
    • "Method for evaluation of stemming algorithms based on error counting"
    • C. D. Paice, "Method for evaluation of stemming algorithms based on error counting," J. Amer. Soc. Inf. Sci., vol. 47, no. 8, pp. 632-649, 1996.
    • (1996) J. Amer. Soc. Inf. Sci. , vol.47 , Issue.8 , pp. 632-649
    • Paice, C.D.1
  • 35
    • 84957069091
    • "Naive (Bayes) at forty: The independence assumption in information retrieval"
    • D. D. Lewis, "Naive (Bayes) at forty: The independence assumption in information retrieval," in Proc. 10th Eur. Conf. Mach. Learn., Chemnitz, Germany, 1998, pp. 4-15.
    • (1998) Proc. 10th Eur. Conf. Mach. Learn. , pp. 4-15
    • Lewis, D.D.1
  • 37
    • 84957069814
    • "Text categorization with support vector machines: Learning with many relevant features"
    • T. Joachims, "Text categorization with support vector machines: Learning with many relevant features," in Proc. 10th Eur. Conf. Mach. Learn., Chemnitz, Germany, 1998, pp. 137-142.
    • (1998) Proc. 10th Eur. Conf. Mach. Learn. , pp. 137-142
    • Joachims, T.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.