Volume 1, Issue 3, 2002, Pages 225-268

A Comparison of Chinese Document Indexing Strategies and Retrieval Models

Author keywords

Algorithms; Chinese information retrieval; comparison; Experimentation; indexing strategies; Languages; Performance

Indexed keywords


EID: 34247334591     PISSN: 15300226     EISSN: 15583430     Source Type: Journal    
DOI: 10.1145/772755.772758     Document Type: Article
Times cited : (21)

References (60)
  • 9
    • 3042735277
    • English-Chinese cross-language IR using bilingual dictionaries
    • (Gaithersburg, MD)
    • Chen, A., Jiang, H., and Gey, F. C. 2000. English-Chinese cross-language IR using bilingual dictionaries. In Proceedings of the TREC-9 Conference (Gaithersburg, MD), 15-21
    • (2000) Proceedings of the TREC-9 Conference , pp. 15-21
    • Chen, A.1    Jiang, H.2    Gey, F.C.3
  • 10
    • 85025412198
    • A model-based signature file approach for full-text retrieval of Chinese document databases
    • Chien, L.-F. 1994. A model-based signature file approach for full-text retrieval of Chinese document databases. Comput. Process. Chinese Oriental Lang. 8, Supplement, 59-76
    • Comput. Process. Chinese Oriental Lang , vol.8 , pp. 59-76
    • Chien, L.-F.1
  • 12
    • 85025409979
    • GB13715: Information processing based on modern Chinese word segmentation
    • Tsinghua University Press, Beijing
    • Chinese National Standards. 1994. GB13715: Information processing based on modern Chinese word segmentation. Tsinghua University Press, Beijing
    • (1994)
  • 14
    • 26644448684
    • Probabilistic retrieval in the TIPSTER collections: an application of staged logistic regression
    • Proceedings of the TREC-1 Conference (Gaithersburg, MD)
    • Cooper, W., Gey, F. C., and Chen, A. 1992. Probabilistic retrieval in the TIPSTER collections: an application of staged logistic regression. In Proceedings of the TREC-1 Conference (Gaithersburg, MD), 73-88
    • (1992) , pp. 73-88
    • Cooper, W.1    Gey, F.C.2    Chen, A.3
  • 15
    • 0013181948
    • Full text retrieval based on probabilistic equations with coefficients fitted by logistic regression
    • (Gaithersburg, MD)
    • Cooper, W. S., Chen, A., and Gey, F. C. 1994. Full text retrieval based on probabilistic equations with coefficients fitted by logistic regression. In Proceedings of the TREC-2 Conference (Gaithersburg, MD), 57-66
    • (1994) Proceedings of the TREC-2 Conference , pp. 57-66
    • Cooper, W.S.1    Chen, A.2    Gey, F.C.3
  • 16
    • 0018711255
    • Using probabilistic models of document retrieval without relevance
    • Croft, W. B. and Harper, D. J. 1979. Using probabilistic models of document retrieval without relevance. J. Doc. 35, 285-295
    • (1979) J. Doc. , vol.35 , pp. 285-295
    • Croft, W.B.1    Harper, D.J.2
  • 19
    • 0010225742
    • Phrase discovery for English and cross-language retrieval at TREC-6
    • (Gaithersburg, MD)
    • Gey, F. C. and Chen, A. 1997. Phrase discovery for English and cross-language retrieval at TREC-6. In Proceedings of the TREC-6 Conference (Gaithersburg, MD), 637-648
    • (1997) Proceedings of the TREC-6 Conference , pp. 637-648
    • Gey, F.C.1    Chen, A.2
  • 20
    • 0342532579
    • Berkeley Chinese information retrieval at TREC-5: technical report
    • (Gaithersburg, MD)
    • He, J., Xu, L., Chen, A., Meggs, J., and Gey, F. C. 1996. Berkeley Chinese information retrieval at TREC-5: technical report. In Proceedings of the TREC-5 Conference (Gaithersburg, MD), 181-186
    • (1996) Proceedings of the TREC-5 Conference , pp. 181-186
    • He, J.1    Xu, L.2    Chen, A.3    Meggs, J.4    Gey, F.C.5
  • 22
    • 0013253788
    • Okapi Chinese text retrieval experiments at TREC-6
    • Proceedings of the TREC-6 Conference (Gaithersburg, MD)
    • Huang, X., and Robertson, S. E. 1997. Okapi Chinese text retrieval experiments at TREC-6. In Proceedings of the TREC-6 Conference (Gaithersburg, MD), 137-142
    • (1997) , pp. 137-142
    • Huang, X.1    Robertson, S.E.2
  • 23
    • 10644233472
    • TREC-9 CLIR at CUHK: Disambiguation by similarity values between adjacent words
    • (Gaithersburg, MD)
    • Jin, H., and Wong, K. F. 2000. TREC-9 CLIR at CUHK: Disambiguation by similarity values between adjacent words. In Proceedings of the TREC-9 Conference (Gaithersburg, MD), 151-156
    • (2000) Proceedings of the TREC-9 Conference , pp. 151-156
    • Jin, H.1    Wong, K.F.2
  • 24
    • 85025390428
    • NTCIR (NII-NACSIS test collection for IR systems)
    • Kando, N. 2002. NTCIR (NII-NACSIS test collection for IR systems). Project NTCIR Home
    • (2002) Project NTCIR Home
    • Kando, N.1
  • 25
    • 0008585418
    • On methods of Chinese automatic word segmentation
    • Kit, C., Liu, Y., and Liang, N. 1989. On methods of Chinese automatic word segmentation. J. Chinese Inf. Process. 3, 1, 13-20
    • (1989) J. Chinese Inf. Process. , vol.3 , Issue.1 , pp. 13-20
    • Kit, C.1    Liu, Y.2    Liang, N.3
  • 26
    • 0029343048
    • A network approach to probabilistic information retrieval
    • Kwok, K. L. 1996. A network approach to probabilistic information retrieval. ACM Trans. Inf. Syst. 13, 325-353
    • ACM Trans. Inf. Syst. , vol.13 , pp. 325-353
    • Kwok, K.L.1
  • 27
    • 0030650235
    • Comparing representations in Chinese information retrieval
    • Proceedings of the 20th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (Philadelphia, PA)
    • Kwok, K. L. 1997. Comparing representations in Chinese information retrieval. In Proceedings of the 20th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (Philadelphia, PA), 34-41
    • (1997) , pp. 34-41
    • Kwok, K.L.1
  • 28
    • 0032631230
    • Employing multiple representations for Chinese information retrieval
    • Kwok, K. L. 1999. Employing multiple representations for Chinese information retrieval. J. Am. Soc. Inf. Sci. 50, 8, 709-723
    • (1999) J. Am. Soc. Inf. Sci. , vol.50 , Issue.8 , pp. 709-723
    • Kwok, K.L.1
  • 30
    • 0344273640
    • TREC-5 English and Chinese retrieval experiments using Pircs
    • Proceedings of the TREC-5 Conference (Gaithersburg, MD)
    • Kwok, K. L. and Grunfeld, L. 1996. TREC-5 English and Chinese retrieval experiments using Pircs. In Proceedings of the TREC-5 Conference (Gaithersburg, MD), 1-10
    • (1996) , pp. 1-10
    • Kwok, K.L.1    Grunfeld, L.2
  • 31
    • 0012931684
    • TREC-9 cross language, web and question-answering track experiments using Pircs
    • (Gaithersburg, MD)
    • Kwok, K. L., Grunfeld, L., Dinstl, N., and Chan, M. 2000. TREC-9 cross language, web and question-answering track experiments using Pircs. In Proceedings of the TREC-9 Conference (Gaithersburg, MD), 419-429
    • (2000) Proceedings of the TREC-9 Conference , pp. 419-429
    • Kwok, K.L.1    Grunfeld, L.2    Dinstl, N.3    Chan, M.4
  • 32
    • 0008633134
    • TREC-6 English and Chinese retrieval experiments using Pircs
    • (Gaithersburg, MD)
    • Kwok, K. L., Grunfeld, L., and Xu, J. H. 1997. TREC-6 English and Chinese retrieval experiments using Pircs. In Proceedings of the TREC-6 Conference (Gaithersburg, MD), 207-214
    • (1997) Proceedings of the TREC-6 Conference , pp. 207-214
    • Kwok, K.L.1    Grunfeld, L.2    Xu, J.H.3
  • 33
    • 0035338773
    • Chinese document indexing based on a new partitioned signature file: model and evaluation
    • Lam, W., Wong, K.F., and Wong, C.-Y. 2001. Chinese document indexing based on a new partitioned signature file: model and evaluation. J. Am. Soc. Inf. Sci. 52, 7, 584-597
    • (2001) J. Am. Soc. Inf. Sci. , vol.52 , Issue.7 , pp. 584-597
    • Lam, W.1    Wong, K.F.2    Wong, C.-Y.3
  • 34
    • 0342739951
    • Performance evaluation of character-, word- and n-gram-based indexing for Chinese text retrieval
    • Proceedings of the Information Retrieval with Asian Languages 97 Conference (Ibaraki-ken, Japan)
    • Lam, W., Wong, C.-Y., and Wong, K. F. 1997. Performance evaluation of character-, word- and n-gram-based indexing for Chinese text retrieval. In Proceedings of the Information Retrieval with Asian Languages 97 Conference (Ibaraki-ken, Japan), 68-80
    • (1997) , pp. 68-80
    • Lam, W.1    Wong, C.-Y.2    Wong, K.F.3
  • 35
    • 0033330466
    • Chinese information retrieval: using characters or words
    • Lee, C.-F. 1997. Chinese information retrieval: using characters or words. Inf. Process. Manage. 35, 443-462
    • (1997) Inf. Process. Manage. , vol.35 , pp. 443-462
    • Lee, C.-F.1
  • 36
    • 0031095329
    • Document ranking and the vector-space model
    • Lee, D. L., Chuang, H., and Seamons, K. 1997. Document ranking and the vector-space model. IEEE Software 14, 2, 67-75
    • (1997) IEEE Software , vol.14 , Issue.2 , pp. 67-75
    • Lee, D.L.1    Chuang, H.2    Seamons, K.3
  • 37
    • 0011327277
    • Preliminary qualitative analysis of segmented vs bigram indexing in Chinese
    • Proceedings of the TREC-6 Conference (Gaithersburg, MD, Nov.)
    • Leong, M.-K. and Zhou, H. 1997. Preliminary qualitative analysis of segmented vs bigram indexing in Chinese. In Proceedings of the TREC-6 Conference (Gaithersburg, MD, Nov.), 19-21
    • (1997) , pp. 19-21
    • Leong, M.-K.1    Zhou, H.2
  • 43
    • 84959888592
    • Between terms and words for European language IR and between words and bigrams for Chinese IR
    • (Gaithersburg, MD)
    • Nie, J-Y., Chevallet, J.-P., and Bruandet, M.-F. 1997. Between terms and words for European language IR and between words and bigrams for Chinese IR. In Proceedings of the TREC-6 Conference (Gaithersburg, MD), 697-710
    • (1997) Proceedings of the TREC-6 Conference , pp. 697-710
    • Nie, J-Y.1    Chevallet, J.-P.2    Bruandet, M.-F.3
  • 45
    • 0033330466
    • Chinese information retrieval: using characters or words
    • Nie, J-Y., and Ren, F. 1997. Chinese information retrieval: using characters or words. Inf. Process. Manage. 35, 443-462
    • (1997) Inf. Process. Manage. , vol.35 , pp. 443-462
    • Nie, J-Y.1    Ren, F.2
  • 46
    • 0016958419
    • Relevance weighting of search terms
    • Robertson, S. E. and Sparck Jones, K. 1976. Relevance weighting of search terms. J. Am. Soc. Inf. Sci. 27, 129-146
    • J. Am. Soc. Inf. Sci. , vol.27 , pp. 129-146
    • Robertson, S.E.1    Sparck Jones, K.2
  • 47
    • 84966534942
    • Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval
    • Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (Philadelphia, PA)
    • Robertson, S. E., and Walker, S. 1992. Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval. In Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (Philadelphia, PA), 232-241
    • (1992) , pp. 232-241
    • Robertson, S.E.1    Walker, S.2
  • 48
    • 1542310212
    • Okapi at TREC-2
    • Proceedings of the TREC-2 Conference (Gaithersburg, MD)
    • Robertson, S., Walker, S., Jones, S., Hancock-Beaulieu, M., and Gatford, M. 1993. Okapi at TREC-2. In Proceedings of the TREC-2 Conference (Gaithersburg, MD), 21-25
    • (1993) , pp. 21-25
    • Robertson, S.1    Walker, S.2    Jones, S.3    Hancock-Beaulieu, M.4    Gatford, M.5
  • 49
    • 0001319911
    • Okapi at TREC-3
    • Proceedings of the TREC-3 Conference (Gaithersburg, MD)
    • Robertson, S. E., Walker, S., Jones, S., Hancock-Beaulieu, M., and Gatford, M. 1994. Okapi at TREC-3. In Proceedings of the TREC-3 Conference (Gaithersburg, MD), 109-128
    • (1994) , pp. 109-128
    • Robertson, S.E.1    Walker, S.2    Jones, S.3    Hancock-Beaulieu, M.4    Gatford, M.5
  • 50
    • 45549117987
    • Term-weighting approaches in automatic text retrieval
    • Salton, G. and Buckley, C. 1988. Term-weighting approaches in automatic text retrieval. Inf. Process. Manage. 24, 5, 513-523
    • (1988) Inf. Process. Manage. , vol.24 , Issue.5 , pp. 513-523
    • Salton, G.1    Buckley, C.2
  • 52
    • 0036497062
    • From e-sex to e-commerce: web search changes
    • Spink, A., Jansen, B. J., Wolfram, D., and Saracevic, T. 2002. From e-sex to e-commerce: web search changes. IEEE Computer 35, 3, 107-109
    • IEEE Computer , vol.35 , Issue.3 , pp. 107-109
    • Spink, A.1    Jansen, B.J.2    Wolfram, D.3    Saracevic, T.4
  • 53
    • 0001465757
    • A statistical method for finding word boundaries in Chinese text
    • Sproat, R. and Shih, C. 1990. A statistical method for finding word boundaries in Chinese text. Comput. Process. Chinese Oriental Lang. 4, 4, 336-351
    • (1990) Comput. Process. Chinese Oriental Lang. , vol.4 , Issue.4 , pp. 336-351
    • Sproat, R.1    Shih, C.2
  • 57
    • 0003756969
    • Managing Gigabytes: Compressing and Indexing Documents and Images
    • Morgan Kaufmann, San Francisco, CA
    • Witten, I. H., Moffat, A., and Bell, T. C. 1999. Managing Gigabytes: Compressing and Indexing Documents and Images. Morgan Kaufmann, San Francisco, CA
    • (1999)
    • Witten, I.H.1    Moffat, A.2    Bell, T.C.3
  • 58
    • 85100864264
    • Improving Chinese tokenization with linguistic filters on statistical lexical acquisition
    • (Stuttgart, Germany)
    • Wu, D. and Fung, P. 1994. Improving Chinese tokenization with linguistic filters on statistical lexical acquisition. In Proceedings of the 4th Conference on Applied Natural Language Processing (Stuttgart, Germany), 180-181
    • (1994) Proceedings of the 4th Conference on Applied Natural Language Processing , pp. 180-181
    • Wu, D.1    Fung, P.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.