



Volume 178, Issue 9, 2008, Pages 2282-2296

Chinese word segmentation as morpheme-based lexical chunking

Author keywords

Chinese word segmentation; Lexical chunking; Lexicalized hidden Markov models; Morpheme segmentation

Indexed keywords

DATA MINING; HIDDEN MARKOV MODELS; NATURAL LANGUAGE PROCESSING SYSTEMS; TEXT PROCESSING

EID: 39749156980     PISSN: 00200255     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.ins.2008.01.001     Document Type: Article
Times cited: 45

References (35)
  • 1
    • 39749101890
    • R.H. Baayen, A corpus-based study of morphological productivity: Statistical analysis and psychological interpretation, Ph.D. Thesis, Free University, Amsterdam, 1989.
  • 2
    • 39749100175
    • T. Emerson, The second international Chinese word segmentation bakeoff, in: Proceedings of the 4th SIGHAN Workshop on Chinese Language Processing, Jeju Island, Korea, 2005, pp. 123-133.
  • 3
    • 0344496050
    • Foo S., and Li H. Chinese word segmentation and its effect on information retrieval. Information Processing and Management 40 1 (2004) 161-190
  • 4
    • 6344285863
    • G. Fu, K.-K. Luke, Chinese unknown word identification as known word tagging, in: Proceedings of the 2004 IEEE International Conference on Machine Learning and Cybernetics (ICMLC 2004), Shanghai, China, 2004, pp. 2612-2617.
  • 5
    • 39749142885
    • Fu G., and Luke K.-K. Chinese named entity recognition using lexicalized HMMs. ACM SIGKDD Explorations Newsletter 7 1 (2005) 19-25
  • 6
    • 39749172618
    • G. Fu, X. Wang, Unsupervised Chinese word segmentation and unknown word identification, in: Proceedings of the 5th Natural Language Processing Pacific Rim Symposium (NLPRS'99), Beijing, China, 1999, pp. 32-37.
  • 7
    • 33646401779
    • Gao J., Li M., Wu A., and Huang C.-N. Chinese word segmentation and named entity recognition: a pragmatic approach. Computational Linguistics 31 4 (2006) 531-574
  • 9
    • 39749110880
    • C.-R. Huang, K.-J. Chen, L.-P. Chang, H.-L. Hsu, An introduction to Academia Sinica balanced corpus, in: Proceedings of ROCLING VIII, 1995, pp. 81-99.
  • 10
    • 39749126437
    • J.-D. Kim, S.-Z. Lee, H.-C. Rim, HMM specialization with selective lexicalization, in: Proceedings of the 1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora (EMNLP-VLC-99), 1999, pp. 121-127.
  • 11
    • 39749156652
    • S.-Z. Lee, T.-J. Tsujii, H.-C. Rim, Lexicalized hidden Markov models for part-of-speech tagging, in: Proceedings of the 18th International Conference on Computational Linguistics (COLING 2000), Saarbrücken, Germany, 2000, pp. 481-487.
  • 12
    • 39749173278
    • G.-A. Levow, The third international Chinese language processing bakeoff: word segmentation and named entity recognition, in: Proceedings of the 5th SIGHAN Workshop on Chinese Language Processing, Sydney, Australia, 2006, pp. 108-117.
  • 13
    • 0346479270
    • Liang N. CDWS - a written Chinese automatic word segmentation system. Journal of Chinese Information Processing 1 2 (1987) 44-52
  • 15
    • 39749194191
    • H.T. Ng, J.K. Low, Chinese part-of-speech tagging: one-at-a-time or all-at-once? Word-based or character-based? in: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP 2004), Barcelona, Spain, 2004, pp. 277-284.
  • 16
    • 0033330466
    • Nie J.-Y., and Ren F. Chinese information retrieval: using characters or words? Information Processing and Management 35 4 (1999) 443-462
  • 18
    • 34250709079
    • Oh H.-J., Myaeng S.H., and Jang M.-G. Semantic passage segmentation based on sentence topics for question answering. Information Sciences 177 18 (2007) 3696-3717
  • 20
    • 39749142416
    • D.D. Palmer, A trainable rule-based algorithm for word segmentation, in: Proceedings of the 35th Annual Meeting of ACL and 8th Conference of the European Chapter of ACL, Madrid, Spain, 1997, pp. 321-328.
  • 21
    • 39749175856
    • F. Peng, F. Feng, A. McCallum, Chinese segmentation and new word detection using conditional random fields, in: Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004), Geneva, Switzerland, 2004, pp. 562-568.
  • 22
    • 39749107559
    • R. Sproat, T. Emerson, The first international Chinese word segmentation bakeoff, in: Proceedings of the 2nd SIGHAN Workshop on Chinese Language Processing, Sapporo, Japan, 2003, pp. 133-143.
  • 23
    • 39749170742
    • R. Sproat, C. Shih, Corpus-based methods in Chinese morphology, in: Proceedings of the 19th International Conference on Computational Linguistics (COLING 2002), Taipei, Taiwan, 2002.
  • 24
    • 0001277731
    • Teahan W.J., Wen Y., McNab R., and Witten I.H. A compression-based algorithm for Chinese word segmentation. Computational Linguistics 26 3 (2000) 375-393
  • 25
    • 0347109938
    • Tou J.T. An intelligent full-text Chinese-English translation system. Information Sciences 125 1-4 (2000) 1-18
  • 26
    • 39749109026
    • H. Tseng, K.-J. Chen, Design of Chinese morphological analyzer, in: Proceedings of the 1st SIGHAN Workshop on Chinese Language Processing, 2002, pp. 1-7.
  • 27
    • 85044979387
    • Wang Z., Zhu X., and Duan H. A statistic study of three-character unknown words in Chinese. Journal of Chinese Language and Computing 15 2 (2005) 113-123
  • 28
    • 39749181198
    • J.J. Webster, C. Kit, Tokenization as the initial phase in NLP, in: Proceedings of the 14th International Conference on Computational Linguistics (COLING 1992), Nantes, France, 1992, pp. 1106-1110.
  • 29
    • 84982438978
    • Wu Z., and Tseng G. ACTS: an automatic Chinese text segmentation system for full text retrieval. Journal of the American Society for Information Science 46 2 (1995) 83-96
  • 31
    • 85120437501
    • Yeh C.-L., and Lee H.-J. Rule-based word identification for Mandarin Chinese sentences - a unification approach. Computer Processing of Chinese and Oriental Languages 5 2 (1991) 97-117
  • 32
    • 6344270988
    • Yu S., Duan H., Zhu S., Swen B., and Chang B. Specification for corpus processing at Peking University: Word segmentation, POS tagging and phonetic notation. Journal of Chinese Language and Computing 13 2 (2003) 121-158
  • 33
    • 39749133095
    • R. Zhang, G. Kikui, E. Sumita, Subword-based tagging for confidence-dependent Chinese word segmentation, in: Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, Sydney, Australia, 2006, pp. 961-968.
  • 34
    • 39749120733
    • H.-P. Zhang, Q. Liu, H. Zhang, X.-Q. Cheng, Automatic recognition of Chinese unknown words based on roles tagging, in: Proceedings of the 1st SIGHAN Workshop on Chinese Language Processing, Taiwan, 2002, pp. 71-77.
  • 35
    • 2442649391
    • Zhang M.-Y., Lu Z.-D., and Zou C.-Y. A Chinese word segmentation based on language situation in processing ambiguous words. Information Sciences 162 3-4 (2004) 275-285


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.