Volume 181, Issue 1, 2011, Pages 163-183

Integrating unsupervised and supervised word segmentation: The role of goodness measures

Author keywords

Accessor variety; Boundary entropy; Character tagging; Chinese word segmentation; Conditional random fields; Description length gain; Unknown word detection; Unsupervised segmentation

Indexed keywords

ACCESSOR VARIETY; BOUNDARY ENTROPY; CHARACTER TAGGING; CHINESE WORD SEGMENTATION; CONDITIONAL RANDOM FIELD; DESCRIPTION LENGTH GAIN; UNKNOWN WORD DETECTION; UNSUPERVISED SEGMENTATION

EID: 77958083019     PISSN: 00200255     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.ins.2010.09.008     Document Type: Article
Times cited: 55

References (60)
  • 6
    • 2142787936
    • Accessor variety criteria for Chinese word extraction
    • H. Feng, K. Chen, X. Deng, and W. Zheng Accessor variety criteria for Chinese word extraction Computational Linguistics 30 1 2004 75 93
    • (2004) Computational Linguistics , vol.30 , Issue.1 , pp. 75-93
    • Feng, H.1    Chen, K.2    Deng, X.3    Zheng, W.4
  • 7
    • 26444614686
    • Unsupervised segmentation of Chinese corpus using accessor variety
    • Natural Language Processing - IJCNLP 2004
    • H. Feng, K. Chen, C. Kit, and X. Deng Unsupervised segmentation of Chinese corpus using accessor variety K.-Y. Su, J. Tsujii, J.H. Lee, O.Y. Kwong, Natural Language Processing - IJCNLP 2004 LNAI vol. 3248 2005 Springer 694 703
    • (2005) LNAI , vol.3248 , pp. 694-703
    • Feng, H.1    Chen, K.2    Kit, C.3    Deng, X.4
  • 9
    • 39749156980
    • Chinese word segmentation as morpheme-based lexical chunking
    • G.-H. Fu, C. Kit, and J.J. Webster Chinese word segmentation as morpheme-based lexical chunking Information Sciences 178 9 2008 2282 2296
    • (2008) Information Sciences , vol.178 , Issue.9 , pp. 2282-2296
    • Fu, G.-H.1    Kit, C.2    Webster, J.J.3
  • 13
    • 0001074490
    • From phoneme to morpheme
    • Z.S. Harris From phoneme to morpheme Language 31 2 1955 190 222
    • (1955) Language , vol.31 , Issue.2 , pp. 190-222
    • Harris, Z.S.1
  • 18
    • 84860537772
    • Semi-supervised conditional random fields for improved sequence segmentation and labeling
    • Sydney, Australia
    • F. Jiao, S. Wang, C.-H. Lee, R. Greiner, D. Schuurmans, Semi-supervised conditional random fields for improved sequence segmentation and labeling, in: COLING/ACL-2006, Sydney, Australia, 2006, pp. 209-216.
    • (2006) COLING/ACL-2006 , pp. 209-216
    • Jiao, F.1    Wang, S.2    Lee, C.-H.3    Greiner, R.4    Schuurmans, D.5
  • 19
    • 85092217204
    • Unsupervised segmentation of Chinese text by use of branching entropy
    • Sydney, Australia
    • Z. Jin, K. Tanaka-Ishii, Unsupervised segmentation of Chinese text by use of branching entropy, in: COLING/ACL 2006, Sydney, Australia, 2006, pp. 428-435.
    • (2006) COLING/ACL 2006 , pp. 428-435
    • Jin, Z.1    Tanaka-Ishii, K.2
  • 21
    • 85090807987
    • Unsupervised learning of word boundary with description length gain
    • Osborne, M., Sang, E.T.K. (Eds.) Bergen, Norway
    • C. Kit, Y. Wilks, Unsupervised learning of word boundary with description length gain, in: Osborne, M., Sang, E.T.K. (Eds.), Computational Natural Language Learning (CoNLL-99), Bergen, Norway, 1999, pp. 1-6.
    • (1999) Computational Natural Language Learning (CoNLL-99) , pp. 1-6
    • Kit, C.1    Wilks, Y.2
  • 24
    • 85119995698
    • The third international Chinese language processing bakeoff: Word segmentation and named entity recognition
    • Sydney, Australia
    • G.-A. Levow, The third international Chinese language processing bakeoff: Word segmentation and named entity recognition, in: Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing (SIGHAN-5), Sydney, Australia, 2006, pp. 108-117.
    • (2006) Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing (SIGHAN-5) , pp. 108-117
    • Levow, G.-A.1
  • 27
    • 26444582893
    • Statistical substring reduction in linear time
    • Natural Language Processing - IJCNLP 2004
    • X. Lü, L. Zhang, and J. Hu Statistical substring reduction in linear time K.-Y. Su, J. Tsujii, J.H. Lee, O.Y. Kwong, Natural Language Processing - IJCNLP 2004 LNAI vol. 3248 2005 Springer 320 327
    • (2005) LNAI , vol.3248 , pp. 320-327
    • Lü, X.1    Zhang, L.2    Hu, J.3
  • 30
    • 0345191309
    • Tokenisation and sentence segmentation
    • D.D. Palmer Tokenisation and sentence segmentation R. Dale, H. Moisl, H. Somers, Handbook of Natural Language Processing 2000 Marcel Dekker New York 11 36
    • (2000) Handbook of Natural Language Processing , pp. 11-36
    • Palmer, D.D.1
  • 34
    • 0040261510
    • USeg: A retargetable word segmentation procedure for information retrieval
    • Technical Report TR96-2, University of Massachusetts, Amherst, MA
    • J.M. Ponte, W.B. Croft, USeg: A retargetable word segmentation procedure for information retrieval, Presented at the Symposium on Document Analysis and Information Retrieval'96 (SDAIR),Technical Report TR96-2, University of Massachusetts, Amherst, MA, 1996.
    • (1996) Symposium on Document Analysis and Information Retrieval'96 (SDAIR)
    • Ponte, J.M.1    Croft, W.B.2
  • 35
    • 38049043319
    • A systematic cross-comparison of sequence classifiers
    • Bethesda, Maryland
    • B. Rosenfeld, R. Feldman, M. Fresko, A systematic cross-comparison of sequence classifiers, in: SDM 2006, Bethesda, Maryland, pp. 563-567.
    • SDM 2006 , pp. 563-567
    • Rosenfeld, B.1    Feldman, R.2    Fresko, M.3
  • 36
    • 84856043672
    • A mathematical theory of communication
    • C.E. Shannon A mathematical theory of communication The Bell System Technical Journal 27 1948 379 423 623-656
    • (1948) The Bell System Technical Journal , vol.27 , pp. 379-423, 623-656
    • Shannon, C.E.1
  • 39
    • 84872841506
    • Chinese word segmentation without using lexicon and hand-crafted training data
    • Montreal, Quebec, Canada
    • M. Sun, D. Shen, B.K. Tsou, Chinese word segmentation without using lexicon and hand-crafted training data, in: COLING-ACL'98, vol. 2, Montreal, Quebec, Canada, 1998, pp. 1265-1271.
    • (1998) COLING-ACL'98 , vol.2 , pp. 1265-1271
    • Sun, M.1    Shen, D.2    Tsou, B.K.3
  • 40
    • 3142751938
    • Chinese word segmentation without using dictionary based on unsupervised learning strategy
    • M. Sun, M. Xiao, and B.K. Tsou Chinese word segmentation without using dictionary based on unsupervised learning strategy Chinese Journal of Computers 27 6 2004 736 742
    • (2004) Chinese Journal of Computers , vol.27 , Issue.6 , pp. 736-742
    • Sun, M.1    Xiao, M.2    Tsou, B.K.3
  • 42
    • 0001277731
    • A compression-based algorithm for Chinese word segmentation
    • W.J. Teahan, Y. Wen, R. McNab, and I.H. Witten A compression-based algorithm for Chinese word segmentation Computational Linguistics 26 3 2000 375 393
    • (2000) Computational Linguistics , vol.26 , Issue.3 , pp. 375-393
    • Teahan, W.J.1    Wen, Y.2    McNab, R.3    Witten, I.H.4
  • 48
    • 55549127511
    • Minimum tag error for discriminative training of conditional random fields
    • Y. Xiong, J. Zhu, H. Huang, and H. Xu Minimum tag error for discriminative training of conditional random fields Information Sciences 179 1-2 2009 169 179
    • (2009) Information Sciences , vol.179 , Issue.1-2 , pp. 169-179
    • Xiong, Y.1    Zhu, J.2    Huang, H.3    Xu, H.4
  • 50
    • 2142726570
    • Extraction of Chinese compound words - An experimental study on a very large corpus
    • Hong Kong, China
    • J. Zhang, J. Gao, M. Zhou, Extraction of Chinese compound words - An experimental study on a very large corpus, in: Proceedings of the Second Chinese Language Processing Workshop, Hong Kong, China, 2000, pp. 132-139.
    • (2000) Proceedings of the Second Chinese Language Processing Workshop , pp. 132-139
    • Zhang, J.1    Gao, J.2    Zhou, M.3
  • 55
    • 84871054273
    • Unsupervised segmentation helps supervised learning of character tagging for word segmentation and named entity recognition
    • Hyderabad, India
    • H. Zhao, C. Kit, Unsupervised segmentation helps supervised learning of character tagging for word segmentation and named entity recognition, in: The Sixth SIGHAN Workshop on Chinese Language Processing (SIGHAN-6), Hyderabad, India, 2008, pp. 106-111.
    • (2008) The Sixth SIGHAN Workshop on Chinese Language Processing (SIGHAN-6) , pp. 106-111
    • Zhao, H.1    Kit, C.2
  • 56
    • 70450183849
    • An empirical comparison of goodness measures for unsupervised Chinese word segmentation with a unified framework
    • Hyderabad, India
    • H. Zhao, C. Kit, An empirical comparison of goodness measures for unsupervised Chinese word segmentation with a unified framework, in: The Third International Joint Conference on Natural Language Processing (IJCNLP-2008), vol. 1, Hyderabad, India, 2008, pp. 9-16.
    • (2008) The Third International Joint Conference on Natural Language Processing (IJCNLP-2008) , vol.1 , pp. 9-16
    • Zhao, H.1    Kit, C.2
  • 58
    • 49349140728
    • Scaling conditional random fields by one-against-the-other decomposition
    • H. Zhao, and C. Kit Scaling conditional random fields by one-against-the-other decomposition Journal of Computer Science and Technology 23 4 2008 612 619
    • (2008) Journal of Computer Science and Technology , vol.23 , Issue.4 , pp. 612-619
    • Zhao, H.1    Kit, C.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.