-
1
-
-
0037956008
-
Chinese text segmentation with MBDP-I: Making the most of training corpora
-
Brent, M. R. and Tao, X. (2001) Chinese text segmentation with MBDP-I: Making the most of training corpora, Proc. ACL.
-
(2001)
Proc. ACL.
-
-
Brent, M.R.1
Tao, X.2
-
2
-
-
84936824188
-
Word association norms, mutual information, and lexicography
-
Church, K. W. and Hanks, P. (1990) Word association norms, mutual information, and lexicography. Computational Linguistics 16(1): 22-29.
-
(1990)
Computational Linguistics
, vol.16
, Issue.1
, pp. 22-29
-
-
Church, K.W.1
Hanks, P.2
-
4
-
-
85029362060
-
A new statistical formula for Chinese text segmentation incorporating contextual information
-
Dai, Y., Loh, T. E. and Khoo, C. S. G. (1999) A new statistical formula for Chinese text segmentation incorporating contextual information. Proc. 22nd SIGIR, pp. 82-89.
-
(1999)
Proc. 22nd SIGIR
, pp. 82-89
-
-
Dai, Y.1
Loh, T.E.2
Khoo, C.S.G.3
-
5
-
-
85149140829
-
Japanese morphological analyzer using word co-occurrence - JTAG
-
Fuchi, T. and Takagi, S. (1998) Japanese morphological analyzer using word co-occurrence - JTAG. Proc. COLING-ACL '98, pp. 409-413.
-
(1998)
Proc. COLING-ACL '98
, pp. 409-413
-
-
Fuchi, T.1
Takagi, S.2
-
6
-
-
1542381209
-
Extracting key terms from Chinese and Japanese texts
-
Fung, P. (1998) Extracting key terms from Chinese and Japanese texts. Computer Processing of Oriental Languages 12(1).
-
(1998)
Computer Processing of Oriental Languages
, vol.12
, Issue.1
-
-
Fung, P.1
-
7
-
-
84958571083
-
Discovering Chinese words from unsegmented text
-
(Poster abstract.)
-
Ge, X., Pratt, W. and Smyth, P. (1999) Discovering Chinese words from unsegmented text. Proc. 22nd SIGIR, pp. 271-272. (Poster abstract.)
-
(1999)
Proc. 22nd SIGIR
, pp. 271-272
-
-
Ge, X.1
Pratt, W.2
Smyth, P.3
-
8
-
-
0041079008
-
Unsupervised learning of the morphology of a natural language
-
Goldsmith, J. (2001) Unsupervised learning of the morphology of a natural language. Computational Linguistics 27(2): 153-198.
-
(2001)
Computational Linguistics
, vol.27
, Issue.2
, pp. 153-198
-
-
Goldsmith, J.1
-
9
-
-
0038293492
-
Evaluating parsing strategies using standardized parse files
-
Grishman, R., Macleod, C. and Sterling, J. (1992) Evaluating parsing strategies using standardized parse files. Proc. 3rd ANLP, pp. 156-161.
-
(1992)
Proc. 3rd ANLP
, pp. 156-161
-
-
Grishman, R.1
Macleod, C.2
Sterling, J.3
-
10
-
-
0037617943
-
Language modeling by string pattern N-gram for Japanese speech recognition
-
Ito, A. and Kohda, K. (1995) Language modeling by string pattern N-gram for Japanese speech recognition. Proc. ICASSP.
-
(1995)
Proc. ICASSP.
-
-
Ito, A.1
Kohda, K.2
-
11
-
-
0038632333
-
Use of mutual information based character clusters in dictionary-less morphological analysis of Japanese
-
Kashioka, H., Kawata, Y., Kinjo, Y., Finch, A. and Black, E. W. (1998) Use of mutual information based character clusters in dictionary-less morphological analysis of Japanese. Proc. COLING-ACL '98, pp. 658-662.
-
(1998)
Proc. COLING-ACL '98
, pp. 658-662
-
-
Kashioka, H.1
Kawata, Y.2
Kinjo, Y.3
Finch, A.4
Black, E.W.5
-
12
-
-
0038293468
-
Japanese morphological analysis system JUMAN version 3.61 manual
-
In Japanese
-
Kurohashi, S. and Nagao, M. (1999) Japanese morphological analysis system JUMAN version 3.61 manual. In Japanese.
-
(1999)
-
-
Kurohashi, S.1
Nagao, M.2
-
13
-
-
0008686260
-
Experiments on the use of bigram mutual information for Chinese natural language processing
-
Lua, K.-T. (1995) Experiments on the use of bigram mutual information for Chinese natural language processing. Int. Conf. Computer Processing of Oriental Languages (ICCPOL), pp. 23-25.
-
(1995)
Int. Conf. Computer Processing of Oriental Languages (ICCPOL)
, pp. 23-25
-
-
Lua, K.-T.1
-
15
-
-
85158028663
-
Parsing a natural language using information statistics
-
Magerman, D. M. and Marcus, M. P. (1990) Parsing a natural language using information statistics. Proc. AAAI, pp. 984-989.
-
(1990)
Proc. AAAI
, pp. 984-989
-
-
Magerman, D.M.1
Marcus, M.P.2
-
16
-
-
0027681165
-
Suffix arrays: A new method for on-line string searches
-
Manber, U. and Myers, G. (1993) Suffix arrays: A new method for on-line string searches. SIAM J. Comput. 22(5): 935-948.
-
(1993)
SIAM J. Comput.
, vol.22
, Issue.5
, pp. 935-948
-
-
Manber, U.1
Myers, G.2
-
18
-
-
0004067802
-
-
Technical Report NAIST-IS-TR97007, Nara Institute of Science and Technology. In Japanese
-
Matsumoto, Y., Kitauchi, A., Yamashita, T., Hirano, Y., Imaichi, O., and Imamura, T. (1997) Japanese morphological analysis system ChaSen manual. Technical Report NAIST-IS-TR97007, Nara Institute of Science and Technology. In Japanese.
-
(1997)
Japanese Morphological Analysis System ChaSen Manual
-
-
Matsumoto, Y.1
Kitauchi, A.2
Yamashita, T.3
Hirano, Y.4
Imaichi, O.5
Imamura, T.6
-
20
-
-
0037617911
-
Unknown word extraction from corpora using n-gram statistics
-
In Japanese
-
Mori, S. and Nagao, M. (1998) Unknown word extraction from corpora using n-gram statistics. J. Infor. Process. Soc. Japan 39(7): 2093-2100. In Japanese.
-
(1998)
J. Infor. Process. Soc. Japan
, vol.39
, Issue.7
, pp. 2093-2100
-
-
Mori, S.1
Nagao, M.2
-
21
-
-
0006702241
-
A new method of N-gram statistics for large number of n and automatic extraction of words and phrases from large text data of Japanese
-
Nagao, M. and Mori, S. (1994) A new method of N-gram statistics for large number of n and automatic extraction of words and phrases from large text data of Japanese. Proc. 15th COLING, pp. 611-615.
-
(1994)
Proc. 15th COLING
, pp. 611-615
-
-
Nagao, M.1
Mori, S.2
-
22
-
-
0038293491
-
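A stochastic Japanese morphological analyzer using a forward-DP backward-A* n-best search algorithm
-
Nagata, M. (1994) A stochastic Japanese morphological analyzer using a forward-DP backward-A* n-best search algorithm. Proc. 15th COLING, pp. 201-207.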
-
(1994)
Proc. 15th COLING
, pp. 201-207
-
-
Nagata, M.1
-
23
-
-
0037617941
-
Automatic extraction of new words from Japanese texts using generalized forward-backward search
-
Nagata, M. (1996a) Automatic extraction of new words from Japanese texts using generalized forward-backward search. Proc. Conf. Empirical Methods in Natural Language Processing, pp. 48-59.
-
(1996)
Proc. Conf. Empirical Methods in Natural Language Processing
, pp. 48-59
-
-
Nagata, M.1
-
24
-
-
0037617940
-
Context-based spelling correction for Japanese OCR
-
Nagata, M. (1996b) Context-based spelling correction for Japanese OCR. Proc. 16th COLING, pp. 806-811.
-
(1996)
Proc. 16th COLING
, pp. 806-811
-
-
Nagata, M.1
-
25
-
-
0038632330
-
A self-organizing Japanese word segmenter using heuristic word identification and re-estimation
-
Nagata, M. (1997) A self-organizing Japanese word segmenter using heuristic word identification and re-estimation. Proc. 5th Workshop on Very Large Corpora, pp. 203-215.
-
(1997)
Proc. 5th Workshop on Very Large Corpora
, pp. 203-215
-
-
Nagata, M.1
-
26
-
-
0033349349
-
Overlapping statistical segmentation for effective indexing of Japanese text
-
Ogawa, Y. and Matsuda, T. (1999) Overlapping statistical segmentation for effective indexing of Japanese text. Infor. Process. & Manage. 35: 463-480.
-
(1999)
Infor. Process. & Manage
, vol.35
, pp. 463-480
-
-
Ogawa, Y.1
Matsuda, T.2
-
27
-
-
85072855288
-
A trainable rule-based algorithm for word segmentation
-
Palmer, D. (1997) A trainable rule-based algorithm for word segmentation. Proc. 35th ACL/8th EACL, pp. 321-328.
-
(1997)
Proc. 35th ACL/8th EACL
, pp. 321-328
-
-
Palmer, D.1
-
30
-
-
0035743477
-
The use of predictive dependencies in language learning
-
Saffran, J. R. (2001) The use of predictive dependencies in language learning. J. Memory & Language 44: 493-515.
-
(2001)
J. Memory & Language
, vol.44
, pp. 493-515
-
-
Saffran, J.R.1
-
31
-
-
0030451408
-
Statistical learning by 8-month-old infants
-
Saffran, J. R., Aslin, R. N. and Newport, E. L. (1996) Statistical learning by 8-month-old infants. Science 274(5294): 1926-1928.
-
(1996)
Science
, vol.274
, Issue.5294
, pp. 1926-1928
-
-
Saffran, J.R.1
Aslin, R.N.2
Newport, E.L.3
-
33
-
-
0001076101
-
A stochastic finite-state word-segmentation algorithm for Chinese
-
Sproat, R., Shih, C., Gale, W. and Chang, N. (1996) A stochastic finite-state word-segmentation algorithm for Chinese. Computational Linguistics 22(3): 377-404.
-
(1996)
Computational Linguistics
, vol.22
, Issue.3
, pp. 377-404
-
-
Sproat, R.1
Shih, C.2
Gale, W.3
Chang, N.4
-
34
-
-
0008573508
-
Chinese word segmentation without using lexicon and hand-crafted training data
-
Sun, M., Shen, D. and Tsou, B. K. (1998) Chinese word segmentation without using lexicon and hand-crafted training data. Proc. COLING-ACL, pp. 1265-1271.
-
(1998)
Proc. COLING-ACL
, pp. 1265-1271
-
-
Sun, M.1
Shen, D.2
Tsou, B.K.3
-
35
-
-
0037617942
-
Automatic decomposition of kanji compound words using stochastic estimation
-
In Japanese
-
Takeda, K. and Fujisaki, T. (1987) Automatic decomposition of kanji compound words using stochastic estimation. J. Infor. Process. Soc. Japan 28(9): 952-961. In Japanese.
-
(1987)
J. Infor. Process. Soc. Japan
, vol.28
, Issue.9
, pp. 952-961
-
-
Takeda, K.1
Fujisaki, T.2
-
37
-
-
0001277731
-
A compression-based algorithm for Chinese word segmentation
-
Teahan, W. J., Wen, Y., McNab, R. and Witten, I. H. (2000) A compression-based algorithm for Chinese word segmentation. Computational Linguistics 26(3): 375-393.
-
(2000)
Computational Linguistics
, vol.26
, Issue.3
, pp. 375-393
-
-
Teahan, W.J.1
Wen, Y.2
McNab, R.3
Witten, I.H.4
-
38
-
-
0028587505
-
A probabilistic algorithm for segmenting non-kanji Japanese strings
-
Teller, V. and Batchelder, E. O. (1994) A probabilistic algorithm for segmenting non-kanji Japanese strings. Proc. 12th AAAI, pp. 742-747.
-
(1994)
Proc. 12th AAAI
, pp. 742-747
-
-
Teller, V.1
Batchelder, E.O.2
-
41
-
-
0038293493
-
Corpus-based automatic compound extraction with mutual information and relative frequency count
-
Wu, M.-W. and Su, K.-Y. (1993) Corpus-based automatic compound extraction with mutual information and relative frequency count. Proc. R.O.C. Computational Linguistics Conference VI, pp. 207-216.
-
(1993)
Proc. R.O.C. Computational Linguistics Conference VI
, pp. 207-216
-
-
Wu, M.-W.1
Su, K.-Y.2
-
42
-
-
84989592173
-
Chinese text segmentation for text retrieval: Achievements and problems
-
Wu, Z. and Tseng, G. (1993) Chinese text segmentation for text retrieval: Achievements and problems. J. Am. Soc. Infor. Sci. 44(9): 532-542.
-
(1993)
J. Am. Soc. Infor. Sci.
, vol.44
, Issue.9
, pp. 532-542
-
-
Wu, Z.1
Tseng, G.2
-
43
-
-
0038632297
-
A re-estimation method for stochastic language modeling from ambiguous observations
-
Yamamoto, M. (1996) A re-estimation method for stochastic language modeling from ambiguous observations. Proc. 4th Workshop on Very Large Corpora, pp. 155-167.
-
(1996)
Proc. 4th Workshop on Very Large Corpora
, pp. 155-167
-
-
Yamamoto, M.1
-
44
-
-
0038632285
-
Using suffix arrays to compute term frequency and document frequency for all substrings in a corpus
-
Yamamoto, M. and Church, K. W. (2001) Using suffix arrays to compute term frequency and document frequency for all substrings in a corpus. Computational Linguistics 27(1): 1-30.
-
(2001)
Computational Linguistics
, vol.27
, Issue.1
, pp. 1-30
-
-
Yamamoto, M.1
Church, K.W.2
-
45
-
-
0038293494
-
LINGSTAT: An interactive machine-aided translation system
-
Yamron, J., Baker, J., Bamberg, P., Chevalier, H., Dietzel, T., Elder, J., Kampmann, F., Mandel, M., Manganaro, L., Margolis, T. and Steele, E. (1993) LINGSTAT: An interactive machine-aided translation system. Proc. Human Language Technologies Workshop (HLT), pp. 191-195.
-
(1993)
Proc. Human Language Technologies Workshop (HLT)
, pp. 191-195
-
-
Yamron, J.1
Baker, J.2
Bamberg, P.3
Chevalier, H.4
Dietzel, T.5
Elder, J.6
Kampmann, F.7
Mandel, M.8
Manganaro, L.9
Margolis, T.10
Steele, E.11