



Volume 23, Issue 2, 2009, Pages 287-312

Using data-driven subword units in language model of highly inflective Slovenian language

Author keywords

Inflective language; Speech recognition; Statistical language modeling; Subword units

Indexed keywords

COMPLEX MORPHOLOGIES; DATA SPARSITIES; DATA-DRIVEN; DATA-DRIVEN METHODS; INFLECTIVE LANGUAGE; LANGUAGE MODELING; LANGUAGE MODELS; LARGE VOCABULARY SPEECH RECOGNITION; MINIMUM ENTROPIES; N-GRAM MODELS; PRIOR KNOWLEDGE; PROBABILISTIC LANGUAGES; SLOVENIAN LANGUAGES; SPEECH DATABASE; STATISTICAL LANGUAGE MODELING; SUBWORD; SUBWORD UNITS; TEST SET PERPLEXITIES; TEST SETS; TEXT CORPORA; TRAINING CORPORA; WORD-BASED MODELS;

EID: 65249183154     PISSN: 02180014     EISSN: None     Source Type: Journal    
DOI: 10.1142/S0218001409007119     Document Type: Article
Times cited : (4)

References (32)
  • 1
    • 0025725905
    • Instance-based learning algorithms
    • D. W. Aha, D. Kibler and M. Albert, Instance-based learning algorithms, Mach. Learn. 6 (1991) 37-66.
    • (1991) Mach. Learn , vol.6 , pp. 37-66
    • Aha, D.W.1    Kibler, D.2    Albert, M.3
  • 2
    • 33745683825
    • A unified language model for large vocabulary continuous speech recognition of Turkish
    • E. Arisoy, H. Dutagaci and L. M. Arslan, A unified language model for large vocabulary continuous speech recognition of Turkish, Sign. Process. 86(10) (2006) 2844-2862.
    • (2006) Sign. Process , vol.86 , Issue.10 , pp. 2844-2862
    • Arisoy, E.1    Dutagaci, H.2    Arslan, L.M.3
  • 3
    • 84867919822
    • Transformation-based error-driven learning and natural language processing: A case study in part of speech tagging
    • E. Brill, Transformation-based error-driven learning and natural language processing: a case study in part of speech tagging, Comput. Ling. 21(4) (1995).
    • (1995) Comput. Ling , vol.21 , Issue.4
    • Brill, E.1
  • 5
    • 0026406324
    • Three different probabilistic language models: Comparison and combination
    • H. Cerf-Danon and M. El-Beze, Three different probabilistic language models: comparison and combination, Proc. ICASSP 1 (1991) 297-300.
    • (1991) Proc. ICASSP , vol.1 , pp. 297-300
    • Cerf-Danon, H.1    El-Beze, M.2
  • 6
    • 33947644193
    • Morpheme-based language modeling for Arabic LVCSR
    • G. Choueiter, D. Povey, S. F. Chen and G. Zweig, Morpheme-based language modeling for Arabic LVCSR, Proc. ICASSP 1 (2007) 1053-1056.
    • (2007) Proc. ICASSP , vol.1 , pp. 1053-1056
    • Choueiter, G.1    Povey, D.2    Chen, S.F.3    Zweig, G.4
  • 9
    • 0041079008
    • Unsupervised learning of the morphology of a natural language
    • J. Goldsmith, Unsupervised learning of the morphology of a natural language, Comput. Ling. 27 (2001) 153-198.
    • (2001) Comput. Ling , vol.27 , pp. 153-198
    • Goldsmith, J.1
  • 10
    • 78650869754
    • S. Goldwater, T. L. Griffiths and M. Johnson, Contextual dependencies in unsupervised word segmentation, Proc. Coling/ACL (2006).
  • 11
    • 85037065809
    • Tagging inflective languages: Prediction of morphological categories for a rich, structured tagset
    • J. Hajič and B. Hladka, Tagging inflective languages: prediction of morphological categories for a rich, structured tagset, Proc. ACL/Coling (1997) pp. 483-490.
    • (1997) Proc. ACL/Coling , pp. 483-490
    • Hajič, J.1    Hladka, B.2
  • 13
    • 0027228903
    • Automatic word classification using simulated annealing
    • M. Jardino and G. Adda, Automatic word classification using simulated annealing, Proc. ICASSP 2 (1993) 41-44.
    • (1993) Proc. ICASSP , vol.2 , pp. 41-44
    • Jardino, M.1    Adda, G.2
  • 15
    • 85009287396
    • Issues in design and collection of large telephone speech corpus for Slovenian language
    • Z. Kačič, B. Horvat and A. Zögling, Issues in design and collection of large telephone speech corpus for Slovenian language, Proc. LREC (2000), pp. 943-946.
    • (2000) Proc. LREC , pp. 943-946
    • Kačič, Z.1    Horvat, B.2    Zögling, A.3
  • 16
    • 85009154893
    • Speech recognition for huge vocabularies by using optimized subword units
    • J. Kneissler and D. Klakow, Speech recognition for huge vocabularies by using optimized subword units, Proc. Eurospeech (2001), pp. 69-72.
    • (2001) Proc. Eurospeech , pp. 69-72
    • Kneissler, J.1    Klakow, D.2
  • 17
    • 33846956351
    • A Markov model for the acquisition of morphological structure
    • Technical Report, CMU-CS-03-147, Carnegie Mellon University, Pittsburgh, PA
    • L. Kontorovich, D. Ron and Y. Singer, A Markov model for the acquisition of morphological structure, Technical Report, CMU-CS-03-147, Carnegie Mellon University, Pittsburgh, PA (2003).
    • (2003)
    • Kontorovich, L.1    Ron, D.2    Singer, Y.3
  • 18
    • 84858385446
    • Unlimited vocabulary speech recognition for agglutinative languages
    • North American Chapter of the Association for Computational Linguistics, HLT-NAACL, New York, USA 5-7 June
    • M. Kurimo, A. Puurula, E. Arisoy, V. Siivola, T. Hirsimaki, J. Pylkkonen, T. Alumae and M. Saraclar, Unlimited vocabulary speech recognition for agglutinative languages, Human Language Technology, Conf. North American Chapter of the Association for Computational Linguistics, HLT-NAACL, New York, USA (5-7 June, 2006).
    • (2006) Human Language Technology, Conf
    • Kurimo, M.1    Puurula, A.2    Arisoy, E.3    Siivola, V.4    Hirsimaki, T.5    Pylkkonen, J.6    Alumae, T.7    Saraclar, M.8
  • 19
    • 65249095024
    • M. Kurimo, M. Creutz and V. Turunen, Overview of morpho challenge in CLEF 2007, Unsupervised Morpheme Analysis - Morpho Challenge 2007, Working Notes for the CLEF 2007 Workshop, Budapest, Hungary, URL: http://www.cis.hut.fi/morphochallenge2007/.
  • 20
    • 0037290509
    • Korean large vocabulary continuous speech recognition with morpheme-based recognition units
    • O.-W. Kwon and J. Park, Korean large vocabulary continuous speech recognition with morpheme-based recognition units, Speech Commun. 39(3-4) (2003) 287-300.
    • (2003) Speech Commun , vol.39 , Issue.3-4 , pp. 287-300
    • Kwon, O.-W.1    Park, J.2
  • 21
    • 33847342089
    • Handwriting recognition of whiteboard notes studying the influence of training set size and type
    • M. Liwicki and H. Bunke, Handwriting recognition of whiteboard notes studying the influence of training set size and type, Int. J. Patt. Recogn. Artifi. Intell. 21(1) (2007) 83-98.
    • (2007) Int. J. Patt. Recogn. Artifi. Intell , vol.21 , Issue.1 , pp. 83-98
    • Liwicki, M.1    Bunke, H.2
  • 23
    • 85028698566
    • A technique to automatically assign parts-of-speech to words taking into account word-ending information through a probabilistic model
    • G. Maltese and F. Mancini, A technique to automatically assign parts-of-speech to words taking into account word-ending information through a probabilistic model, Proc. Eurospeech (1991), pp. 753-756.
    • (1991) Proc. Eurospeech , pp. 753-756
    • Maltese, G.1    Mancini, F.2
  • 24
    • 33646907991
    • Two decades of statistical language modeling: Where do we go from here?
    • R. Rosenfeld, Two decades of statistical language modeling: where do we go from here? Proc. IEEE 88 (2000) 1270-1278.
    • (2000) Proc. IEEE , vol.88 , pp. 1270-1278
    • Rosenfeld, R.1
  • 25
    • 34250004339
    • Large vocabulary continuous speech recognition of an inflected language using stems and endings
    • T. Rotovnik, M. S. Maučec and Z. Kačič, Large vocabulary continuous speech recognition of an inflected language using stems and endings, Speech Commun. 49(6) (2007) 437-452.
    • (2007) Speech Commun , vol.49 , Issue.6 , pp. 437-452
    • Rotovnik, T.1    Maučec, M.S.2    Kačič, Z.3
  • 27
    • 84899019621
    • A probabilistic model for learning concatenative morphology
    • M. Snover and M. R. Brent, A probabilistic model for learning concatenative morphology, Proc. NIPS (2003), pp. 1513-1520.
    • (2003) Proc. NIPS , pp. 1513-1520
    • Snover, M.1    Brent, M.R.2
  • 28
    • 84891308106
    • SRILM - an extensible language modeling toolkit
    • A. Stolcke, SRILM - an extensible language modeling toolkit, Proc. ICSLP (2002), pp. 901-904.
    • (2002) Proc. ICSLP , pp. 901-904
    • Stolcke, A.1
  • 29
    • 85050788453
    • Court stenography-to-text (STT) in Hong Kong: A jurilinguistic engineering effort
    • B. K. Tsou, T. B. Y. Lai, K. K. Sin and L. Y. L. Cheung, Court stenography-to-text (STT) in Hong Kong: a jurilinguistic engineering effort, Int. J. Patt. Recogn. Artifi. Intell. 19(2/3) (2006) 99-107.
    • (2006) Int. J. Patt. Recogn. Artifi. Intell , vol.19 , Issue.2-3 , pp. 99-107
    • Tsou, B.K.1    Lai, T.B.Y.2    Sin, K.K.3    Cheung, L.Y.L.4
  • 31
    • 84961349892
    • Particle-based language modeling
    • Beijing, China
    • E. W. D. Whittaker and P. C. Woodland, Particle-based language modeling, Proc. ICSLP, Beijing, China (2000).
    • (2000) Proc. ICSLP
    • Whittaker, E.W.D.1    Woodland, P.C.2
  • 32
    • 65249149852
    • Constructing linguistic oriented language models for large vocabulary speech recognition
    • P. Witschel, Constructing linguistic oriented language models for large vocabulary speech recognition, Proc. Eurospeech (1993), pp. 1199-1202.
    • (1993) Proc. Eurospeech , pp. 1199-1202
    • Witschel, P.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.