SCOPUS Information Search Platform - View Article

Computer Speech and Language

Volume 20, Issue 4, 2006, Pages 515-541

Unlimited vocabulary speech recognition with morph language models applied to Finnish

(6) Hirsimäki, Teemu ᵃ; Creutz, Mathias ᵃ; Siivola, Vesa ᵃ; Kurimo, Mikko ᵃ; Virpioja, Sami ᵃ; Pylkkönen, Janne ᵃ

ᵃ Aalto University (Finland)

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; COMPUTER PROGRAMMING LANGUAGES; COMPUTER SIMULATION; MATHEMATICAL MODELS; PROBLEM SOLVING; PROFESSIONAL ASPECTS; WORD PROCESSING;

FINNISH RECOGNITION TASKS; MINIMUM DESCRIPTION LENGTH PRINCIPLE; N-GRAM MODELS; RELATIVE ERROR RATE REDUCTIONS;

SPEECH RECOGNITION;

EID: 33746524944
PISSN: 08852308
EISSN: 10958363
Source Type: Journal
DOI: 10.1016/j.csl.2005.07.002
Document Type: Article

Times cited : (99)

References (43)

1
- 33746549382
- Argamon, S., Akiva, N., Amir, A., Kapah, O., 2004. Efficient unsupervised recursive word segmentation using minimum description length. In: Proceedings of the 20th International Conference on Computational Linguistics (COLING).

2
- 0036460898
- An overview of decoding techniques for large vocabulary continuous speech recognition
- Aubert X.L. An overview of decoding techniques for large vocabulary continuous speech recognition. Computer Speech and Language 16 1 (2002) 89-114
- (2002) Computer Speech and Language , vol.16 , Issue.1 , pp. 89-114
- Aubert, X.L.¹

3
- 33746504143
- Bilmes, J.A., Kirchhoff, K., 2003. Factored language models and generalized parallel backoff. In: Proceedings of the Human Language Technology Conference (HLT/NAACL), pp. 4-6.

4
- 33746528332
- Bonafonte, A., Ros, X., Mariño, J.B., 1993. An efficient algorithm to find the best state sequence in HSMM. In: Proceedings of the 3rd European Conference on Speech Communication and Technology (Eurospeech), pp. 1547-1550.

5
- 0032677683
- An efficient, probabilistically sound algorithm for segmentation and word discovery
- Brent M.R. An efficient, probabilistically sound algorithm for segmentation and word discovery. Machine Learning 34 (1999) 71-105
- (1999) Machine Learning , vol.34 , pp. 71-105
- Brent, M.R.¹

6
- 85009072617
- Byrne, W., Hajič, J., Ircing, P., Jelinek, F., Khudanpur, S., Krbec, P., Psutka, J., 2001. On large vocabulary continuous speech recognition of highly inflectional language - Czech. In: Proceedings of the 7th European Conference on Speech Communication and Technology (Eurospeech), pp. 487-489.

7
- 84947550808
- Byrne, W.J., Hajič, J., Krbec, P., Ircing, P., Psutka, J., 2000. Morpheme based language models for speech recognition of Czech. In: Proceedings of the Third International Workshop on Text, Speech, Dialogue (TSD), pp. 211-216.

8
- 33746560354
- Chen, S.F., 1996. Building probabilistic models for natural language. Ph.D. Thesis, Harvard University.

9
- 0033329799
- An empirical study of smoothing techniques for language modeling
- Chen S.F., and Goodman J. An empirical study of smoothing techniques for language modeling. Computer Speech and Language 13 4 (1999) 359-393
- (1999) Computer Speech and Language , vol.13 , Issue.4 , pp. 359-393
- Chen, S.F.¹ Goodman, J.²

10
- 33746563698
- Creutz, M., 2003. Unsupervised segmentation of words using prior distributions of morph length and frequency. In: Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL), pp. 280-287.

11
- 33746546470
- Creutz, M., Lagus, K., 2002. Unsupervised discovery of morphemes. In: Proceedings of the Workshop on Morphological and Phonological Learning of ACL-02, pp. 21-30.

12
- 33746505743
- Creutz, M., Lagus, K., 2005. Unsupervised morpheme segmentation and morphology induction from text corpora using Morfessor 1.0. Tech. Rep. A81, Publications in Computer and Information Science, Helsinki University of Technology. Available from: .

13
- 33746487626
- Creutz, M., Lindén, K., 2004. Morpheme segmentation gold standards for Finnish and English. Tech. Rep. A77, Publications in Computer and Information Science, Helsinki University of Technology. Available from: .

14
- 0031273765
- Inference of variable-length linguistic and acoustic units by multigrams
- Deligne S., and Bimbot F. Inference of variable-length linguistic and acoustic units by multigrams. Speech Communication 23 3 (1997) 223-241
- (1997) Speech Communication , vol.23 , Issue.3 , pp. 223-241
- Deligne, S.¹ Bimbot, F.²

15
- 0032638856
- Semi-tied covariance matrices for hidden Markov models
- Gales M.J.F. Semi-tied covariance matrices for hidden Markov models. IEEE Transactions on Speech and Audio Processing 7 3 (1999) 272-281
- (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , Issue.3 , pp. 272-281
- Gales, M.J.F.¹

16
- 84892143101
- Geutner, P., Finke, M., Scheytt, P., 1998. Adaptive vocabularies for transcribing multilingual broadcast news. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). vol. 2, pp. 925-928.

17
- 0041079008
- Unsupervised learning of the morphology of a natural language
- Goldsmith J. Unsupervised learning of the morphology of a natural language. Computational Linguistics 27 2 (2001) 153-198
- (2001) Computational Linguistics , vol.27 , Issue.2 , pp. 153-198
- Goldsmith, J.¹

18
- 33746541307
- Goodman, J.T., 2001. A bit of progress in language modeling, extended version. Tech. Rep. MSR-TR-2001-72, Microsoft Research, Extended version of a paper with the same title published in Computer Speech and Language 15, 403-434.

19
- 33846253444
- Hacioglu, K., Pellom, B., Ciloglu, T., Ozturk, O., Kurimo, M., Creutz, M., 2003. On lexicon creation for Turkish LVCSR. In: Proceedings of the 8th European Conference on Speech Communication and Technology (Eurospeech), pp. 1165-1168.

20
- 33746554555
- Hakulinen, L., 1979. Suomen kielen rakenne ja kehitys (The structure and development of the Finnish language), 4th Ed., Kustannus-Oy Otava.

21
- 0032687479
- Johnson, S., Jourlin, P., Moore, G., Jones, K.S., Woodland, P., 1999. The Cambridge University spoken document retrieval system. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 49-52.

22
- 0022249890
- Juang, B.H., Rabiner, L.R., Levinson, S.E., Sondhi, M.M., 1985. Recent developments in the application of hidden Markov models to speaker-independent isolated word recognition. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 9-12.

23
- 0141703242
- Kirchhoff, K., Bilmes, J., Das, S., Duta, N., Egan, M., Ji, G., He, F., Henderson, J., Liu, D., Noamany, M., Schone, P., Schwartz, R., Vergyri, D., 2003. Novel approaches to Arabic speech recognition: Report from the 2002 Johns-Hopkins workshop. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 1, pp. 344-347.

24
- 85009154893
- Kneissler, J., Klakow, D., 2001. Speech recognition for huge vocabularies by using optimized sub-word units. In: Proceedings of the 7th European Conference on Speech Communication and Technology (Eurospeech), pp. 69-72.

25
- 33746528934
- Koskenniemi, K., 1983. Two-level morphology: a general computational model for word-form recognition and production. Ph.D. Thesis, University of Helsinki.

26
- 33746542783
- Kurimo, M., 1997. Using self-organizing maps and learning vector quantization for mixture density hidden Markov models. Ph.D. Thesis, Helsinki University of Technology.

27
- 0037290509
- Korean large vocabulary continuous speech recognition with morpheme-based recognition units
- Kwon O.-W., and Park J. Korean large vocabulary continuous speech recognition with morpheme-based recognition units. Speech Communication 39 3-4 (2003) 287-300
- (2003) Speech Communication , vol.39 , Issue.3-4 , pp. 287-300
- Kwon, O.-W.¹ Park, J.²

28
- 85009228865
- McTait, K., Adda-Decker, M., 2003. The 300k LIMSI German broadcast news transcription system. In: Proceedings of the 8th European Conference on Speech Communication and Technology (Eurospeech), pp. 213-216.

29
- 0003695238
- Introduction to Probability and Statistics
- Milton J.S., and Arnold J.C. Introduction to Probability and Statistics. third ed. (1995), McGraw-Hill, New York (Chapter 10)
- (1995) Introduction to Probability and Statistics. third ed.
- Milton, J.S.¹ Arnold, J.C.²

30
- 85009170957
- Ordelman, R., van Hessen, A., de Jong, F., 2003. Compound decomposition in Dutch large vocabulary speech recognition. In: Proceedings of the 8th European Conference on Speech Communication and Technology (Eurospeech), pp. 225-228.

31
- 11844294898
- Pylkkönen, J., Kurimo, M., 2004. Using phone durations in Finnish large vocabulary continuous speech recognition. In: Proceedings of the 6th Nordic Signal Processing Symposium (Norsig), pp. 324-326.

32
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- Rabiner L.R. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77 2 (1989) 257-286
- (1989) Proceedings of the IEEE , vol.77 , Issue.2 , pp. 257-286
- Rabiner, L.R.¹

33
- 0003250456
- Stochastic complexity in statistical inquiry
- Rissanen J. Stochastic complexity in statistical inquiry. World Scientific Series in Computer Science 15 (1989) 79-93
- (1989) World Scientific Series in Computer Science , vol.15 , pp. 79-93
- Rissanen, J.¹

34
- 0023214109
- Russell, M.J., Cook, A.E., 1987. Experimental evaluation of duration modelling techniques for automatic speech recognition. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 4, pp. 2376-2379.

35
- 85009186256
- Siivola, V., Hirsimäki, T., Creutz, M., Kurimo, M., 2003. Unlimited vocabulary speech recognition based on morphs discovered in an unsupervised manner. In: Proceedings of the 8th European Conference on Speech Communication and Technology (Eurospeech), pp. 2293-2296.

36
- 84891308106
- Stolcke, A., 2002. SRILM - an extensible language modeling toolkit. In: Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP), pp. 901-904. Available from: .

37
- 65349171767
- Szarvas, M., Furui, S., 2003. Evaluation of the stochastic morphosyntactic language model on a one million word Hungarian task. In: Proceedings of the 8th European Conference on Speech Communication and Technology (Eurospeech), pp. 2297-2300.

38
- 0001277731
- A compression based algorithm for Chinese word segmentation
- Teahan W.J., Wen Y., McNab R., and Witten I.H. A compression based algorithm for Chinese word segmentation. Computational Linguistics 26 3 (2000) 375-393
- (2000) Computational Linguistics , vol.26 , Issue.3 , pp. 375-393
- Teahan, W.J.¹ Wen, Y.² McNab, R.³ Witten, I.H.⁴

39
- 0008501167
- A statistical model for word discovery in transcribed speech
- Venkataraman A. A statistical model for word discovery in transcribed speech. Computational Linguistics 27 3 (2001) 352-372
- (2001) Computational Linguistics , vol.27 , Issue.3 , pp. 352-372
- Venkataraman, A.¹

40
- 85009110467
- Vergyri, D., Kirchhoff, K., Duh, K., Stolcke, A., 2004. Morphology-based language modeling for Arabic speech recognition. In: Proceedings of the 8th International Conference on Spoken Language Processing (ICSLP), vol. 3, pp. 2245-2248.

41
- 85009115852
- Whittaker, E., Raj, B., 2001. Quantization-based language model compression. In: Proceedings of the 7th European Conference on Speech Communication and Technology (Eurospeech), vol. 1, pp. 33-36.

42
- 84961349892
- Whittaker, E., Woodland, P., 2000. Particle-based language modelling. In: Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP), pp. 170-173.

43
- 0033708110
- Willett, D., Neukirchen, C., Rigoll, G., 2000. Ducoder - the Duisburg University LVCSR stackdecoder. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1555-1558.

* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.