SCOPUS Information Retrieval Platform

Speech Communication

Volume 40, Issue 4, 2003, Pages 503-515

Phonetic alignment: Speech synthesis-based vs. Viterbi-based

Authors (4): Malfrère, F. a,b; Deroo, O. a,b; Dutoit, T. a; Ris, C. a

a FACULTÉ POLYTECHNIQUE DE MONS (Belgium)

b Babel Technologies SA (Belgium)

Author keywords

Hidden Markov models; Hybrid HMM ANN systems; Large speech corpora; Speech segmentation; Speech synthesis

Indexed keywords

CONTINUOUS SPEECH RECOGNITION; DATABASE SYSTEMS; MARKOV PROCESSES; SPEECH ANALYSIS; HIDDEN MARKOV MODELS (HMM); SPEECH SYNTHESIS

EID: 0037850986
PISSN: 01676393
EISSN: None
Source Type: Journal
DOI: 10.1016/S0167-6393(02)00131-0
Document Type: Article

Times cited: (44)

References (36)

1
- 0028996843
- Performance of the IBM large vocabulary continuous speech recognition system on the ARPA Wall Street Journal task
- Bahl L.R., Balakrishnan-Aiyer S., Bellegarda J., Franz M., Gopalakrishnan P., Nahamoo D., Novak M., Padmanabhan M., Picheny M., Roukos S. Performance of the IBM large vocabulary continuous speech recognition system on the ARPA Wall Street Journal task. Proceedings of the International Conference on Acoustics Speech and Signal Processing. 1995;41-44.
- (1995) Proceedings of the International Conference on Acoustics Speech and Signal Processing , pp. 41-44
- Bahl, L.R.¹ Balakrishnan-Aiyer, S.² Bellegarda, J.³ Franz, M.⁴ Gopalakrishnan, P.⁵ Nahamoo, D.⁶ Novak, M.⁷ Padmanabhan, M.⁸ Picheny, M.⁹ Roukos, S.¹⁰

2
- 0016663359
- The Dragon system - An overview
- Baker J.K. The Dragon system - an overview. IEEE Trans. Acoust. Speech Signal Process. 1975;24-29.
- (1975) IEEE Trans. Acoust. Speech Signal Process. , pp. 24-29
- Baker, J.K.¹

3
- 0001862769
- An inequality and associated maximization technique in statistical estimation of probabilistic functions of Markov processes
- Baum L.E. An inequality and associated maximization technique in statistical estimation of probabilistic functions of Markov processes. Inequalities. 3:1972;1-8.
- (1972) Inequalities , vol.3 , pp. 1-8
- Baum, L.E.¹

4
- 0003573244
- Kluwer Academic Publishers
- Bourlard H., Morgan N. Connectionist Speech Recognition - A Hybrid Approach. 1994;Kluwer Academic Publishers.
- (1994) Connectionist Speech Recognition - A Hybrid Approach
- Bourlard, H.¹ Morgan, N.²

5
- 0027646354
- Automatic segmentation and labeling of speech based on hidden Markov models
- Brugnara B., Falavigna D., Omologo M. Automatic segmentation and labeling of speech based on hidden Markov models. Speech Commun. 1993;357-370.
- (1993) Speech Commun. , pp. 357-370
- Brugnara, B.¹ Falavigna, D.² Omologo, M.³

6
- 0021157392
- The French language database: Defining, planning and recording a large database
- Carré, R., Descoudt, R., Eskénazi, M., Mariani, J., Rossi, M., 1984. The French language database: defining, planning and recording a large database. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing.
- (1984) Proceedings of the International Conference on Acoustics, Speech and Signal Processing
- Carré, R.¹ Descoudt, R.² Eskénazi, M.³ Mariani, J.⁴ Rossi, M.⁵

7
- 85078422187
- A preliminary statistical evaluation of manual and automatic segmentation
- Cosi P., Falavigna D., Omologo M. A preliminary statistical evaluation of manual and automatic segmentation. Proceedings of the European Conference on Speech Communication and Technology. 1991;693-696.
- (1991) Proceedings of the European Conference on Speech Communication and Technology , pp. 693-696
- Cosi, P.¹ Falavigna, D.² Omologo, M.³

8
- 0038472315
- Comparison of two different alignment systems: Speech synthesis vs. hybrid HMM/ANN
- Greece: Rhodes
- Deroo O., Malfrère F., Dutoit T. Comparison of two different alignment systems: speech synthesis vs. hybrid HMM/ANN. Proceedings of the European Conference on Signal Processing (EUSIPCO'98). 1998;1161-1164 Rhodes, Greece.
- (1998) Proceedings of the European Conference on Signal Processing (EUSIPCO'98) , pp. 1161-1164
- Deroo, O.¹ Malfrère, F.² Dutoit, T.³

9
- 85009119851
- Automatic detection and correction of pronunciation errors for foreign language learners: The DEMOSTHENES application
- Deville G., Deroo O., Gielen S., Leich H., Van Parys J. Automatic detection and correction of pronunciation errors for foreign language learners: the DEMOSTHENES application. Proceedings of the European Conference on Speech Communication and Technology. 1999;843-846.
- (1999) Proceedings of the European Conference on Speech Communication and Technology , pp. 843-846
- Deville, G.¹ Deroo, O.² Gielen, S.³ Leich, H.⁴ Van Parys, J.⁵

10
- 0038472261
- Context independent and context dependent hybrid HMM/ANN systems for vocabulary independent tasks
- Dupont S., Ris C., Deroo O., Fontaine V. Context independent and context dependent hybrid HMM/ANN systems for vocabulary independent tasks. Proceedings of the European Conference on Speech Communication and Technology. 1997;1947-1950.
- (1997) Proceedings of the European Conference on Speech Communication and Technology , pp. 1947-1950
- Dupont, S.¹ Ris, C.² Deroo, O.³ Fontaine, V.⁴

11
- 0030355972
- The MBROLA project: Towards a set of high quality speech synthesizers free for use for non-commercial purposes
- Dutoit T., Pagel V., Pierret N., Bataille F., Van Der Vreken O. The MBROLA project: towards a set of high quality speech synthesizers free for use for non-commercial purposes. International Conference on Speech and Language Processing. 1996;1393-1396.
- (1996) International Conference on Speech and Language Processing , pp. 1393-1396
- Dutoit, T.¹ Pagel, V.² Pierret, N.³ Bataille, F.⁴ Van Der Vreken, O.⁵

12
- 0028464214
- Context-dependent connectionist probability estimation in a hybrid hidden Markov model-neural net speech recognition system
- Franco H., Cohen M., Morgan N., Rumelhart D., Abrash V. Context-dependent connectionist probability estimation in a hybrid hidden Markov model-neural net speech recognition system. Comput. Speech Lang. 1994;211-222.
- (1994) Comput. Speech Lang. , pp. 211-222
- Franco, H.¹ Cohen, M.² Morgan, N.³ Rumelhart, D.⁴ Abrash, V.⁵

13
- 0025041264
- Perceptual linear predictive analysis of speech
- Hermansky H. Perceptual linear predictive analysis of speech. J. Acoust. Soc. Am. 1990.
- (1990) J. Acoust. Soc. Am
- Hermansky, H.¹

14
- 0003296899
- The 1994 ABBOT hybrid connectionist-HMM large vocabulary recognition system
- Hochberg M., Cook G.D., Renals S., Robinson A.J., Schechtman R.S. The 1994 ABBOT hybrid connectionist-HMM large vocabulary recognition system. Spoken Language Systems Technology Workshop. 1995;170-176.
- (1995) Spoken Language Systems Technology Workshop , pp. 170-176
- Hochberg, M.¹ Cook, G.D.² Renals, S.³ Robinson, A.J.⁴ Schechtman, R.S.⁵

15
- 0038133213
- Automatic speech segmentation based on DTW with the application of the Czech TTS system
- E. Keller, G. Bailly, A. Monaghan, J. Terken, & M. Huckvale. John Wiley and Sons Ltd.
- Horak P. Automatic speech segmentation based on DTW with the application of the Czech TTS system. Keller E., Bailly G., Monaghan A., Terken J., Huckvale M. Improvements in Speech Synthesis. 2001;331-340 John Wiley and Sons Ltd.
- (2001) Improvements in Speech Synthesis , pp. 331-340
- Horak, P.¹

16
- 0029765811
- Unit selection in a concatenative speech synthesis system using large speech database
- Hunt A.J., Black A.W. Unit selection in a concatenative speech synthesis system using large speech database. Proceedings of the International Conference on Acoustics, Speech and Signal Processing. 1996;373-376.
- (1996) Proceedings of the International Conference on Acoustics, Speech and Signal Processing , pp. 373-376
- Hunt, A.J.¹ Black, A.W.²

17
- 0016939124
- Continuous speech recognition by statistical methods
- Jelinek F. Continuous speech recognition by statistical methods. Proc. IEEE. 1976;532-536.
- (1976) Proc. IEEE , pp. 532-536
- Jelinek, F.¹

18
- 0022884140
- On the use of bandpass liftering in speech recognition
- Juang B.H., Rabiner L.R., Wilpon J.G. On the use of bandpass liftering in speech recognition. Proceedings of the International Conference on Acoustics Speech and Signal Processing. 1986;765-768.
- (1986) Proceedings of the International Conference on Acoustics Speech and Signal Processing , pp. 765-768
- Juang, B.H.¹ Rabiner, L.R.² Wilpon, J.G.³

19
- 85078554908
- Integrating RASTA-PLP into speech recognition
- Adelaide, Australia, April
- Koehler, J., Morgan, N., Hermansky, H., Hirsch, H.G., Tong, G., 1994. Integrating RASTA-PLP into speech recognition. In: Proceedings of the International Conference on Acoustics Speech and Signal Processing. Adelaide, Australia, April, pp. I-421-I-424.
- (1994) Proceedings of the International Conference on Acoustics Speech and Signal Processing
- Koehler, J.¹ Morgan, N.² Hermansky, H.³ Hirsch, H.G.⁴ Tong, G.⁵

20
- 85123169429
- BREF, a large vocabulary spoken corpus for French
- Lamel L.F., Gauvain J.L., Eskenazi M. BREF, a large vocabulary spoken corpus for French. Proceedings of the European Conference on Speech Communication and Technology. 1991;505-508.
- (1991) Proceedings of the European Conference on Speech Communication and Technology , pp. 505-508
- Lamel, L.F.¹ Gauvain, J.L.² Eskenazi, M.³

21
- 0031647824
- A frequency warping approach to speaker normalization
- Lee L., Rose R. A frequency warping approach to speaker normalization. IEEE Trans. Speech Audio Process. 6(1):1998;49-60.
- (1998) IEEE Trans. Speech Audio Process. , vol.6 , Issue.1 , pp. 49-60
- Lee, L.¹ Rose, R.²

22
- 78649945547
- Diphone collection and synthesis
- Beijing, China
- Lenzo, K., Black, A.W., 2000. Diphone collection and synthesis. In: Proceedings of the International Conference on Speech and Language Processing. Beijing, China.
- (2000) Proceedings of the International Conference on Speech and Language Processing
- Lenzo, K.¹ Black, A.W.²

23
- 0021208794
- A procedure for automatic alignment of phonetic transcriptions with continuous speech
- Leung H.C., Zue V.W. A procedure for automatic alignment of phonetic transcriptions with continuous speech. Proceedings of the International Conference on Acoustics, Speech and Signal Processing. 1984;2.7.1-2.7.4.
- (1984) Proceedings of the International Conference on Acoustics, Speech and Signal Processing , pp. 2.7.1-2.7.4
- Leung, H.C.¹ Zue, V.W.²

24
- 0026392350
- Automatic segmentation and labeling of speech
- Ljolje A., Riley M.D. Automatic segmentation and labeling of speech. Proceedings International Conference on Acoustics, Speech and Signal Processing. 1991;473-476.
- (1991) Proceedings International Conference on Acoustics, Speech and Signal Processing , pp. 473-476
- Ljolje, A.¹ Riley, M.D.²

25
- 0004565879
- High-quality speech synthesis for phonetic speech segmentation
- Malfrère F., Dutoit T. High-quality speech synthesis for phonetic speech segmentation. Proceedings of the European Conference on Speech Communication and Technology. 1997;2631-2634.
- (1997) Proceedings of the European Conference on Speech Communication and Technology , pp. 2631-2634
- Malfrère, F.¹ Dutoit, T.²

26
- 0019558276
- A level building dynamic time warping algorithm for connected word recognition
- Myers, C.S., Rabiner, L.R., 1981. A level building dynamic time warping algorithm for connected word recognition. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing.
- (1981) Proceedings of the International Conference on Acoustics, Speech and Signal Processing
- Myers, C.S.¹ Rabiner, L.R.²

27
- 0012330750
- The design for the Wall Street Journal-based CSR Corpus
- Morgan Kaufmann Publishers
- Paul D.B., Baker J. The design for the Wall Street Journal-based CSR Corpus. DARPA Speech and Language Workshop. 1992;Morgan Kaufmann Publishers.
- (1992) DARPA Speech and Language Workshop
- Paul, D.B.¹ Baker, J.²

28
- 0004244302
- PTR Prentice Hall
- Rabiner L.R., Juang B.-H. Fundamentals of Speech Recognition. 1993;PTR Prentice Hall.
- (1993) Fundamentals of Speech Recognition
- Rabiner, L.R.¹ Juang, B.-H.²

29
- 0028392167
- An application of recurrent nets to phone probability estimation
- Robinson A.J. An application of recurrent nets to phone probability estimation. Proc. IEEE Trans. Neural Network. 1994;298-305.
- (1994) Proc. IEEE Trans. Neural Network , pp. 298-305
- Robinson, A.J.¹

30
- 0000329355
- A recurrent error propagation network speech recognition system
- Robinson A.J., Fallside F. A recurrent error propagation network speech recognition system. Comput. Speech Lang. 1991;257-286.
- (1991) Comput. Speech Lang. , pp. 257-286
- Robinson, A.J.¹ Fallside, F.²

31
- 0025629492
- The ARM continuous speech recognition system
- Russell M.J., Ponting K.M., Peeling S.M., Browning S.R., Bridle J.S., Moore R.K., Galiano I., Howell P. The ARM continuous speech recognition system. Proceedings of the International Conference on Acoustics Speech Signal Processing. 1990;69-72.
- (1990) Proceedings of the International Conference on Acoustics Speech Signal Processing , pp. 69-72
- Russell, M.J.¹ Ponting, K.M.² Peeling, S.M.³ Browning, S.R.⁴ Bridle, J.S.⁵ Moore, R.K.⁶ Galiano, I.⁷ Howell, P.⁸

32
- 84889551281
- The aligner: Text-to-speech alignment using Markov models and a pronunciation dictionary
- Talkin D., Wightman C.W. The aligner: text-to-speech alignment using Markov models and a pronunciation dictionary. Proceedings of Second ESCA/IEEE Workshop on Speech Synthesis. 1996;89-92.
- (1996) Proceedings of Second ESCA/IEEE Workshop on Speech Synthesis , pp. 89-92
- Talkin, D.¹ Wightman, C.W.²

33
- 0038810066
- PhD Thesis, ETH Zurich
- Traber, C., 1995. SVOX: The Implementation of a Text-to-Speech System for German, PhD Thesis, ETH Zurich.
- (1995) SVOX: The Implementation of a Text-to-Speech System for German
- Traber, C.¹

34
- 0010467117
- PROTRAN: A prosody transplantation tool for text-to-speech applications
- Van Coile, B., Van Tichelen, L., Vostermans, A., Wang, J.W., Staessen, M., 1994. PROTRAN: A prosody transplantation tool for text-to-speech applications. In: Proceedings of ICSLP'94.
- (1994) Proceedings of ICSLP'94
- Van Coile, B.¹ Van Tichelen, L.² Vostermans, A.³ Wang, J.W.⁴ Staessen, M.⁵

35
- 0028996852
- The 1994 HTK large vocabulary speech recognition system
- Woodland P.C., Leggetter C.J., Odell J.J., Valtchev V., Young S. The 1994 HTK large vocabulary speech recognition system. Proceedings of the International Conference on Acoustics, Speech and Signal Processing. 1995;73-76.
- (1995) Proceedings of the International Conference on Acoustics, Speech and Signal Processing , pp. 73-76
- Woodland, P.C.¹ Leggetter, C.J.² Odell, J.J.³ Valtchev, V.⁴ Young, S.⁵

36
- 0025477640
- Speech database development: TIMIT and beyond
- Zue V., Seneff S., Glass J. Speech Database Development: TIMIT and Beyond. Speech Commun. 1990;351-356.
- (1990) Speech Commun. , pp. 351-356
- Zue, V.¹ Seneff, S.² Glass, J.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.