SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 14, Issue 3, 2006, Pages 981-989

MLP-based phone boundary refining for a TTS database

(1) Lee, Ki Seung a,b

a Konkuk University ^*

b Konkuk University (South Korea)

Author keywords

Automatic labeling; Multilayer perceptron; Phoneme boundary refinement; Text to speech synthesis

Indexed keywords

DATABASE SYSTEMS; MARKOV PROCESSES; MULTILAYER NEURAL NETWORKS; SPEECH PROCESSING; SPEECH RECOGNITION;

AUTOMATIC LABELING; PHONE BOUNDARIES; PHONEME BOUNDARY REFINEMENT; PHONETIC TRANSITION;

SPEECH SYNTHESIS;

EID: 34047273929 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TSA.2005.858049 Document Type: Article

Times cited : (25)

References (22)

1
- 85029935165
- A. J. Hunt and A. W. Black, Concatenative Speech Synthesis Using Units Selected From a Large Speech Database.

2
- 0002425861
- The AT&T next-gen TTS system
- Berlin, Germany, Mar
- M. Beutnagel, A. Conkie, J. Schroeter, Y. Stylianou, and A. Syrdal, "The AT&T next-gen TTS system," in Proc. Joint Meeting of ASA, EAA, and DAGA, Berlin, Germany, Mar. 1999.
- (1999) Proc. Joint Meeting of ASA, EAA, and DAGA
- Beutnagel, M.¹ Conkie, A.² Schroeter, J.³ Stylianou, Y.⁴ Syrdal, A.⁵

3
- 0026140140
- Automatic segmentation of speech
- Apr
- J. P. van Hemert, "Automatic segmentation of speech," IEEE Trans. Signal Process., vol. 39, no. 4, pp. 1008-1012, Apr. 1991.
- (1991) IEEE Trans. Signal Process , vol.39 , Issue.4 , pp. 1008-1012
- van Hemert, J.P.¹

4
- 0020141539
- A bootstrapping training technique for obtaining demisyllable reference patterns
- L. R. Rabiner, A. E. Rosenberg, J. G. Wilpon, and T. M. Zampini, "A bootstrapping training technique for obtaining demisyllable reference patterns," J. Acoust. Soc. Amer., vol. 71, pp. 1588-1595, 1982.
- (1982) J. Acoust. Soc. Amer , vol.71 , pp. 1588-1595
- Rabiner, L.R.¹ Rosenberg, A.E.² Wilpon, J.G.³ Zampini, T.M.⁴

5
- 0030364795
- Explicit segmentation of speech using Gaussian models
- A. Bonafonte, A. Nogueiras, and A. R. Garrido, "Explicit segmentation of speech using Gaussian models," in Proc. IEEE Int. Conf. Spoken Language Processing, 1996, pp. 1269-1272.
- (1996) Proc. IEEE Int. Conf. Spoken Language Processing , pp. 1269-1272
- Bonafonte, A.¹ Nogueiras, A.² Garrido, A.R.³

6
- 0023211850
- On the automatic segmentation of speech signals
- T. Svendsen and F. Soong, "On the automatic segmentation of speech signals," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 1987, pp. 77-80.
- (1987) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing , pp. 77-80
- Svendsen, T.¹ Soong, F.²

7
- 0033694372
- Neural network boundary refining for automatic speech segmentation
- D. T. Toledano, "Neural network boundary refining for automatic speech segmentation," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 2000, pp. 3438-3441.
- (2000) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing , pp. 3438-3441
- Toledano, D.T.¹

8
- 0031642265
- Automatic generation of synthesis units for trainable text-to-speech system
- H. Hon, A. Acero, X. Huang, J. Liu, and M. Plumpe, "Automatic generation of synthesis units for trainable text-to-speech system," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, 1998, pp. 293-296.
- (1998) Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing , pp. 293-296
- Hon, H.¹ Acero, A.² Huang, X.³ Liu, J.⁴ Plumpe, M.⁵

9
- 0026392350
- Automatic segmentation and labeling of speech
- A. Ljolje and M. D. Riley, "Automatic segmentation and labeling of speech," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, 1991, pp. 473-476.
- (1991) Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing , pp. 473-476
- Ljolje, A.¹ Riley, M.D.²

10
- 0024928725
- A phonetically labeled acoustic segment (PLAS) approach to speech analysis-synthesis
- F.-K. Soong, "A phonetically labeled acoustic segment (PLAS) approach to speech analysis-synthesis," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 1989, pp. 584-587.
- (1989) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing , pp. 584-587
- Soong, F.-K.¹

11
- 0024878054
- A knowledge based approach for automatic labeling of a large speech database
- J.-C. Junqua and H. Wakita, "A knowledge based approach for automatic labeling of a large speech database," in Proc. Electrotechnical Conf.: Integrating Research, Industry and Education in Energy and Communication Engineering, 1989, pp. 237-240.
- (1989) Proc. Electrotechnical Conf.: Integrating Research, Industry and Education in Energy and Communication Engineering , pp. 237-240
- Junqua, J.-C.¹ Wakita, H.²

12
- 0029228688
- Automatic speech segmentation using neural tree networks
- M. Sharma and R. Mammone, "Automatic speech segmentation using neural tree networks," in Proc. IEEE Workshop, Neural Networks for Signal Processing, 1995, pp. 282-290.
- (1995) Proc. IEEE Workshop, Neural Networks for Signal Processing , pp. 282-290
- Sharma, M.¹ Mammone, R.²

13
- 0004131347
- Trainable Speech Synthesis
- Ph.D. dissertation, Dept. Eng, Cambridge Univ, Cambridge, U.K, Jun
- R. E. Donovan, "Trainable Speech Synthesis," Ph.D. dissertation, Dept. Eng., Cambridge Univ., Cambridge, U.K., Jun. 1996.
- (1996)
- Donovan, R.E.¹

14
- 0003822743
- Cambridge, U.K, Cambridge Univ. Press
- S. Young, D. Kershaw, J. Odell, D. Ollason, V. Valtchev, and P. Woodland, The HTK Book. Cambridge, U.K.: Cambridge Univ. Press, 2003.
- (2003) The HTK Book
- Young, S.¹ Kershaw, D.² Odell, J.³ Ollason, D.⁴ Valtchev, V.⁵ Woodland, P.⁶

15
- 0038472261
- Context independent and context dependent hybrid HMM/ANN systems for vocabulary independent tasks
- S. Dupont, C. Ris, O. Deroo, and V. Fontaine, "Context independent and context dependent hybrid HMM/ANN systems for vocabulary independent tasks," in Proc. European Conf. Speech Communication and Technology, 1997, pp. 1947-1950.
- (1997) Proc. European Conf. Speech Communication and Technology , pp. 1947-1950
- Dupont, S.¹ Ris, C.² Deroo, O.³ Fontaine, V.⁴

16
- 27644558639
- High-accuracy automatic segmentation
- J. P. H. van Santen and R. W. Sproat, "High-accuracy automatic segmentation," in Proc. Eurospeech 99, 1999, pp. 2809-2812.
- (1999) Proc. Eurospeech 99 , pp. 2809-2812
- van Santen, J.P.H.¹ Sproat, R.W.²

17
- 85029957151
- Phonetic alignment: Speech synthesis-based vs. Viterbi-based
- to be published
- F. Malfrère, O. Deroo, T. Dutoit, and C. Ris, "Phonetic alignment: speech synthesis-based vs. Viterbi-based," Speech Commun., to be published.
- Speech Commun
- Malfrère, F.¹ Deroo, O.² Dutoit, T.³ Ris, C.⁴

18
- 0033351870
- Automatic speech synthesis unit generation with MLP based postprocessor against auto-segmented phoneme errors
- E.-Y. Park, S.-H. Kim, and J.-H. Chung, "Automatic speech synthesis unit generation with MLP based postprocessor against auto-segmented phoneme errors," in Proc. Int. Joint Conf. Neural Networks, 1999, pp. 2985-2990.
- (1999) Proc. Int. Joint Conf. Neural Networks , pp. 2985-2990
- Park, E.-Y.¹ Kim, S.-H.² Chung, J.-H.³

19
- 0005305034
- Nonlinear predictive vector quantization with recurrent neural nets
- Baltimore, MD
- L. Wu, M. Niranjan, and F. Fallside, "Nonlinear predictive vector quantization with recurrent neural nets," in Proc. IEEE-SP Workshop on Neural Networks for Signal Processing, Baltimore, MD, 1993, pp. 372-381.
- (1993) Proc. IEEE-SP Workshop on Neural Networks for Signal Processing , pp. 372-381
- Wu, L.¹ Niranjan, M.² Fallside, F.³

20
- 0005255564
- Spectral stability based event localizing temporal decomposition
- A. C. R. Nandasena and M. Akagi, "Spectral stability based event localizing temporal decomposition," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, 1998, pp. 3438-3441.
- (1998) Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing , pp. 3438-3441
- Nandasena, A.C.R.¹ Akagi, M.²

21
- 0035127353
- Reducing audible spectral discontinuities
- Jan
- E. Klabbers and R. Veldhuis, "Reducing audible spectral discontinuities," IEEE Trans. Speech Audio Signal Process., vol. 9, no. 1, pp. 39-51, Jan. 2001.
- (2001) IEEE Trans. Speech Audio Signal Process , vol.9 , Issue.1 , pp. 39-51
- Klabbers, E.¹ Veldhuis, R.²

22
- 0023331258
- An introduction to computing with neural nets
- Apr
- R. P. Lippmann, "An introduction to computing with neural nets," IEEE ASSP Mag., pp. 4-22, Apr. 1987.
- (1987) IEEE ASSP Mag , pp. 4-22
- Lippmann, R.P.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.