SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 15, Issue 8, 2007, Pages 2202-2212

On using multiple models for automatic speech segmentation

(2) Park, Seung Seop a Kim, Nam Soo a

a Seoul National University (South Korea)

Author keywords

Automatic speech segmentation; Speech synthesis; Unit selection

Indexed keywords

AUTOMATIC SEGMENTATIONS; AUTOMATIC SPEECH SEGMENTATION; BIAS PARAMETERS; CONTEXT DEPENDENTS; GRADIENT PROJECTION METHODS; MANUAL SEGMENTATIONS; MULTIPLE MODELS; PARAMETER SPACES; SELECTION BASED; SQUARED ERRORS; TEXT-TO-SPEECH SYSTEMS; TRAINING PROCEDURES; UNIT SELECTION; WEIGHTED SUMS;

DECISION TREES; SPEECH SYNTHESIS; THREE TERM CONTROL SYSTEMS;

LINGUISTICS;

EID: 64149096086 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2007.903933 Document Type: Article

Times cited : (27)

References (40)

1
- 0029765811
- Unit selection in a concatenative speech synthesis system using a large speech database
- Atlanta, GA
- A. J. Hunt and A. W. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," in Proc. ICASSP, Atlanta, GA, 1996, pp. 373-376.
- (1996) Proc. ICASSP , pp. 373-376
- Hunt, A.J.¹ Black, A.W.²

2
- 0342626572
- The Bell Labs German text-to-speech system
- B. Möbius, "The Bell Labs German text-to-speech system," in Comput. Speech Lang., 1999, vol. 13, pp. 319-358.
- (1999) Comput. Speech Lang , vol.13 , pp. 319-358
- Möbius, B.¹

3
- 85001632375
- Corpus-based techniques in the AT&T NextGen synthesis system
- Beijing, China, Oct
- A. K. Syrdal, C. W. Wightman, A. Conkie, Y. Stylianou, M. Beutnagel, J. Schroeter, V. Strom, K.-S. Lee, and M. J. Makashay, "Corpus-based techniques in the AT&T NextGen synthesis system," in Proc. ICSLP, Beijing, China, Oct. 2000, vol. 3, pp. 410-415.
- (2000) Proc. ICSLP , vol.3 , pp. 410-415
- Syrdal, A.K.¹ Wightman, C.W.² Conkie, A.³ Stylianou, Y.⁴ Beutnagel, M.⁵ Schroeter, J.⁶ Strom, V.⁷ Lee, K.-S.⁸ Makashay, M.J.⁹

4
- 84985926077
- Segment selection in the L&H Realspeak laboratory TTS system
- Beijing, China, Oct
- G. Coorman, J. Fackrell, P. Rutten, and B. Van Coile, "Segment selection in the L&H Realspeak laboratory TTS system," in Proc. ICSLP, Beijing, China, Oct. 2000, vol. 2, pp. 395-398.
- (2000) Proc. ICSLP , vol.2 , pp. 395-398
- Coorman, G.¹ Fackrell, J.² Rutten, P.³ Van Coile, B.⁴

5
- 0043271978
- Methods for optimal text selection
- Rhodes, Greece
- J. P. H. van Santen and A. L. Buchsbaum, "Methods for optimal text selection," in Proc. Eurospeech, Rhodes, Greece, 1997, pp. 553-556.
- (1997) Proc. Eurospeech , pp. 553-556
- van Santen, J.P.H.¹ Buchsbaum, A.L.²

6
- 85135154775
- Combinatorial issues in text-to-speech synthesis
- Rhodes, Greece
- J. P. H. van Santen, "Combinatorial issues in text-to-speech synthesis," in Proc. Eurospeech, Rhodes, Greece, 1997, pp. 2511-2514.
- (1997) Proc. Eurospeech , pp. 2511-2514
- van Santen, J.P.H.¹

7
- 0000237685
- Prosody and the selection of source units for concatenative synthesis
- New York: Springer-Verlag
- W. N. Campbell and A. Black, "Prosody and the selection of source units for concatenative synthesis," in Progress in Speech Synthesis. New York: Springer-Verlag, 1997, pp. 279-292.
- (1997) Progress in Speech Synthesis , pp. 279-292
- Campbell, W.N.¹ Black, A.²

8
- 0142153902
- Optimal utterance selection for unit selection speech synthesis databases
- Norwell, MA: Kluwer
- A. Black and K. Lenzo, "Optimal utterance selection for unit selection speech synthesis databases," in International Journal of Speech Technology 6. Norwell, MA: Kluwer, 2003, pp. 357-363.
- (2003) International Journal of Speech Technology 6 , pp. 357-363
- Black, A.¹ Lenzo, K.²

9
- 0026392350
- Automatic segmentation and labeling of speech
- Toronto, ON, Canada
- A. Ljolje and M. D. Riley, "Automatic segmentation and labeling of speech," in Proc. ICASSP, Toronto, ON, Canada, 1992, pp. 473-476.
- (1992) Proc. ICASSP , pp. 473-476
- Ljolje, A.¹ Riley, M.D.²

10
- 64149090049
- A HMM-based system for automatic segmentation and labelling of speech
- Banff, Canada
- F. Brugnara, D. Falavigna, and M. Omologo, "A HMM-based system for automatic segmentation and labelling of speech," in Proc. ICSLP, Banff, Canada, 1992, pp. 803-806.
- (1992) Proc. ICSLP , pp. 803-806
- Brugnara, F.¹ Falavigna, D.² Omologo, M.³

11
- 0004131347
- Trainable Speech Synthesis,
- Ph.D. dissertation, Cambridge Univ, Cambridge, U.K
- R. E. Donovan, "Trainable Speech Synthesis," Ph.D. dissertation, Cambridge Univ., Cambridge, U.K., 1996.
- (1996)
- Donovan, R.E.¹

12
- 19944409831
- Unsupervised, language-independent grapheme-to-phoneme conversion by latent analogy
- J. R. Bellegarda, "Unsupervised, language-independent grapheme-to-phoneme conversion by latent analogy," Speech Commun., vol. 46, no. 2, pp. 140-152, 2005.
- (2005) Speech Commun , vol.46 , Issue.2 , pp. 140-152
- Bellegarda, J.R.¹

13
- 38148999392
- Automatic phonetic transcription of large speech corpora: A comparative study
- Pittsburgh, PA
- C. Van Bael, L. Boves, H. van den Heuvel, and H. Strik, "Automatic phonetic transcription of large speech corpora: A comparative study," in Proc. ICSLP, Pittsburgh, PA, 2006, pp. 1085-1088.
- (2006) Proc. ICSLP , pp. 1085-1088
- Van Bael, C.¹ Boves, L.² van den Heuvel, H.³ Strik, H.⁴

14
- 0347968276
- Automatic phonetic segmentation
- Nov
- D. T. Toledano, L. A. H. Gómez, and L. V. Grande, "Automatic phonetic segmentation," IEEE Trans. Speech Audio Process., vol. 11, no. 6, pp. 617-625, Nov. 2003.
- (2003) IEEE Trans. Speech Audio Process , vol.11 , Issue.6 , pp. 617-625
- Toledano, D.T.¹ Gómez, L.A.H.² Grande, L.V.³

15
- 4544324769
- An evaluation of automatic phone segmentation for concatenative speech synthesis
- Montreal, QC, Canada
- H. Kawai and T. Toda, "An evaluation of automatic phone segmentation for concatenative speech synthesis," in Proc. ICASSP, Montreal, QC, Canada, 2004, vol. I, pp. 677-680.
- (2004) Proc. ICASSP , vol.1 , pp. 677-680
- Kawai, H.¹ Toda, T.²

16
- 64149113705
- Acoustical and topological experiments for an HMM-based speech segmentation system
- Aalborg, Denmark
- S. Nefti and O. Boëffard, "Acoustical and topological experiments for an HMM-based speech segmentation system," in Proc. Eurospeech, Aalborg, Denmark, 2001, pp. 1711-1714.
- (2001) Proc. Eurospeech , pp. 1711-1714
- Nefti, S.¹ Boëffard, O.²

17
- 85009152114
- Automatic segmentation for Czech concatenative speech synthesis using statistical approach with boundary-specific correction
- Geneva, Switzerland
- J. Matoušek, D. Tihelka, and J. Psutka, "Automatic segmentation for Czech concatenative speech synthesis using statistical approach with boundary-specific correction," in Proc. Eurospeech, Geneva, Switzerland, 2003, pp. 301-304.
- (2003) Proc. Eurospeech , pp. 301-304
- Matoušek, J.¹ Tihelka, D.² Psutka, J.³

18
- 33646814337
- Comparative study of automatic phone segmentation methods for TTS
- Philadelphia, PA
- J. Adell, A. Bonafonte, J. A. Gómez, and M. Castro, "Comparative study of automatic phone segmentation methods for TTS," in Proc. ICASSP, Philadelphia, PA, 2005, vol. I, pp. 309-312.
- (2005) Proc. ICASSP , vol.1 , pp. 309-312
- Adell, J.¹ Bonafonte, A.² Gómez, J.A.³ Castro, M.⁴

19
- 27644558639
- High-accuracy automatic segmentation
- Budapest, Hungary
- J. P. H. van Santen and R. Sproat, "High-accuracy automatic segmentation," in Proc. Eurospeech, Budapest, Hungary, 1999, pp. 2809-2812.
- (1999) Proc. Eurospeech , pp. 2809-2812
- van Santen, J.P.H.¹ Sproat, R.²

20
- 0030364795
- Explicit segmentation of speech using Gaussian models
- Philadelphia, PA
- A. Bonafonte, A. Nogueiras, and A. R. Garrido, "Explicit segmentation of speech using Gaussian models," in Proc. ICSLP, Philadelphia, PA, 1996, pp. 1269-1272.
- (1996) Proc. ICSLP , pp. 1269-1272
- Bonafonte, A.¹ Nogueiras, A.² Garrido, A.R.³

21
- 85009233137
- Refined speech segmentation for concatenative speech synthesis
- Denver, CO
- A. Sethy and S. Narayanan, "Refined speech segmentation for concatenative speech synthesis," in Proc. ICSLP, Denver, CO, 2002, pp. 145-148.
- (2002) Proc. ICSLP , pp. 145-148
- Sethy, A.¹ Narayanan, S.²

22
- 0033351870
- Automatic speech synthesis unit generation with MLP based postprocessor against auto-segmented phoneme errors
- Phoenix, AZ
- E. Y. Park, S. H. Kim, and J. H. Chung, "Automatic speech synthesis unit generation with MLP based postprocessor against auto-segmented phoneme errors," in Proc. ICASSP, Phoenix, AZ, 1999, pp. 2985-2990.
- (1999) Proc. ICASSP , pp. 2985-2990
- Park, E.Y.¹ Kim, S.H.² Chung, J.H.³

23
- 0033694372
- Neural network boundary refining for automatic speech segmentation
- Istanbul, Turkey
- D. T. Toledano, "Neural network boundary refining for automatic speech segmentation," in Proc. ICASSP, Istanbul, Turkey, 2000, pp. 3438-3441.
- (2000) Proc. ICASSP , pp. 3438-3441
- Toledano, D.T.¹

24
- 50549091068
- Local refinement of phonetic boundaries: A general framework and its application using different transition models
- Aalborg, Denmark
- D. T. Toledano and L. A. H. Gómez, "Local refinement of phonetic boundaries: A general framework and its application using different transition models," in Proc. Eurospeech, Aalborg, Denmark, 2001, pp. 1695-1698.
- (2001) Proc. Eurospeech , pp. 1695-1698
- Toledano, D.T.¹ Gómez, L.A.H.²

25
- 34047273929
- MLP-based phone boundary refining for a TTS database
- May
- K. S. Lee, "MLP-based phone boundary refining for a TTS database," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 3, pp. 981-989, May 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process , vol.14 , Issue.3 , pp. 981-989
- Lee, K.S.¹

26
- 78649301706
- Statistical corpus-based speech segmentation
- Jeju, Korea
- V. Pollet and G. Coorman, "Statistical corpus-based speech segmentation," in Proc. ICSLP, Jeju, Korea, 2004.
- (2004) Proc. ICSLP
- Pollet, V.¹ Coorman, G.²

27
- 4544373879
- Refining segmental boundaries for TTS database using fine contextual dependent boundary models
- Montreal, QC, Canada
- L. Wang, Y. Zhao, M. Chu, J. Zhou, and Z. Cao, "Refining segmental boundaries for TTS database using fine contextual dependent boundary models," in Proc. ICASSP, Montreal, QC, Canada, 2004, vol. I, pp. 641-644.
- (2004) Proc. ICASSP , vol.1 , pp. 641-644
- Wang, L.¹ Zhao, Y.² Chu, M.³ Zhou, J.⁴ Cao, Z.⁵

28
- 33645780899
- Context-dependent boundary model for refining boundaries segmentation of TTS units
- L. Wang, Y. Zhao, M. Chu, F. K. Soong, J. Zhou, and Z. Cao, "Context-dependent boundary model for refining boundaries segmentation of TTS units," IEICE Trans. Inform. Syst., vol. E89-D, pp. 981-989, 2006.
- (2006) IEICE Trans. Inform. Syst , vol.E89-D , pp. 981-989
- Wang, L.¹ Zhao, Y.² Chu, M.³ Soong, F.K.⁴ Zhou, J.⁵ Cao, Z.⁶

29
- 85009241673
- Automatic segmentation combining an HMM-based approach and spectral boundary correction
- Denver, CO
- Y. J. Kim and A. Conkie, "Automatic segmentation combining an HMM-based approach and spectral boundary correction," in Proc. ICSLP, Denver, CO, 2002, pp. 145-148.
- (2002) Proc. ICSLP , pp. 145-148
- Kim, Y.J.¹ Conkie, A.²

30
- 0346262153
- On the use of F0 features in automatic segmentation for speech synthesis
- Sydney, Australia
- T. Saito, "On the use of F0 features in automatic segmentation for speech synthesis," in Proc. ICSLP, Sydney, Australia, 1998, vol. VII, pp. 2839-2842.
- (1998) Proc. ICSLP , vol.7 , pp. 2839-2842
- Saito, T.¹

31
- 0348206344
- Segment based variable frame rate speech analysis and recognition using a spectral variation function
- Banff, AB, Canada
- G. Flammia, P. Dalsgaard, O. Andersen, and B. Lindberg, "Segment based variable frame rate speech analysis and recognition using a spectral variation function," in Proc. ICSLP, Banff, AB, Canada, 1992, pp. 983-986.
- (1992) Proc. ICSLP , pp. 983-986
- Flammia, G.¹ Dalsgaard, P.² Andersen, O.³ Lindberg, B.⁴

32
- 0028996888
- Using explicit segmentation to improve HMM phone recognition
- Detroit, MI
- C. D. Mitchell, M. P. Harper, and L. H. Jamieson, "Using explicit segmentation to improve HMM phone recognition," in Proc. ICASSP, Detroit, MI, 1995, vol. I, pp. 229-232.
- (1995) Proc. ICASSP , vol.1 , pp. 229-232
- Mitchell, C.D.¹ Harper, M.P.² Jamieson, L.H.³

33
- 85009091555
- A family-of-models approach to HMM-based segmentation for unit selection speech synthesis
- Jeju, Korea
- J. Kominek and A. W. Black, "A family-of-models approach to HMM-based segmentation for unit selection speech synthesis," in Proc. ICSLP, Jeju, Korea, 2004.
- (2004) Proc. ICSLP
- Kominek, J.¹ Black, A.W.²

34
- 33749336407
- Automatic segmentation based on boundary-type candidate selection
- Oct
- S. S. Park and N. S. Kim, "Automatic segmentation based on boundary-type candidate selection," IEEE Signal Process. Lett., vol. 13, no. 10, pp. 640-643, Oct. 2006.
- (2006) IEEE Signal Process. Lett , vol.13 , Issue.10 , pp. 640-643
- Park, S.S.¹ Kim, N.S.²

35
- 0003488911
- 2nd ed. Reading, MA: Addison-Wesley
- D. Luenberger, Linear and Nonlinear Programming, 2nd ed. Reading, MA: Addison-Wesley, 1984, pp. 330-334.
- (1984) Linear and Nonlinear Programming , pp. 330-334
- Luenberger, D.¹

36
- 85009168667
- Evaluating and correcting phoneme segmentation for unit selection synthesis
- Geneva, Switzerland
- J. Kominek, C. Bennett, and A. W. Black, "Evaluating and correcting phoneme segmentation for unit selection synthesis," in Proc. Eurospeech, Geneva, Switzerland, 2003, pp. 313-316.
- (2003) Proc. Eurospeech , pp. 313-316
- Kominek, J.¹ Bennett, C.² Black, A.W.³

37
- 0003802343
- New York: Chapman & Hall
- L. Breiman, J. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees. New York: Chapman & Hall, 1984.
- (1984) Classification and Regression Trees
- Breiman, L.¹ Friedman, J.² Olshen, R.³ Stone, C.⁴

38
- 64149129299
- An improved algorithm for the automatic segmentation of speech corpora
- Las Palmas, Spain
- T. Laureys, K. Demuynck, J. Duchateau, and P. Wambacq, "An improved algorithm for the automatic segmentation of speech corpora," in Proc. LREC, Las Palmas, Spain, 2002, vol. V, pp. 1564-1567.
- (2002) Proc. LREC , vol.5 , pp. 1564-1567
- Laureys, T.¹ Demuynck, K.² Duchateau, J.³ Wambacq, P.⁴

39
- 64149112925
- S. Young, G. Evermann, D. Kershaw, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK Book for HTK Version 3.2, Cambridge, U.K, Cambridge Univ, 2002
- S. Young, G. Evermann, D. Kershaw, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK Book (for HTK Version 3.2). Cambridge, U.K.: Cambridge Univ., 2002.

40
- 0003805597
- The Use of Context in Large Vocabulary Speech Recognition,
- Ph.D. dissertation, Cambridge University, Cambridge, U.K
- J. Odell, "The Use of Context in Large Vocabulary Speech Recognition," Ph.D. dissertation, Cambridge University, Cambridge, U.K., 1995.
- (1995)
- Odell, J.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.