Volume 15, Issue 8, 2007, Pages 2202-2212

On using multiple models for automatic speech segmentation

Author keywords

Automatic speech segmentation; Speech synthesis; Unit selection

Indexed keywords

AUTOMATIC SEGMENTATIONS; AUTOMATIC SPEECH SEGMENTATION; BIAS PARAMETERS; CONTEXT DEPENDENTS; GRADIENT PROJECTION METHODS; MANUAL SEGMENTATIONS; MULTIPLE MODELS; PARAMETER SPACES; SELECTION BASED; SQUARED ERRORS; TEXT-TO-SPEECH SYSTEMS; TRAINING PROCEDURES; UNIT SELECTION; WEIGHTED SUMS;

EID: 64149096086     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2007.903933     Document Type: Article
Times cited : (27)

References (40)
  • 1
    • 0029765811
    • Unit selection in a concatenative speech synthesis system using a large speech database
    • Atlanta, GA
    • A. J. Hunt and A. W. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," in Proc. ICASSP, Atlanta, GA, 1996, pp. 373-376.
    • (1996) Proc. ICASSP , pp. 373-376
    • Hunt, A.J.1    Black, A.W.2
  • 2
    • 0342626572
    • The Bell Labs German text-to-speech system
    • B. Möbius, "The Bell Labs German text-to-speech system," in Comput. Speech Lang., 1999, vol. 13, pp. 319-358.
    • (1999) Comput. Speech Lang , vol.13 , pp. 319-358
    • Möbius, B.1
  • 4
    • 84985926077
    • Segment selection in the L&H Realspeak laboratory TTS system
    • Beijing, China, Oct
    • G. Coorman, J. Fackrell, P. Rutten, and B. Van Coile, "Segment selection in the L&H Realspeak laboratory TTS system," in Proc. ICSLP, Beijing, China, Oct. 2000, vol. 2, pp. 395-398.
    • (2000) Proc. ICSLP , vol.2 , pp. 395-398
    • Coorman, G.1    Fackrell, J.2    Rutten, P.3    Van Coile, B.4
  • 5
    • 0043271978
    • Methods for optimal text selection
    • Rhodes, Greece
    • J. P. H. van Santen and A. L. Buchsbaum, "Methods for optimal text selection," in Proc. Eurospeech, Rhodes, Greece, 1997, pp. 553-556.
    • (1997) Proc. Eurospeech , pp. 553-556
    • van Santen, J.P.H.1    Buchsbaum, A.L.2
  • 6
    • 85135154775
    • Combinatorial issues in text-to-speech synthesis
    • Rhodes, Greece
    • J. P. H. van Santen, "Combinatorial issues in text-to-speech synthesis," in Proc. Eurospeech, Rhodes, Greece, 1997, pp. 2511-2514.
    • (1997) Proc. Eurospeech , pp. 2511-2514
    • van Santen, J.P.H.1
  • 7
    • 0000237685
    • Prosody and the selection of source units for concatenative synthesis
    • New York: Springer-Verlag
    • W. N. Campbell and A. Black, "Prosody and the selection of source units for concatenative synthesis," in Progress in Speech Synthesis. New York: Springer-Verlag, 1997, pp. 279-292.
    • (1997) Progress in Speech Synthesis , pp. 279-292
    • Campbell, W.N.1    Black, A.2
  • 8
    • 0142153902
    • Optimal utterance selection for unit selection speech synthesis databases
    • Norwell, MA: Kluwer
    • A. Black and K. Lenzo, "Optimal utterance selection for unit selection speech synthesis databases," in International Journal of Speech Technology 6. Norwell, MA: Kluwer, 2003, pp. 357-363.
    • (2003) International Journal of Speech Technology 6 , pp. 357-363
    • Black, A.1    Lenzo, K.2
  • 9
    • 0026392350
    • Automatic segmentation and labeling of speech
    • Toronto, ON, Canada
    • A. Ljolje and M. D. Riley, "Automatic segmentation and labeling of speech," in Proc. ICASSP, Toronto, ON, Canada, 1992, pp. 473-476.
    • (1992) Proc. ICASSP , pp. 473-476
    • Ljolje, A.1    Riley, M.D.2
  • 10
    • 64149090049
    • A HMM-based system for automatic segmentation and labelling of speech
    • Banff, Canada
    • F. Brugnara, D. Falavigna, and M. Omologo, "A HMM-based system for automatic segmentation and labelling of speech," in Proc. ICSLP, Banff, Canada, 1992, pp. 803-806.
    • (1992) Proc. ICSLP , pp. 803-806
    • Brugnara, F.1    Falavigna, D.2    Omologo, M.3
  • 11
    • 0004131347
    • Trainable Speech Synthesis
    • Ph.D. dissertation, Cambridge Univ., Cambridge, U.K.
    • R. E. Donovan, "Trainable Speech Synthesis," Ph.D. dissertation, Cambridge Univ., Cambridge, U.K., 1996.
    • (1996)
    • Donovan, R.E.1
  • 12
    • 19944409831
    • Unsupervised, language-independent grapheme-to-phoneme conversion by latent analogy
    • J. R. Bellegarda, "Unsupervised, language-independent grapheme-to-phoneme conversion by latent analogy," Speech Commun., vol. 46/2, pp. 140-152, 2005.
    • (2005) Speech Commun , vol.46 , Issue.2 , pp. 140-152
    • Bellegarda, J.R.1
  • 13
    • 38148999392
    • Automatic phonetic transcription of large speech corpora: A comparative study
    • Pittsburgh, PA
    • C. Van Bael, L. Boves, H. van den Heuvel, and H. Strik, "Automatic phonetic transcription of large speech corpora: A comparative study," in Proc. ICSLP, Pittsburgh, PA, 2006, pp. 1085-1088.
    • (2006) Proc. ICSLP , pp. 1085-1088
    • Van Bael, C.1    Boves, L.2    van den Heuvel, H.3    Strik, H.4
  • 15
    • 4544324769
    • An evaluation of automatic phone segmentation for concatenative speech synthesis
    • Montreal, QC, Canada
    • H. Kawai and T. Toda, "An evaluation of automatic phone segmentation for concatenative speech synthesis," in Proc. ICASSP, Montreal, QC, Canada, 2004, vol. I, pp. 677-680.
    • (2004) Proc. ICASSP , vol.1 , pp. 677-680
    • Kawai, H.1    Toda, T.2
  • 16
    • 64149113705
    • Acoustical and topological experiments for an HMM-based speech segmentation system
    • Aalborg, Denmark
    • S. Nefti and O. Boëffard, "Acoustical and topological experiments for an HMM-based speech segmentation system," in Proc. Eurospeech, Aalborg, Denmark, 2001, pp. 1711-1714.
    • (2001) Proc. Eurospeech , pp. 1711-1714
    • Nefti, S.1    Boëffard, O.2
  • 17
    • 85009152114
    • Automatic segmentation for Czech concatenative speech synthesis using statistical approach with boundary-specific correction
    • Geneva, Switzerland
    • J. Matoušek, D. Tihelka, and J. Psutka, "Automatic segmentation for Czech concatenative speech synthesis using statistical approach with boundary-specific correction," in Proc. Eurospeech, Geneva, Switzerland, 2003, pp. 301-304.
    • (2003) Proc. Eurospeech , pp. 301-304
    • Matoušek, J.1    Tihelka, D.2    Psutka, J.3
  • 18
    • 33646814337
    • Comparative study of automatic phone segmentation methods for TTS
    • Philadelphia, PA
    • J. Adell, A. Bonafonte, J. A. Gómez, and M. Castro, "Comparative study of automatic phone segmentation methods for TTS," in Proc. ICASSP, Philadelphia, PA, 2005, vol. I, pp. 309-312.
    • (2005) Proc. ICASSP , vol.1 , pp. 309-312
    • Adell, J.1    Bonafonte, A.2    Gómez, J.A.3    Castro, M.4
  • 19
    • 27644558639
    • High-accuracy automatic segmentation
    • Budapest, Hungary
    • J. P. H. van Santen and R. Sproat, "High-accuracy automatic segmentation," in Proc. Eurospeech, Budapest, Hungary, 1999, pp. 2809-2812.
    • (1999) Proc. Eurospeech , pp. 2809-2812
    • van Santen, J.P.H.1    Sproat, R.2
  • 20
    • 0030364795
    • Explicit segmentation of speech using Gaussian models
    • Philadelphia, PA
    • A. Bonafonte, A. Nogueiras, and A. R. Garrido, "Explicit segmentation of speech using Gaussian models," in Proc. ICSLP, Philadelphia, PA, 1996, pp. 1269-1272.
    • (1996) Proc. ICSLP , pp. 1269-1272
    • Bonafonte, A.1    Nogueiras, A.2    Garrido, A.R.3
  • 21
    • 85009233137
    • Refined speech segmentation for concatenative speech synthesis
    • Denver, CO
    • A. Sethy and S. Narayanan, "Refined speech segmentation for concatenative speech synthesis," in Proc. ICSLP, Denver, CO, 2002, pp. 145-148.
    • (2002) Proc. ICSLP , pp. 145-148
    • Sethy, A.1    Narayanan, S.2
  • 22
    • 0033351870
    • Automatic speech synthesis unit generation with MLP based postprocessor against auto-segmented phoneme errors
    • Phoenix, AZ
    • E. Y. Park, S. H. Kim, and J. H. Chung, "Automatic speech synthesis unit generation with MLP based postprocessor against auto-segmented phoneme errors," in Proc. ICASSP, Phoenix, AZ, 1999, pp. 2985-2990.
    • (1999) Proc. ICASSP , pp. 2985-2990
    • Park, E.Y.1    Kim, S.H.2    Chung, J.H.3
  • 23
    • 0033694372
    • Neural network boundary refining for automatic speech segmentation
    • Istanbul, Turkey
    • D. T. Toledano, "Neural network boundary refining for automatic speech segmentation," in Proc. ICASSP, Istanbul, Turkey, 2000, pp. 3438-3441.
    • (2000) Proc. ICASSP , pp. 3438-3441
    • Toledano, D.T.1
  • 24
    • 50549091068
    • Local refinement of phonetic boundaries: A general framework and its application using different transition models
    • Aalborg, Denmark
    • D. T. Toledano and L. A. H. Gómez, "Local refinement of phonetic boundaries: A general framework and its application using different transition models," in Proc. Eurospeech, Aalborg, Denmark, 2001, pp. 1695-1698.
    • (2001) Proc. Eurospeech , pp. 1695-1698
    • Toledano, D.T.1    Gómez, L.A.H.2
  • 25
    • 34047273929
    • MLP-based phone boundary refining for a TTS database
    • May
    • K. S. Lee, "MLP-based phone boundary refining for a TTS database," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 3, pp. 981-989, May 2006.
    • (2006) IEEE Trans. Audio, Speech, Lang. Process , vol.14 , Issue.3 , pp. 981-989
    • Lee, K.S.1
  • 26
    • 78649301706
    • Statistical corpus-based speech segmentation
    • Jeju, Korea
    • V. Pollet and G. Coorman, "Statistical corpus-based speech segmentation," in Proc. ICSLP, Jeju, Korea, 2004.
    • (2004) Proc. ICSLP
    • Pollet, V.1    Coorman, G.2
  • 27
    • 4544373879
    • Refining segmental boundaries for TTS database using fine contextual dependent boundary models
    • Montreal, QC, Canada
    • L. Wang, Y. Zhao, M. Chu, J. Zhou, and Z. Cao, "Refining segmental boundaries for TTS database using fine contextual dependent boundary models," in Proc. ICASSP, Montreal, QC, Canada, 2004, vol. I, pp. 641-644.
    • (2004) Proc. ICASSP , vol.1 , pp. 641-644
    • Wang, L.1    Zhao, Y.2    Chu, M.3    Zhou, J.4    Cao, Z.5
  • 28
    • 33645780899
    • Context-dependent boundary model for refining boundaries segmentation of TTS units
    • L. Wang, Y. Zhao, M. Chu, F. K. Soong, J. Zhou, and Z. Cao, "Context-dependent boundary model for refining boundaries segmentation of TTS units," IEICE Trans. Inform. Syst., vol. E89-D, pp. 981-989, 2006.
    • (2006) IEICE Trans. Inform. Syst , vol.E89-D , pp. 981-989
    • Wang, L.1    Zhao, Y.2    Chu, M.3    Soong, F.K.4    Zhou, J.5    Cao, Z.6
  • 29
    • 85009241673
    • Automatic segmentation combining an HMM-based approach and spectral boundary correction
    • Denver, CO
    • Y. J. Kim and A. Conkie, "Automatic segmentation combining an HMM-based approach and spectral boundary correction," in Proc. ICSLP, Denver, CO, 2002, pp. 145-148.
    • (2002) Proc. ICSLP , pp. 145-148
    • Kim, Y.J.1    Conkie, A.2
  • 30
    • 0346262153
    • On the use of F0 features in automatic segmentation for speech synthesis
    • Sydney, Australia
    • T. Saito, "On the use of F0 features in automatic segmentation for speech synthesis," in Proc. ICSLP, Sydney, Australia, 1998, vol. VII, pp. 2839-2842.
    • (1998) Proc. ICSLP , vol.7 , pp. 2839-2842
    • Saito, T.1
  • 31
    • 0348206344
    • Segment based variable frame rate speech analysis and recognition using a spectral variation function
    • Banff, AB, Canada
    • G. Flammia, P. Dalsgaard, O. Andersen, and B. Lindberg, "Segment based variable frame rate speech analysis and recognition using a spectral variation function," in Proc. ICSLP, Banff, AB, Canada, 1992, pp. 983-986.
    • (1992) Proc. ICSLP , pp. 983-986
    • Flammia, G.1    Dalsgaard, P.2    Andersen, O.3    Lindberg, B.4
  • 32
    • 0028996888
    • Using explicit segmentation to improve HMM phone recognition
    • Detroit, MI
    • C. D. Mitchell, M. P. Harper, and L. H. Jamieson, "Using explicit segmentation to improve HMM phone recognition," in Proc. ICASSP, Detroit, MI, 1995, vol. I, pp. 229-232.
    • (1995) Proc. ICASSP , vol.1 , pp. 229-232
    • Mitchell, C.D.1    Harper, M.P.2    Jamieson, L.H.3
  • 33
    • 85009091555
    • A family-of-models approach to HMM-based segmentation for unit selection speech synthesis
    • Jeju, Korea
    • J. Kominek and A. W. Black, "A family-of-models approach to HMM-based segmentation for unit selection speech synthesis," in Proc. ICSLP, Jeju, Korea, 2004.
    • (2004) Proc. ICSLP
    • Kominek, J.1    Black, A.W.2
  • 34
    • 33749336407
    • Automatic segmentation based on boundary-type candidate selection
    • Oct
    • S. S. Park and N. S. Kim, "Automatic segmentation based on boundary-type candidate selection," IEEE Signal Process. Lett., vol. 13, no. 10, pp. 640-643, Oct. 2006.
    • (2006) IEEE Signal Process. Lett , vol.13 , Issue.10 , pp. 640-643
    • Park, S.S.1    Kim, N.S.2
  • 36
    • 85009168667
    • Evaluating and correcting phoneme segmentation for unit selection synthesis
    • Geneva, Switzerland
    • J. Kominek, C. Bennett, and A. W. Black, "Evaluating and correcting phoneme segmentation for unit selection synthesis," in Proc. Eurospeech, Geneva, Switzerland, 2003, pp. 313-316.
    • (2003) Proc. Eurospeech , pp. 313-316
    • Kominek, J.1    Bennett, C.2    Black, A.W.3
  • 38
    • 64149129299
    • An improved algorithm for the automatic segmentation of speech corpora
    • Las Palmas, Spain
    • T. Laureys, K. Demuynck, J. Duchateau, and P. Wambacq, "An improved algorithm for the automatic segmentation of speech corpora," in Proc. LREC, Las Palmas, Spain, 2002, vol. V, pp. 1564-1567.
    • (2002) Proc. LREC , vol.5 , pp. 1564-1567
    • Laureys, T.1    Demuynck, K.2    Duchateau, J.3    Wambacq, P.4
  • 39
    • 64149112925
    • The HTK Book (for HTK Version 3.2)
    • S. Young, G. Evermann, D. Kershaw, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK Book (for HTK Version 3.2). Cambridge, U.K.: Cambridge Univ., 2002.
  • 40
    • 0003805597
    • The Use of Context in Large Vocabulary Speech Recognition
    • Ph.D. dissertation, Cambridge University, Cambridge, U.K.
    • J. Odell, "The Use of Context in Large Vocabulary Speech Recognition," Ph.D. dissertation, Cambridge University, Cambridge, U.K., 1995.
    • (1995)
    • Odell, J.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.