Volume 21, Issue 1, 2013, Pages 207-219

Articulatory control of HMM-based parametric speech synthesis using feature-space-switched multiple regression

Author keywords

Articulatory features; Gaussian mixture model; multiple regression hidden Markov model; speech synthesis

Indexed keywords

ACOUSTICS; GAUSSIAN DISTRIBUTION; LINGUISTICS; MATHEMATICAL TRANSFORMATIONS; REGRESSION ANALYSIS; SPEECH SYNTHESIS; TRELLIS CODES

EID: 84869440340     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2012.2215600     Document Type: Article
Times cited: 52

References (34)
  • 1
    • 85009139544
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. Eurospeech, 1999, pp. 2347-2350.
    • (1999) Proc. Eurospeech , pp. 2347-2350
    • Yoshimura, T.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 2
    • 33645758767
    • HMM-based approach to multilingual speech synthesis
    • S. Narayanan and A. Alwan, Eds. Upper Saddle River, NJ: Prentice-Hall
    • K. Tokuda, H. Zen, and A. W. Black, "HMM-based approach to multilingual speech synthesis," in Text to Speech Synthesis: New Paradigms and Advances, S. Narayanan and A. Alwan, Eds. Upper Saddle River, NJ: Prentice-Hall, 2004.
    • (2004) Text to Speech Synthesis: New Paradigms and Advances
    • Tokuda, K.1    Zen, H.2    Black, A.W.3
  • 3
    • 0033708106
    • Speech parameter generation algorithms for HMM-based speech synthesis
    • K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP, 2000, vol. 3, pp. 1315-1318.
    • (2000) Proc. ICASSP , vol.3 , pp. 1315-1318
    • Tokuda, K.1    Yoshimura, T.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 4
    • 33846405723
    • Details of the Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005
    • DOI 10.1093/ietisy/e90-d.1.325
    • H. Zen, T. Toda, M. Nakamura, and K. Tokuda, "Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005," IEICE Trans. Inf. Syst., vol. E90-D, no. 1, pp. 325-333, 2007. (Pubitemid 46145336)
    • (2007) IEICE Transactions on Information and Systems , vol.E90-D , Issue.1 , pp. 325-333
    • Zen, H.1    Toda, T.2    Nakamura, M.3    Tokuda, K.4
  • 6
    • 33847129573
    • Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
    • DOI 10.1093/ietisy/e90-d.2.533
    • J. Yamagishi and T. Kobayashi, "Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training," IEICE Trans. Inf. Syst., vol. E90-D, no. 2, pp. 533-543, 2007. (Pubitemid 46279829)
    • (2007) IEICE Transactions on Information and Systems , vol.E90-D , Issue.2 , pp. 533-543
    • Yamagishi, J.1    Kobayashi, T.2
  • 8
    • 24144497811
    • Acoustic modeling of speaking styles and emotional expressions in HMM-based speech synthesis
    • J. Yamagishi, K. Onishi, T. Masuko, and T. Kobayashi, "Acoustic modeling of speaking styles and emotional expressions in HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol. E88-D, no. 3, pp. 503-509, 2005.
    • (2005) IEICE Trans. Inf. Syst. , vol.E88-D , Issue.3 , pp. 503-509
    • Yamagishi, J.1    Onishi, K.2    Masuko, T.3    Kobayashi, T.4
  • 9
    • 29144475179
    • Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing
    • DOI 10.1093/ietisy/e88-d.11.2484
    • M. Tachibana, J. Yamagishi, T. Masuko, and T. Kobayashi, "Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing," IEICE Trans. Inf. Syst., vol. E88-D, no. 11, pp. 2484-2491, 2005. (Pubitemid 41816793)
    • (2005) IEICE Transactions on Information and Systems , vol.E88-D , Issue.11 , pp. 2484-2491
    • Tachibana, M.1    Yamagishi, J.2    Masuko, T.3    Kobayashi, T.4
  • 10
    • 51449114529
    • A style control technique for HMM-based expressive speech synthesis
    • T. Nose, J. Yamagishi, T. Masuko, and T. Kobayashi, "A style control technique for HMM-based expressive speech synthesis," IEICE Trans. Inf. Syst., vol. E90-D, no. 9, pp. 1406-1413, 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.9 , pp. 1406-1413
    • Nose, T.1    Yamagishi, J.2    Masuko, T.3    Kobayashi, T.4
  • 11
    • 84867197177
    • Articulatory control of HMM-based parametric speech synthesis driven by phonetic knowledge
    • Z.-H. Ling, K. Richmond, J. Yamagishi, and R.-H. Wang, "Articulatory control of HMM-based parametric speech synthesis driven by phonetic knowledge," in Proc. Interspeech '08, 2008, pp. 573-576.
    • (2008) Proc. Interspeech '08 , pp. 573-576
    • Ling, Z.-H.1    Richmond, K.2    Yamagishi, J.3    Wang, R.-H.4
  • 12
    • 68149157315
    • Integrating articulatory features into HMM-based parametric speech synthesis
    • Z.-H. Ling, K. Richmond, J. Yamagishi, and R.-H. Wang, "Integrating articulatory features into HMM-based parametric speech synthesis," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 6, pp. 1171-1185, Aug. 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.6 , pp. 1171-1185
    • Ling, Z.-H.1    Richmond, K.2    Yamagishi, J.3    Wang, R.-H.4
  • 13
    • 0033693063
    • Conversational speech recognition using acoustic and articulatory input
    • K. Kirchhoff, G. Fink, and G. Sagerer, "Conversational speech recognition using acoustic and articulatory input," in Proc. ICASSP, 2000, pp. 1435-1438.
    • (2000) Proc. ICASSP , pp. 1435-1438
    • Kirchhoff, K.1    Fink, G.2    Sagerer, G.3
  • 15
    • 0023198186
    • Electromagnetic articulography: Use of alternating magnetic fields for tracking movements of multiple points inside and outside the vocal tract
    • P. W. Schönle, K. Gräbe, P. Wenig, J. Höhne, J. Schrader, and B. Conrad, "Electromagnetic articulography: Use of alternating magnetic fields for tracking movements of multiple points inside and outside the vocal tract," Brain Lang., vol. 31, pp. 26-35, 1987.
    • (1987) Brain Lang. , vol.31 , pp. 26-35
    • Schönle, P.W.1    Gräbe, K.2    Wenig, P.3    Höhne, J.4    Schrader, J.5    Conrad, B.6
  • 16
    • 0023135474
    • Application of MRI to the analysis of speech production
    • DOI 10.1016/0730-725X(87)90477-2
    • T. Baer, J. C. Gore, S. Boyce, and P. W. Nye, "Application of MRI to the analysis of speech production," Magn. Resonance Imag., vol. 5, pp. 1-7, 1987. (Pubitemid 17059052)
    • (1987) Magnetic Resonance Imaging , vol.5 , Issue.1 , pp. 1-7
    • Baer, T.1    Gore, J.C.2    Boyce, S.3    Nye, P.W.4
  • 17
    • 0032293271
    • Extraction and tracking of the tongue surface from ultrasound image sequences
    • Y. Akgul, C. Kambhamettu, and M. Stone, "Extraction and tracking of the tongue surface from ultrasound image sequences," IEEE Comp. Vis. Pattern Recogn., vol. 124, pp. 298-303, 1998.
    • (1998) IEEE Comp. Vis. Pattern Recogn. , vol.124 , pp. 298-303
    • Akgul, Y.1    Kambhamettu, C.2    Stone, M.3
  • 19
    • 70349205575
    • Emotional speech recognition based on style estimation and adaptation with multiple-regression HMM
    • Y. Ijima, M. Tachibana, T. Nose, and T. Kobayashi, "Emotional speech recognition based on style estimation and adaptation with multiple-regression HMM," in Proc. ICASSP, 2009, pp. 4157-4160.
    • (2009) Proc. ICASSP , pp. 4157-4160
    • Ijima, Y.1    Tachibana, M.2    Nose, T.3    Kobayashi, T.4
  • 20
    • 4344601826
    • Quantitative evaluation for skill controller based on comparison with human demonstration
    • T. Nozaki, T. Suzuki, S. Okuma, K. Itabashi, and F. Fujiwara, "Quantitative evaluation for skill controller based on comparison with human demonstration," IEEE Trans. Control Syst. Technol., vol. 12, no. 4, pp. 609-619, Jul. 2004.
    • (2004) IEEE Trans. Control Syst. Technol. , vol.12 , Issue.4 , pp. 609-619
    • Nozaki, T.1    Suzuki, T.2    Okuma, S.3    Itabashi, K.4    Fujiwara, F.5
  • 21
    • 33646795077
    • A quantitative model for formant dynamics and contextually assimilated reduction in fluent speech
    • L. Deng, D. Yu, and A. Acero, "A quantitative model for formant dynamics and contextually assimilated reduction in fluent speech," in Proc. Interspeech, 2004, pp. 719-722.
    • (2004) Proc. Interspeech , pp. 719-722
    • Deng, L.1    Yu, D.2    Acero, A.3
  • 22
    • 79956259003
    • Model-based reproduction of articulatory trajectories for consonant-vowel sequences
    • P. Birkholz, B. Kroger, and C. Neuschaefer-Rube, "Model-based reproduction of articulatory trajectories for consonant-vowel sequences," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 5, pp. 1422-1433, Jul. 2011.
    • (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , Issue.5 , pp. 1422-1433
    • Birkholz, P.1    Kroger, B.2    Neuschaefer-Rube, C.3
  • 23
    • 2142659020
    • Estimation of articulatory movements from speech acoustics using an HMM-based speech production model
    • S. Hiroya and M. Honda, "Estimation of articulatory movements from speech acoustics using an HMM-based speech production model," IEEE Trans. Speech Audio Process., vol. 12, no. 2, pp. 175-185, Mar. 2004.
    • (2004) IEEE Trans. Speech Audio Process. , vol.12 , Issue.2 , pp. 175-185
    • Hiroya, S.1    Honda, M.2
  • 24
    • 84946757881
    • Cross-stream observation dependencies for multi-stream speech recognition
    • Ö. Çetin and M. Ostendorf, "Cross-stream observation dependencies for multi-stream speech recognition," in Proc. Eurospeech, 2003, pp. 2517-2520.
    • (2003) Proc. Eurospeech , pp. 2517-2520
    • Çetin, Ö.1    Ostendorf, M.2
  • 25
    • 0034227757
    • Cluster adaptive training of hidden Markov model
    • M. Gales, "Cluster adaptive training of hidden Markov model," IEEE Trans. Audio, Speech, Lang. Process., vol. 8, no. 4, pp. 417-428, Jul. 2000.
    • (2000) IEEE Trans. Audio, Speech, Lang. Process. , vol.8 , Issue.4 , pp. 417-428
    • Gales, M.1
  • 29
    • 38649140222
    • Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
    • DOI 10.1016/j.specom.2007.09.001, PII S0167639307001495
    • T. Toda, A. W. Black, and K. Tokuda, "Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model," Speech Commun., vol. 50, pp. 215-227, 2008. (Pubitemid 351172471)
    • (2008) Speech Communication , vol.50 , Issue.3 , pp. 215-227
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 30
    • 84865778430
    • Announcing the electromagnetic articulography (day 1) subset of the mngu0 articulatory corpus
    • K. Richmond, P. Hoole, and S. King, "Announcing the electromagnetic articulography (day 1) subset of the mngu0 articulatory corpus," in Proc. Interspeech, 2011, pp. 1505-1508.
    • (2011) Proc. Interspeech , pp. 1505-1508
    • Richmond, K.1    Hoole, P.2    King, S.3
  • 31
    • 0032673049
    • Restructuring speech representations using pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigne, "Restructuring speech representations using pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol. 27, pp. 187-207, 1999.
    • (1999) Speech Commun. , vol.27 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    Decheveigne, A.3
  • 32
    • 77955426622
    • An analysis of HMM-based prediction of articulatory movements
    • Z.-H. Ling, K. Richmond, and J. Yamagishi, "An analysis of HMM-based prediction of articulatory movements," Speech Commun., vol. 52, no. 10, pp. 834-846, 2010.
    • (2010) Speech Commun. , vol.52 , Issue.10 , pp. 834-846
    • Ling, Z.-H.1    Richmond, K.2    Yamagishi, J.3
  • 33
    • 84876492203
    • Target-filtering model based articulatory movement prediction for articulatory control of HMM-based speech synthesis
    • M.-Q. Cai, Z.-H. Ling, and L.-R. Dai, "Target-filtering model based articulatory movement prediction for articulatory control of HMM-based speech synthesis," in Proc. 11th Int. Conf. Signal Process., 2012, accepted for publication.
    • (2012) Proc. 11th Int. Conf. Signal Process.
    • Cai, M.-Q.1    Ling, Z.-H.2    Dai, L.-R.3
  • 34
    • 84865795806
    • Feature-space transform tying in unified acoustic-articulatory modelling for articulatory control of HMM-based speech synthesis
    • Z.-H. Ling, K. Richmond, and J. Yamagishi, "Feature-space transform tying in unified acoustic-articulatory modelling for articulatory control of HMM-based speech synthesis," in Proc. Interspeech, 2011, pp. 117-120.
    • (2011) Proc. Interspeech , pp. 117-120
    • Ling, Z.-H.1    Richmond, K.2    Yamagishi, J.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.