Volume 2015-January, 2015, Pages 1206-1210

Modulation spectrum-constrained trajectory training algorithm for HMM-based speech synthesis

Author keywords

Global variance; HMM based speech synthesis; Modulation spectrum; Over smoothing; Trajectory training

Indexed keywords

ALGORITHMS; GAUSSIAN DISTRIBUTION; HIDDEN MARKOV MODELS; MARKOV PROCESSES; MODULATION; PARAMETER ESTIMATION; SPEECH; SPEECH COMMUNICATION; SPEECH SYNTHESIS; TRAJECTORIES; TRELLIS CODES

EID: 84959166270     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (2)

References (32)
  • 2
    • 85009139544
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • Budapest, Hungary, Apr.
    • T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. EUROSPEECH, Budapest, Hungary, Apr. 1999, pp. 2347-2350.
    • (1999) Proc. EUROSPEECH , pp. 2347-2350
    • Yoshimura, T.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 3
    • 0033708106
    • Speech parameter generation algorithms for HMM-based speech synthesis
    • Istanbul, Turkey, June
    • K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP, Istanbul, Turkey, June 2000, pp. 1315-1318.
    • (2000) Proc. ICASSP , pp. 1315-1318
    • Tokuda, K.1    Yoshimura, T.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 5
    • 33847129573
    • Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
    • J. Yamagishi and T. Kobayashi, "Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training," IEICE Trans., Inf. and Syst., vol. E90-D, no. 2, pp. 533-543, 2007.
    • (2007) IEICE Trans., Inf. and Syst , vol.E90-D , Issue.2 , pp. 533-543
    • Yamagishi, J.1    Kobayashi, T.2
  • 6
    • 51449114529
    • A style control technique for HMM-based expressive speech synthesis
    • T. Nose, J. Yamagishi, T. Masuko, and T. Kobayashi, "A style control technique for HMM-based expressive speech synthesis," IEICE Trans., Inf. and Syst., vol. E90-D, no. 9, pp. 1406-1413, 2007.
    • (2007) IEICE Trans., Inf. and Syst , vol.E90-D , Issue.9 , pp. 1406-1413
    • Nose, T.1    Yamagishi, J.2    Masuko, T.3    Kobayashi, T.4
  • 7
    • 84905234613
    • Integration of speaker and pitch adaptive training for HMM-based singing voice synthesis
    • Florence, Italy, May
    • K. Shirota, K. Nakamura, K. Hashimoto, K. Oura, Y. Nankaku, and K. Tokuda, "Integration of speaker and pitch adaptive training for HMM-based singing voice synthesis," in Proc. ICASSP, Florence, Italy, May 2014, pp. 2578-2582.
    • (2014) Proc. ICASSP , pp. 2578-2582
    • Shirota, K.1    Nakamura, K.2    Hashimoto, K.3    Oura, K.4    Nankaku, Y.5    Tokuda, K.6
  • 8
    • 84855906479
    • Speech synthesis technologies for individuals with vocal disabilities: Voice banking and reconstruction
    • J. Yamagishi, C. Veaux, S. King, and S. Renals, "Speech synthesis technologies for individuals with vocal disabilities: Voice banking and reconstruction," Acoust. Sci. Technol., vol. 33, pp. 1-5, 2012.
    • (2012) Acoust. Sci. Technol , vol.33 , pp. 1-5
    • Yamagishi, J.1    Veaux, C.2    King, S.3    Renals, S.4
  • 9
    • 0023756465
    • Speech synthesis by rule using an optimal selection of non-uniform synthesis units
    • New York, U.S.A., Apr.
    • Y. Sagisaka, "Speech synthesis by rule using an optimal selection of non-uniform synthesis units," in Proc. ICASSP, New York, U.S.A., Apr. 1988, pp. 679-682.
    • (1988) Proc. ICASSP , pp. 679-682
    • Sagisaka, Y.1
  • 10
    • 84905262874
    • Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis
    • Florence, Italy, May
    • H. Zen and A. Senior, "Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis," in Proc. ICASSP, Florence, Italy, May 2014, pp. 3872-3876.
    • (2014) Proc. ICASSP , pp. 3872-3876
    • Zen, H.1    Senior, A.2
  • 11
    • 84906265592
    • Generalizing continuous-space translation of paralinguistic information
    • Lyon, France, Aug.
    • T. Kano, S. Takamichi, S. Sakti, T. T. G. Neubig, and S. Nakamura, "Generalizing continuous-space translation of paralinguistic information," in Proc. INTERSPEECH, Lyon, France, Aug. 2013, pp. 2614-2618.
    • (2013) Proc. INTERSPEECH , pp. 2614-2618
    • Kano, T.1    Takamichi, S.2    Sakti, S.3    Neubig, T.T.G.4    Nakamura, S.5
  • 13
    • 84905234422
    • A postfilter to modify modulation spectrum in HMM-based speech synthesis
    • Florence, Italy, May
    • S. Takamichi, T. Toda, G. Neubig, S. Sakti, and S. Nakamura, "A postfilter to modify modulation spectrum in HMM-based speech synthesis," in Proc. ICASSP, Florence, Italy, May 2014, pp. 290-294.
    • (2014) Proc. ICASSP , pp. 290-294
    • Takamichi, S.1    Toda, T.2    Neubig, G.3    Sakti, S.4    Nakamura, S.5
  • 15
    • 84959144982
    • Modified modulation spectrum-based post-filter for HMM-based speech synthesis
    • Atlanta, United States, Dec.
    • S. Takamichi, T. Toda, A. W. Black, and S. Nakamura, "Modified modulation spectrum-based post-filter for HMM-based speech synthesis," in Proc. GlobalSIP, Atlanta, United States, Dec. 2014, pp. 710-714.
    • (2014) Proc. GlobalSIP , pp. 710-714
    • Takamichi, S.1    Toda, T.2    Black, A.W.3    Nakamura, S.4
  • 16
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans., vol. E90-D, no. 5, pp. 816-824, 2007.
    • (2007) IEICE Trans , vol.E90-D , Issue.5 , pp. 816-824
    • Toda, T.1    Tokuda, K.2
  • 17
    • 84946033894
    • Parameter generation algorithm considering modulation spectrum for HMM-based speech synthesis
    • Brisbane, Australia, Apr.
    • S. Takamichi, T. Toda, A. W. Black, and S. Nakamura, "Parameter generation algorithm considering modulation spectrum for HMM-based speech synthesis," in Proc. ICASSP, Brisbane, Australia, Apr. 2015.
    • (2015) Proc. ICASSP
    • Takamichi, S.1    Toda, T.2    Black, A.W.3    Nakamura, S.4
  • 18
    • 84910088495
    • Analysis of spectral enhancement using global variance in HMM-based speech synthesis
    • MAXAtria, Singapore, May
    • T. Nose and A. Ito, "Analysis of spectral enhancement using global variance in HMM-based speech synthesis," in Proc. INTERSPEECH, MAXAtria, Singapore, May 2014, pp. 2917-2921.
    • (2014) Proc. INTERSPEECH , pp. 2917-2921
    • Nose, T.1    Ito, A.2
  • 19
    • 84893234191
    • Incorporating global variance in the training phase of GMM-based voice conversion
    • Kaohsiung, Taiwan, Oct.
    • H. Hwang, Y. Tsao, H. Wang, Y. Wang, and S. Chen, "Incorporating global variance in the training phase of GMM-based voice conversion," in Proc. APSIPA, Kaohsiung, Taiwan, Oct. 2013, pp. 1-6.
    • (2013) Proc. APSIPA , pp. 1-6
    • Hwang, H.1    Tsao, Y.2    Wang, H.3    Wang, Y.4    Chen, S.5
  • 20
    • 84890495160
    • Fast, low-artifact speech synthesis considering global variance
    • Vancouver, Canada, May
    • M. Shannon and W. Byrne, "Fast, low-artifact speech synthesis considering global variance," in Proc. ICASSP, Vancouver, Canada, May 2013, pp. 7869-7873.
    • (2013) Proc. ICASSP , pp. 7869-7873
    • Shannon, M.1    Byrne, W.2
  • 21
    • 67650826181
    • Trajectory training considering global variance for HMM-based speech synthesis
    • Taipei, Taiwan, Aug.
    • T. Toda and S. Young, "Trajectory training considering global variance for HMM-based speech synthesis," in Proc. ICASSP, Taipei, Taiwan, Aug. 2009, pp. 4025-4028.
    • (2009) Proc. ICASSP , pp. 4025-4028
    • Toda, T.1    Young, S.2
  • 22
    • 33749573927
    • Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences
    • Jan.
    • H. Zen, K. Tokuda, and T. Kitamura, "Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences," Computer Speech and Language, vol. 21, no. 1, pp. 153-173, Jan. 2007.
    • (2007) Computer Speech and Language , vol.21 , Issue.1 , pp. 153-173
    • Zen, H.1    Tokuda, K.2    Kitamura, T.3
  • 23
    • 84946033919
    • Modulation spectrum-constrained trajectory training algorithm for GMM-based voice conversion
    • Brisbane, Australia, Apr.
    • S. Takamichi, T. Toda, A. W. Black, and S. Nakamura, "Modulation spectrum-constrained trajectory training algorithm for GMM-based voice conversion," in Proc. ICASSP, Brisbane, Australia, Apr. 2015.
    • (2015) Proc. ICASSP
    • Takamichi, S.1    Toda, T.2    Black, A.W.3    Nakamura, S.4
  • 24
    • 57749193836
    • Voice conversion based on maximum likelihood estimation of spectral parameter trajectory
    • T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum likelihood estimation of spectral parameter trajectory," IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 8, pp. 2222-2235, 2007.
    • (2007) IEEE Transactions on Audio, Speech and Language Processing , vol.15 , Issue.8 , pp. 2222-2235
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 26
    • 85008023596
    • Continuous F0 modeling for HMM based statistical parametric speech synthesis
    • K. Yu and S. Young, "Continuous F0 modeling for HMM based statistical parametric speech synthesis," IEEE Trans. Audio, Speech and Language, vol. 19, no. 5, pp. 1071-1079, 2011.
    • (2011) IEEE Trans. Audio, Speech and Language , vol.19 , Issue.5 , pp. 1071-1079
    • Yu, K.1    Young, S.2
  • 27
    • 44449177634
    • Hidden semi-Markov model based speech synthesis system
    • H. Zen, K. Tokuda, T. K. T. Masuko, and T. Kitamura, "Hidden semi-Markov model based speech synthesis system," IEICE Trans., Inf. and Syst., vol. E90-D, no. 5, pp. 825-834, 2007.
    • (2007) IEICE Trans., Inf. and Syst. , vol.E90-D , Issue.5 , pp. 825-834
    • Zen, H.1    Tokuda, K.2    Masuko, T.K.T.3    Kitamura, T.4
  • 28
    • 33646773080
    • Tech. Rep. CMU-LTI-03-177, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, U.S.A.
    • J. Kominek and A. W. Black, "The CMU ARCTIC speech databases for speech synthesis research," in Tech. Rep. CMU-LTI-03-177, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, U.S.A., 2003.
    • (2003) The CMU ARCTIC Speech Databases for Speech Synthesis Research
    • Kominek, J.1    Black, A.W.2
  • 29
    • 84874199000
    • Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
    • Firenze, Italy, Sept.
    • H. Kawahara, J. Estill, and O. Fujimura, "Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT," in MAVEBA 2001, Firenze, Italy, Sept. 2001, pp. 1-6.
    • (2001) MAVEBA 2001 , pp. 1-6
    • Kawahara, H.1    Estill, J.2    Fujimura, O.3
  • 30
    • 44949143155
    • Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
    • Pittsburgh, U.S.A., Sep.
    • Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation," in Proc. INTERSPEECH, Pittsburgh, U.S.A., Sep. 2006, pp. 2266-2269.
    • (2006) Proc. INTERSPEECH , pp. 2266-2269
    • Ohtani, Y.1    Toda, T.2    Saruwatari, H.3    Shikano, K.4
  • 31
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. D. Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol. 27, no. 3-4, pp. 187-207, 1999.
    • (1999) Speech Commun , vol.27 , Issue.3-4 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    Cheveigne, A.D.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.