SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume 2015-January, 2015, Pages 1206-1210

Modulation spectrum-constrained trajectory training algorithm for HMM-based speech synthesis

(4) Takamichi, Shinnosuke a,b Toda, Tomoki a Black, Alan W b Nakamura, Satoshi a

a NARA INSTITUTE OF SCIENCE AND TECHNOLOGY (Japan)

b Carnegie Mellon University ^* (United States)

Author keywords

Global variance; HMM based speech synthesis; Modulation spectrum; Over smoothing; Trajectory training

Indexed keywords

ALGORITHMS; GAUSSIAN DISTRIBUTION; HIDDEN MARKOV MODELS; MARKOV PROCESSES; MODULATION; PARAMETER ESTIMATION; SPEECH; SPEECH COMMUNICATION; SPEECH SYNTHESIS; TRAJECTORIES; TRELLIS CODES;

COMPUTATIONALLY EFFICIENT; CONSTRAINED TRAJECTORIES; CONVENTIONAL GENERATION; GAUSSIAN MIXTURE MODEL; GLOBAL VARIANCE; HMM-BASED SPEECH SYNTHESIS; MODULATION SPECTRUM; OVER-SMOOTHING;

PLASMA DIAGNOSTICS;

EID: 84959166270 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (2)

References (32)

1
- 84876687945
- Speech synthesis based on hidden Markov models
- K. Tokuda, Y. Nankaku, T. Toda, H. Zen, J. Yamagishi, and K. Oura, "Speech synthesis based on hidden Markov models," Proceedings of the IEEE, vol. 101, no. 5, pp. 1234-1252, 2013.
- (2013) Proceedings of the IEEE , vol.101 , Issue.5 , pp. 1234-1252
- Tokuda, K.¹ Nankaku, Y.² Toda, T.³ Zen, H.⁴ Yamagishi, J.⁵ Oura, K.⁶

2
- 85009139544
- Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
- Budapest, Hungary, Apr.
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. EUROSPEECH, Budapest, Hungary, Apr. 1999, pp. 2347-2350.
- (1999) Proc. EUROSPEECH , pp. 2347-2350
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

3
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- Istanbul, Turkey, June
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP, Istanbul, Turkey, June 2000, pp. 1315-1318.
- (2000) Proc. ICASSP , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

4
- 0034230270
- Speaker interpolation for HMM-based speech synthesis system
- T. Yoshimura, T. Masuko, K. Tokuda, T. Kobayashi, and T. Kitamura, "Speaker interpolation for HMM-based speech synthesis system," J. Acoust. Soc. Jpn. (E), vol. 21, no. 4, pp. 199-206, 2000.
- (2000) J. Acoust. Soc. Jpn. (E) , vol.21 , Issue.4 , pp. 199-206
- Yoshimura, T.¹ Masuko, T.² Tokuda, K.³ Kobayashi, T.⁴ Kitamura, T.⁵

5
- 33847129573
- Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
- J. Yamagishi and T. Kobayashi, "Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training," IEICE Trans., Inf. and Syst., vol. E90-D, no. 2, pp. 533-543, 2007.
- (2007) IEICE Trans., Inf. and Syst , vol.E90-D , Issue.2 , pp. 533-543
- Yamagishi, J.¹ Kobayashi, T.²

6
- 51449114529
- A style control technique for HMM-based expressive speech synthesis
- T. Nose, J. Yamagishi, T. Masuko, and T. Kobayashi, "A style control technique for HMM-based expressive speech synthesis," IEICE Trans., Inf. and Syst., vol. E90-D, no. 9, pp. 1406-1413, 2007.
- (2007) IEICE Trans., Inf. and Syst , vol.E90-D , Issue.9 , pp. 1406-1413
- Nose, T.¹ Yamagishi, J.² Masuko, T.³ Kobayashi, T.⁴

7
- 84905234613
- Integration of speaker and pitch adaptive training for HMM-based singing voice synthesis
- Florence, Italy, May
- K. Shirota, K. Nakamura, K. Hashimoto, K. Oura, Y. Nankaku, and K. Tokuda, "Integration of speaker and pitch adaptive training for HMM-based singing voice synthesis," in Proc. ICASSP, Florence, Italy, May 2014, pp. 2578-2582.
- (2014) Proc. ICASSP , pp. 2578-2582
- Shirota, K.¹ Nakamura, K.² Hashimoto, K.³ Oura, K.⁴ Nankaku, Y.⁵ Tokuda, K.⁶

8
- 84855906479
- Speech synthesis technologies for individuals with vocal disabilities: Voice banking and reconstruction
- J. Yamagishi, C. Veaux, S. King, and S. Renals, "Speech synthesis technologies for individuals with vocal disabilities: Voice banking and reconstruction," Acoust. Sci. Technol., vol. 33, pp. 1-5, 2012.
- (2012) Acoust. Sci. Technol , vol.33 , pp. 1-5
- Yamagishi, J.¹ Veaux, C.² King, S.³ Renals, S.⁴

9
- 0023756465
- Speech synthesis by rule using an optimal selection of non-uniform synthesis units
- New York, U. S. A., Apr.
- Y. Sagisaka, "Speech synthesis by rule using an optimal selection of non-uniform synthesis units," in Proc. ICASSP, New York, U. S. A., Apr. 1988, pp. 679-682.
- (1988) Proc. ICASSP , pp. 679-682
- Sagisaka, Y.¹

10
- 84905262874
- Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis
- Florence, Italy, May
- H. Zen and A. Senior, "Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis," in Proc. ICASSP, Florence, Italy, May 2014, pp. 3872-3876.
- (2014) Proc. ICASSP , pp. 3872-3876
- Zen, H.¹ Senior, A.²

11
- 84906265592
- Generalizing continuous-space translation of paralinguistic information
- Lyon, France, Aug.
- T. Kano, S. Takamichi, S. Sakti, T. T. G. Neubig, and S. Nakamura, "Generalizing continuous-space translation of paralinguistic information," in Proc. INTERSPEECH, Lyon, France, Aug. 2013, pp. 2614-2618.
- (2013) Proc. INTERSPEECH , pp. 2614-2618
- Kano, T.¹ Takamichi, S.² Sakti, S.³ Neubig, T.T.G.⁴ Nakamura, S.⁵

12
- 84878419996
- The Blizzard Challenge 2011
- Turin, Italy, Sept.
- S. King and V. Karaiskos, "The Blizzard Challenge 2011," in Proc. Blizzard Challenge Workshop, Turin, Italy, Sept. 2011.
- (2011) Proc. Blizzard Challenge Workshop
- King, S.¹ Karaiskos, V.²

13
- 84905234422
- A postfilter to modify modulation spectrum in HMM-based speech synthesis
- Florence, Italy, May
- S. Takamichi, T. Toda, G. Neubig, S. Sakti, and S. Nakamura, "A postfilter to modify modulation spectrum in HMM-based speech synthesis," in Proc. ICASSP, Florence, Italy, May 2014, pp. 290-294.
- (2014) Proc. ICASSP , pp. 290-294
- Takamichi, S.¹ Toda, T.² Neubig, G.³ Sakti, S.⁴ Nakamura, S.⁵

14
- 84949926049
- Modulation spectrum-based post-filter for GMM-based voice conversion
- Cambodia, Dec.
- S. Takamichi, T. Toda, A. W. Black, and S. Nakamura, "Modulation spectrum-based post-filter for GMM-based voice conversion," in Proc. APSIPA ASC, Siem Reap, Cambodia, Dec. 2014.
- (2014) Proc. APSIPA ASC, Siem Reap
- Takamichi, S.¹ Toda, T.² Black, A.W.³ Nakamura, S.⁴

15
- 84959144982
- Modified modulation spectrum-based post-filter for HMM-based speech synthesis
- Atlanta, United States, Dec.
- S. Takamichi, T. Toda, A. W. Black, and S. Nakamura, "Modified modulation spectrum-based post-filter for HMM-based speech synthesis," in Proc. GlobalSIP, Atlanta, United States, Dec. 2014, pp. 710-714.
- (2014) Proc. GlobalSIP , pp. 710-714
- Takamichi, S.¹ Toda, T.² Black, A.W.³ Nakamura, S.⁴

16
- 38549096029
- A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
- T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans., vol. E90-D, no. 5, pp. 816-824, 2007.
- (2007) IEICE Trans , vol.E90-D , Issue.5 , pp. 816-824
- Toda, T.¹ Tokuda, K.²

17
- 84946033894
- Parameter generation algorithm considering modulation spectrum for HMM-based speech synthesis
- Brisbane, Australia, Apr.
- S. Takamichi, T. Toda, A. W. Black, and S. Nakamura, "Parameter generation algorithm considering modulation spectrum for HMM-based speech synthesis," in Proc. ICASSP, Brisbane, Australia, Apr. 2015.
- (2015) Proc. ICASSP
- Takamichi, S.¹ Toda, T.² Black, A.W.³ Nakamura, S.⁴

18
- 84910088495
- Analysis of spectral enhancement using global variance in HMM-based speech synthesis
- MAXAtria, Singapore, May
- T. Nose and A. Ito, "Analysis of spectral enhancement using global variance in HMM-based speech synthesis," in Proc. INTERSPEECH, MAXAtria, Singapore, May 2014, pp. 2917-2921.
- (2014) Proc. INTERSPEECH , pp. 2917-2921
- Nose, T.¹ Ito, A.²

19
- 84893234191
- Incorporating global variance in the training phase of GMM-based voice conversion
- Kaohsiung, Taiwan, Oct.
- H. Hwang, Y. Tsao, H. Wang, Y. Wang, and S. Chen, "Incorporating global variance in the training phase of GMM-based voice conversion," in Proc. APSIPA, Kaohsiung, Taiwan, Oct. 2013, pp. 1-6.
- (2013) Proc. APSIPA , pp. 1-6
- Hwang, H.¹ Tsao, Y.² Wang, H.³ Wang, Y.⁴ Chen, S.⁵

20
- 84890495160
- Fast, low-artifact speech synthesis considering global variance
- Vancouver, Canada, May
- M. Shannon and W. Byrne, "Fast, low-artifact speech synthesis considering global variance," in Proc. ICASSP, Vancouver, Canada, May 2013, pp. 7869-7873.
- (2013) Proc. ICASSP , pp. 7869-7873
- Shannon, M.¹ Byrne, W.²

21
- 67650826181
- Trajectory training considering global variance for HMM-based speech synthesis
- Taipei, Taiwan, Aug.
- T. Toda and S. Young, "Trajectory training considering global variance for HMM-based speech synthesis," in Proc. ICASSP, Taipei, Taiwan, Aug. 2009, pp. 4025-4028.
- (2009) Proc. ICASSP , pp. 4025-4028
- Toda, T.¹ Young, S.²

22
- 33749573927
- Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences
- Jan.
- H. Zen, K. Tokuda, and T. Kitamura, "Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences," Computer Speech and Language, vol. 21, no. 1, pp. 153-173, Jan. 2007.
- (2007) Computer Speech and Language , vol.21 , Issue.1 , pp. 153-173
- Zen, H.¹ Tokuda, K.² Kitamura, T.³

23
- 84946033919
- Modulation spectrum-constrained trajectory training algorithm for GMM-based voice conversion
- Brisbane, Australia, Apr.
- S. Takamichi, T. Toda, A. W. Black, and S. Nakamura, "Modulation spectrum-constrained trajectory training algorithm for GMM-based voice conversion," in Proc. ICASSP, Brisbane, Australia, Apr. 2015.
- (2015) Proc. ICASSP
- Takamichi, S.¹ Toda, T.² Black, A.W.³ Nakamura, S.⁴

24
- 57749193836
- Voice conversion based on maximum likelihood estimation of spectral parameter trajectory
- T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum likelihood estimation of spectral parameter trajectory," IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 8, pp. 2222-2235, 2007.
- (2007) IEEE Transactions on Audio, Speech and Language Processing , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.W.² Tokuda, K.³

25
- 0036522887
- Multi-space probability distribution HMM
- K. Tokuda, T. Masuko, B. Miyazaki, and T. Kobayashi, "Multi-space probability distribution HMM," IEICE Trans., Inf. and Syst., vol. E85-D, no. 3, pp. 455-464, 2002.
- (2002) IEICE Trans., Inf. and Syst. , vol.E85-D , Issue.3 , pp. 455-464
- Tokuda, K.¹ Masuko, T.² Miyazaki, B.³ Kobayashi, T.⁴

26
- 85008023596
- Continuous F0 modeling for HMM-based statistical parametric speech synthesis
- K. Yu and S. Young, "Continuous F0 modeling for HMM-based statistical parametric speech synthesis," IEEE Trans. Audio, Speech and Language, vol. 19, no. 5, pp. 1071-1079, 2011.
- (2011) IEEE Trans. Audio, Speech and Language , vol.19 , Issue.5 , pp. 1071-1079
- Yu, K.¹ Young, S.²

27
- 44449177634
- Hidden semi-Markov model based speech synthesis system
- H. Zen, K. Tokuda, T. K. T. Masuko, and T. Kitamura, "Hidden semi-Markov model based speech synthesis system," IEICE Trans., Inf. and Syst., vol. E90-D, no. 5, pp. 825-834, 2007.
- (2007) IEICE Trans., Inf. and Syst. , vol.E90-D , Issue.5 , pp. 825-834
- Zen, H.¹ Tokuda, K.² Masuko, T.K.T.³ Kitamura, T.⁴

28
- 33646773080
- Tech. Rep. CMU-LTI-03-177, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, U. S. A.
- J. Kominek and A. W. Black, "The CMU ARCTIC speech databases for speech synthesis research," in Tech. Rep. CMU-LTI-03-177, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, U. S. A., 2003.
- (2003) The CMU ARCTIC Speech Databases for Speech Synthesis Research
- Kominek, J.¹ Black, A.W.²

29
- 84874199000
- Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
- Firenze, Italy, Sept.
- H. Kawahara, J. Estill, and O. Fujimura, "Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT," in MAVEBA 2001, Firenze, Italy, Sept. 2001, pp. 1-6.
- (2001) MAVEBA 2001 , pp. 1-6
- Kawahara, H.¹ Estill, J.² Fujimura, O.³

30
- 44949143155
- Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
- Pittsburgh, U. S. A., Sep.
- Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation," in Proc. INTERSPEECH, Pittsburgh, U. S. A., Sep. 2006, pp. 2266-2269.
- (2006) Proc. INTERSPEECH , pp. 2266-2269
- Ohtani, Y.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

31
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
- H. Kawahara, I. Masuda-Katsuse, and A. D. Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol. 27, no. 3-4, pp. 187-207, 1999.
- (1999) Speech Commun , vol.27 , Issue.3-4 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² Cheveigne, A.D.³

32
- 84897862522
- Parameter generation methods with rich context models for high-quality and flexible text-to-speech synthesis
- S. Takamichi, T. Toda, Y. Shiga, S. Sakti, G. Neubig, and S. Nakamura, "Parameter generation methods with rich context models for high-quality and flexible text-to-speech synthesis," IEEE Journal of Selected Topics in Signal Processing, vol. 8, no. 2, pp. 239-250, 2014.
- (2014) IEEE Journal of Selected Topics in Signal Processing , vol.8 , Issue.2 , pp. 239-250
- Takamichi, S.¹ Toda, T.² Shiga, Y.³ Sakti, S.⁴ Neubig, G.⁵ Nakamura, S.⁶

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.