SCOPUS Information Retrieval Platform

IEEE Journal on Selected Topics in Signal Processing

Volume 8, Issue 2, 2014, Pages 173-183

Statistical parametric speech synthesis based on Gaussian process regression

(3) Koriyama, Tomoki (a); Nose, Takashi (a); Kobayashi, Takao (a)

Author keywords

Gaussian process regression; nonparametric Bayesian model; partially independent conditional (PIC) approximation; sparse Gaussian processes; statistical speech synthesis

Indexed keywords

ARTICULATORY INFORMATIONS; GAUSSIAN PROCESS REGRESSION; LINGUISTIC INFORMATION; MINIMUM GENERATION ERRORS; NON-PARAMETRIC BAYESIAN MODELING; PARTIALLY INDEPENDENT CONDITIONAL (PIC) APPROXIMATION; SPARSE GAUSSIAN PROCESS; STATISTICAL PARAMETRIC SPEECH SYNTHESIS;

BAYESIAN NETWORKS; GAUSSIAN NOISE (ELECTRONIC); HIDDEN MARKOV MODELS; REGRESSION ANALYSIS; SPEECH SYNTHESIS; TELEPHONE SETS;

GAUSSIAN DISTRIBUTION;

EID: 84897902941 PISSN: 19324553 EISSN: None Source Type: Journal
DOI: 10.1109/JSTSP.2013.2283461 Document Type: Article

Times cited : (37)

References (36)

1
- 85009139544
- Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. EUROSPEECH, 1999, pp. 2347-2350
- (1999) Proc. EUROSPEECH , pp. 2347-2350
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

2
- 67651002140
- Statistical parametric speech synthesis
- H. Zen, K. Tokuda, and A. Black, "Statistical parametric speech synthesis," Speech Commun., vol. 51, no. 11, pp. 1039-1064, 2009
- (2009) Speech Commun , vol.51 , Issue.11 , pp. 1039-1064
- Zen, H.¹ Tokuda, K.² Black, A.³

3
- 0028996993
- Speech parameter generation from HMM using dynamic features
- K. Tokuda, T. Kobayashi, and S. Imai, "Speech parameter generation from HMM using dynamic features," in Proc. ICASSP '95, 1995, pp. 660-663
- (1995) Proc. ICASSP '95 , pp. 660-663
- Tokuda, K.¹ Kobayashi, T.² Imai, S.³

4
- 0003805597
- The use of context in large vocabulary speech recognition
- Cambridge, U.K
- J. J. Odell, "The use of context in large vocabulary speech recognition," Ph.D. dissertation, Univ. of Cambridge, Cambridge, U.K., 1995
- (1995) Ph.D. Dissertation Univ. of Cambridge
- Odell, J.J.¹

5
- 33846429403
- Minimum generation error training for HMM-based speech synthesis
- Y. J. Wu and R. H. Wang, "Minimum generation error training for HMM-based speech synthesis," in Proc. ICASSP, 2006, vol. 1, pp. 889-892
- (2006) Proc. ICASSP , vol.1 , pp. 889-892
- Wu, Y.J.¹ Wang, R.H.²

6
- 33749573927
- Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences
- DOI 10.1016/j.csl.2006.01.002, PII S0885230806000052
- H. Zen, K. Tokuda, and T. Kitamura, "Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences," Comput. Speech Lang., vol. 21, no. 1, pp. 153-173, 2007 (Pubitemid 44537647)
- (2007) Computer Speech and Language , vol.21 , Issue.1 , pp. 153-173
- Zen, H.¹ Tokuda, K.² Kitamura, T.³

7
- 70450175584
- Autoregressive HMMs for speech synthesis
- M. Shannon and W. Byrne, "Autoregressive HMMs for speech synthesis," in Proc. Interspeech, 2009, vol. 2009, pp. 400-403
- (2009) Proc. Interspeech , vol.2009 , pp. 400-403
- Shannon, M.¹ Byrne, W.²

8
- 34547514452
- A novel HMM-based TTS system using both continuous HMMs and discrete HMMs
- J. Yu, M. Zhang, J. Tao, and X. Wang, "A novel HMM-based TTS system using both continuous HMMs and discrete HMMs," in Proc. ICASSP, 2007, pp. 709-712
- (2007) Proc. ICASSP , pp. 709-712
- Yu, J.¹ Zhang, M.² Tao, J.³ Wang, X.⁴

9
- 70450161678
- Rich context modeling for high quality HMM-based TTS
- Z.-J. Yan, Y. Qian, and F. K. Soong, "Rich context modeling for high quality HMM-based TTS," in Proc. INTERSPEECH, 2009, pp. 1755-1758
- (2009) Proc. INTERSPEECH , pp. 1755-1758
- Yan, Z.-J.¹ Qian, Y.² Soong, F.K.³

10
- 56349089638
- Gaussian process regression for voice activity detection and speech enhancement
- S. Park and S. Choi, "Gaussian process regression for voice activity detection and speech enhancement," in Proc. IJCNN, 2008, pp. 2879-2882
- (2008) Proc. IJCNN , pp. 2879-2882
- Park, S.¹ Choi, S.²

11
- 84865737668
- Gaussian process experts for voice conversion
- N. C. V. Pilkington, H. Zen, and M. J. F. Gales, "Gaussian process experts for voice conversion," in Proc. INTERSPEECH, 2011, pp. 2761-2764
- (2011) Proc. INTERSPEECH , pp. 2761-2764
- Pilkington, N.C.V.¹ Zen, H.² Gales, M.J.F.³

12
- 84897905494
- Gaussian process dynamical models for phoneme classification
- H. Park and C. D. Yoo, "Gaussian process dynamical models for phoneme classification," in Proc. NIPS Workshop Bayesian Nonparametrics: Hope or Hype, 2011
- (2011) Proc. NIPS Workshop Bayesian Nonparametrics: Hope or Hype
- Park, H.¹ Yoo, C.D.²

13
- 84867596846
- Gaussian process dynamical models for nonparametric speech representation and synthesis
- G. Henter, M. Frean, and W. Kleijn, "Gaussian process dynamical models for nonparametric speech representation and synthesis," in Proc. ICASSP, 2012, pp. 4505-4508
- (2012) Proc. ICASSP , pp. 4505-4508
- Henter, G.¹ Frean, M.² Kleijn, W.³

14
- 25444448065
- Cambridge, MA, USA: MIT press
- C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning. Cambridge, MA, USA: MIT press, 2006
- (2006) Gaussian Processes for Machine Learning
- Rasmussen, C.E.¹ Williams, C.K.I.²

15
- 14644392676
- Cambridge, U.K.: Cambridge Univ. Press
- J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis. Cambridge, U.K.: Cambridge Univ. Press, 2004
- (2004) Kernel Methods for Pattern Analysis
- Shawe-Taylor, J.¹ Cristianini, N.²

16
- 78649536452
- A frame-based context-dependent acoustic modeling for speech recognition (Japanese)
- R. Terashima, H. Zen, Y. Nankaku, and K. Tokuda, "A frame-based context-dependent acoustic modeling for speech recognition (Japanese)," IEEJ Trans. Electron., Inf., Syst., vol. 130, no. 10, pp. 1856-1864, 2010
- (2010) IEEJ Trans. Electron., Inf., Syst , vol.130 , Issue.10 , pp. 1856-1864
- Terashima, R.¹ Zen, H.² Nankaku, Y.³ Tokuda, K.⁴

17
- 84867599581
- An F0 modeling technique based on prosodic events for spontaneous speech synthesis
- T. Koriyama, T. Nose, and T. Kobayashi, "An F0 modeling technique based on prosodic events for spontaneous speech synthesis," in Proc. ICASSP, 2012, pp. 4589-4593
- (2012) Proc. ICASSP , pp. 4589-4593
- Koriyama, T.¹ Nose, T.² Kobayashi, T.³

18
- 84890490547
- Statistical parametric speech synthesis using deep neural networks
- H. Zen, A. Senior, and M. Schuster, "Statistical parametric speech synthesis using deep neural networks," in Proc. ICASSP, 2013, pp. 7962-7966
- (2013) Proc. ICASSP , pp. 7962-7966
- Zen, H.¹ Senior, A.² Schuster, M.³

19
- 2642588255
- Orthogonalized distinctive phonetic feature extraction for noise-robust automatic speech recognition
- T. Fukuda and T. Nitta, "Orthogonalized distinctive phonetic feature extraction for noise-robust automatic speech recognition," IEICE Trans. Inf. Syst., vol. 87, no. 5, pp. 1110-1118, 2004
- (2004) IEICE Trans. Inf. Syst , vol.87 , Issue.5 , pp. 1110-1118
- Fukuda, T.¹ Nitta, T.²

20
- 0025475528
- ATR Japanese speech database as a tool of speech recognition and synthesis
- Aug
- A. Kurematsu, K. Takeda, Y. Sagisaka, S. Katagiri, H. Kuwabara, and K. Shikano, "ATR Japanese speech database as a tool of speech recognition and synthesis," Speech Commun., vol. 9, no. 4, pp. 357-363, Aug. 1990
- (1990) Speech Commun , vol.9 , Issue.4 , pp. 357-363
- Kurematsu, A.¹ Takeda, K.² Sagisaka, Y.³ Katagiri, S.⁴ Kuwabara, H.⁵ Shikano, K.⁶

21
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
- H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol. 27, no. 3-4, pp. 187-207, 1999
- (1999) Speech Commun , vol.27 , Issue.3-4 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² De Cheveigne, A.³

22
- 44449177634
- A hidden semi-Markov model-based speech synthesis
- H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "A hidden semi-Markov model-based speech synthesis," IEICE Trans. Inf. Syst., vol. 90, no. 5, pp. 825-834, 2007
- (2007) IEICE Trans. Inf. Syst , vol.90 , Issue.5 , pp. 825-834
- Zen, H.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

23
- 0033906251
- MDL-based context-dependent subword modeling for speech recognition
- K. Shinoda and T. Watanabe, "MDL-based context-dependent subword modeling for speech recognition," Acoust. Sci. Technol., vol. 21, no. 2, pp. 79-86, 2000 (Pubitemid 30594111)
- (2000) Journal of the Acoustical Society of Japan (E) (English translation of Nippon Onkyo Gakkaishi) , vol.21 , Issue.2 , pp. 79-86
- Shinoda Koichi¹ Watanabe Takao²

24
- 0027247004
- Mel-Cepstral distance measure for objective speech quality assessment
- R. Kubichek, "Mel-cepstral distance measure for objective speech quality assessment," in Proc. IEEE Pacific Rim Conf. Commun., Comput., Signal Process., 1993, vol. 1, pp. 125-128 (Pubitemid 23713438)
- (1993) IEEE Pac Rim Conf Commun Comput Signal Process , pp. 125-128
- Kubichek, R.F.¹

25
- 0004161441
- New York, NY, USA: Springer
- H. Wackernagel, Multivariate Geostatistics. New York, NY, USA: Springer, 2003
- (2003) Multivariate Geostatistics
- Wackernagel, H.¹

26
- 79960113740
- Local and global sparse Gaussian process approximations
- E. Snelson and Z. Ghahramani, "Local and global sparse Gaussian process approximations," in Proc. AISTATS, 2007
- (2007) Proc. AISTATS
- Snelson, E.¹ Ghahramani, Z.²

27
- 29144453489
- A unifying view of sparse approximate Gaussian process regression
- J. Quiñonero-Candela and C. E. Rasmussen, "A unifying view of sparse approximate Gaussian process regression," J. Mach. Learn. Res., vol. 6, pp. 1939-1959, 2005 (Pubitemid 41798128)
- (2005) Journal of Machine Learning Research , vol.6 , pp. 1939-1959
- Quinonero-Candela, J.¹ Rasmussen, C.E.²

28
- 84864038646
- Sparse Gaussian processes using pseudo-inputs
- MIT Press
- E. Snelson and Z. Ghahramani, "Sparse Gaussian processes using pseudo-inputs," in Proc. NIPS 18, MIT Press, 2006, pp. 1257-1264
- (2006) Proc. NIPS , vol.18
- Snelson, E.¹ Ghahramani, Z.²

29
- 0003474751
- Cambridge, U.K.: Cambridge Univ. Press
- W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C: The Art of Scientific Computing. Cambridge, U.K.: Cambridge Univ. Press, 1992
- (1992) Numerical Recipes in C: The Art of Scientific Computing
- Press, W.H.¹ Teukolsky, S.A.² Vetterling, W.T.³ Flannery, B.P.⁴

30
- 0004019973
- Univ. of California, Santa Cruz, CA, USA, Tech. Rep. UCSC-CRL-
- D. Haussler, "Convolution kernels on discrete structures," Dept. of Comput. Sci., Univ. of California, Santa Cruz, CA, USA, Tech. Rep. UCSC-CRL-99-10, 1999
- (1999) Convolution Kernels on Discrete Structures , Tech. Rep. UCSC-CRL-99-10
- Haussler, D.¹

31
- 38549096029
- A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
- T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 816-824, 2007
- (2007) IEICE Trans. Inf. Syst. , vol.90 , Issue.5 , pp. 816-824
- Toda, T.¹ Tokuda, K.²

32
- 85009097254
- Mixed excitation for HMM-based speech synthesis
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Mixed excitation for HMM-based speech synthesis," in Proc. Eurospeech, 2001, pp. 2263-2266
- (2001) Proc. Eurospeech , pp. 2263-2266
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

33
- 67650851754
- USTC system for Blizzard Challenge 2006: An improved HMM-based speech synthesis method
- Z.-H. Ling, Y.-J. Wu, Y.-P. Wang, L. Qin, and R.-H. Wang, "USTC system for Blizzard Challenge 2006: An improved HMM-based speech synthesis method," in Proc. Blizzard Challenge Workshop, 2006
- (2006) Proc. Blizzard Challenge Workshop
- Ling, Z.-H.¹ Wu, Y.-J.² Wang, Y.-P.³ Qin, L.⁴ Wang, R.-H.⁵

34
- 84865801900
- The effect of using normalized models in statistical speech synthesis
- M. Shannon, H. Zen, and W. Byrne, "The effect of using normalized models in statistical speech synthesis," in Proc. Interspeech, 2011, pp. 121-124
- (2011) Proc. Interspeech 2011 , pp. 121-124
- Shannon, M.¹ Zen, H.² Byrne, W.³

35
- 84862860004
- Nonstationary dependent Gaussian processes for data fusion in large-scale terrain modeling
- S. Vasudevan, F. Ramos, E. Nettleton, and H. Durrant-Whyte, "Nonstationary dependent Gaussian processes for data fusion in large-scale terrain modeling," in Proc. ICRA, 2011, pp. 1875-1882
- (2011) Proc. ICRA , pp. 1875-1882
- Vasudevan, S.¹ Ramos, F.² Nettleton, E.³ Durrant-Whyte, H.⁴

36
- 84898943255
- Warped Gaussian processes
- MIT Press
- E. Snelson, C. E. Rasmussen, and Z. Ghahramani, "Warped Gaussian processes," in Proc. NIPS 16, MIT Press, 2004, pp. 337-344
- (2004) Proc. NIPS , vol.16 , pp. 337-344
- Snelson, E.¹ Rasmussen, C.E.² Ghahramani, Z.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.