SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume 2015-August, 2015, Pages 4455-4459

The effect of neural networks in statistical parametric speech synthesis

(4) Hashimoto, Kei (a); Oura, Keiichiro (a); Nankaku, Yoshihiko (a); Tokuda, Keiichi (a)

(a) Nagoya Institute of Technology (Japan)

Author keywords

deep neural network; hidden Markov model; Statistical parametric speech synthesis

Indexed keywords

DEEP NEURAL NETWORKS; HIDDEN MARKOV MODELS; SPEECH COMMUNICATION; SPEECH SYNTHESIS; STATISTICS;

ACOUSTIC MODEL; GENERATION PROCESS; GENERATIVE MODEL; STATISTICAL PARAMETRIC SPEECH SYNTHESIS; SYNTHESIZED SPEECH;

AUDIO SIGNAL PROCESSING;

EID: 84946074523 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2015.7178813 Document Type: Conference Paper

Times cited : (46)

References (21)

1
- 84876687945
- Speech synthesis based on hidden Markov models
- K. Tokuda, Y. Nankaku, T. Toda, H. Zen, J. Yamagishi, and K. Oura, "Speech synthesis based on hidden Markov models," Proceedings of the IEEE, vol. 101, no. 5, pp. 1234-1252, 2013
- (2013) Proceedings of the IEEE , vol.101 , Issue.5 , pp. 1234-1252
- Tokuda, K.¹ Nankaku, Y.² Toda, T.³ Zen, H.⁴ Yamagishi, J.⁵ Oura, K.⁶

2
- 0029765811
- Unit selection in a concatenative speech synthesis system using a large speech database
- A. Hunt and A.W. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," Proceedings of ICASSP 1996, pp. 373-376, 1996
- (1996) Proceedings of ICASSP 1996 , pp. 373-376
- Hunt, A.¹ Black, A.W.²

3
- 0034842740
- Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR
- M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, "Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR," Proceedings of ICASSP 2001, pp. 805-808, 2001
- (2001) Proceedings of ICASSP 2001 , pp. 805-808
- Tamura, M.¹ Masuko, T.² Tokuda, K.³ Kobayashi, T.⁴

4
- 85135145847
- Speaker interpolation in HMM-based speech synthesis system
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Speaker interpolation in HMM-based speech synthesis system," Proceedings of Eurospeech 1997, pp. 2523-2526, 1997
- (1997) Proceedings of Eurospeech 1997 , pp. 2523-2526
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

5
- 85009257840
- Eigenvoices for HMM-based speech synthesis
- K. Shichiri, A. Sawabe, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Eigenvoices for HMM-based speech synthesis," Proceedings of ICSLP 2002, pp. 1269-1272, 2002
- (2002) Proceedings of ICSLP 2002 , pp. 1269-1272
- Shichiri, K.¹ Sawabe, A.² Tokuda, K.³ Masuko, T.⁴ Kobayashi, T.⁵ Kitamura, T.⁶

6
- 33847129573
- Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
- J. Yamagishi and T. Kobayashi, "Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training," IEICE Transactions on Information & Systems, vol. E90-D, no. 2, pp. 533-543, 2007
- (2007) IEICE Transactions on Information & Systems , vol.E90-D , Issue.2 , pp. 533-543
- Yamagishi, J.¹ Kobayashi, T.²

7
- 33846935000
- HMM-based Korean speech synthesis system for hand-held devices
- S.J. Kim, J.J. Kim, and M.S. Hahn, "HMM-based Korean speech synthesis system for hand-held devices," IEEE Trans. Consum. Electron., vol. 52, no. 4, pp. 1384-1390, 2006
- (2006) IEEE Trans. Consum. Electron , vol.52 , Issue.4 , pp. 1384-1390
- Kim, S.J.¹ Kim, J.J.² Hahn, M.S.³

8
- 79959839868
- Quantized HMMs for low footprint text-to-speech synthesis
- A. Gutkin, X. Gonzalvo, S. Breuer, and P. Taylor, "Quantized HMMs for low footprint text-to-speech synthesis," Proceedings of Interspeech 2010, pp. 837-840, 2010
- (2010) Proceedings of Interspeech 2010 , pp. 837-840
- Gutkin, A.¹ Gonzalvo, X.² Breuer, S.³ Taylor, P.⁴

9
- 85009139544
- Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," Proceedings of Eurospeech 1999, pp. 2347-2350, 1999
- (1999) Proceedings of Eurospeech 1999 , pp. 2347-2350
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

10
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition," IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97, 2012
- (2012) IEEE Signal Processing Magazine , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰ Kingsbury, B.¹¹

11
- 84890490547
- Statistical parametric speech synthesis using deep neural networks
- H. Zen, A. Senior, and M. Schuster, "Statistical parametric speech synthesis using deep neural networks," Proceedings of ICASSP 2013, pp. 7962-7966, 2013
- (2013) Proceedings of ICASSP 2013 , pp. 7962-7966
- Zen, H.¹ Senior, A.² Schuster, M.³

12
- 84929157442
- Combining a vector space representation of linguistic context with a deep neural network for text-to-speech synthesis
- H. Lu, S. King, and O. Watts, "Combining a vector space representation of linguistic context with a deep neural network for text-to-speech synthesis," Proceedings of ISCA SSW8, pp. 281-285, 2013
- (2013) Proceedings of ISCA SSW8 , pp. 281-285
- Lu, H.¹ King, S.² Watts, O.³

13
- 84905251808
- On the training aspects of deep neural network (DNN) for parametric TTS synthesis
- Y. Qian, Y. Fan, H. Wenping, and F.K. Soong, "On the training aspects of deep neural network (DNN) for parametric TTS synthesis," Proceedings of ICASSP 2014, pp. 3857-3861, 2014
- (2014) Proceedings of ICASSP 2014 , pp. 3857-3861
- Qian, Y.¹ Fan, Y.² Wenping, H.³ Soong, F.K.⁴

14
- 0002144369
- Tree-based state tying for high accuracy acoustic modelling
- S. Young, J.J. Odell, and P. Woodland, "Tree-based state tying for high accuracy acoustic modelling," Proceedings of ARPA Workshop on Human Language Technology, pp. 307-312, 1994
- (1994) Proceedings of ARPA Workshop on Human Language Technology , pp. 307-312
- Young, S.¹ Odell, J.J.² Woodland, P.³

15
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," Proceedings of ICASSP 2000, pp. 936-939, 2000
- (2000) Proceedings of ICASSP 2000 , pp. 936-939
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

16
- 0025475528
- ATR Japanese speech database as a tool of speech recognition and synthesis
- A. Kurematsu, K. Takeda, Y. Sagisaka, S. Katagiri, H. Kuwabara, and K. Shikano, "ATR Japanese speech database as a tool of speech recognition and synthesis," Speech Communication, vol. 9, pp. 357-363, 1990
- (1990) Speech Communication , vol.9 , pp. 357-363
- Kurematsu, A.¹ Takeda, K.² Sagisaka, Y.³ Katagiri, S.⁴ Kuwabara, H.⁵ Shikano, K.⁶

17
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
- H. Kawahara, I. Masuda-Katsuse, and A. Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, pp. 187-207, 1999
- (1999) Speech Communication , vol.27 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² Cheveigne, A.³

18
- 85135145174
- Acoustic modeling based on the MDL criterion for speech recognition
- K. Shinoda and T. Watanabe, "Acoustic modeling based on the MDL criterion for speech recognition," Proceedings of Eurospeech 1997, pp. 99-102, 1997
- (1997) Proceedings of Eurospeech 1997 , pp. 99-102
- Shinoda, K.¹ Watanabe, T.²

19
- 84910100893
- DNN-based stochastic postfilter for HMM-based speech synthesis
- L.H. Chen, T. Raitio, C. Valentini-Botinhao, J. Yamagishi, and Z.H. Ling, "DNN-based stochastic postfilter for HMM-based speech synthesis," Proceedings of Interspeech 2014, pp. 1954-1958, 2014
- (2014) Proceedings of Interspeech 2014 , pp. 1954-1958
- Chen, L.H.¹ Raitio, T.² Valentini-Botinhao, C.³ Yamagishi, J.⁴ Ling, Z.H.⁵

20
- 84910047819
- TTS synthesis with bidirectional LSTM based recurrent neural networks
- Y. Fan, Y. Qian, F.L. Xie, F.K. Soong, and H. Li, "TTS synthesis with bidirectional LSTM based recurrent neural networks," Proceedings of Interspeech 2014, pp. 1964-1968, 2014
- (2014) Proceedings of Interspeech 2014 , pp. 1964-1968
- Fan, Y.¹ Qian, Y.² Xie, F.L.³ Soong, F.K.⁴ Li, H.⁵

21
- 38549096029
- A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
- T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Transactions on Information & Systems, vol. E90-D, no. 5, pp. 816-824, 2007
- (2007) IEICE Transactions on Information & Systems , vol.E90-D , Issue.5 , pp. 816-824
- Toda, T.¹ Tokuda, K.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.