SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume 2017-August, 2017, Pages 4011-4015

Siri on-device deep learning-guided unit selection text-to-speech system

(18) Capes, Tim ᵃ; Coles, Paul ᵃ; Conkie, Alistair ᵃ; Golipour, Ladan ᵃ; Hadjitarkhani, Abie ᵃ; Hu, Qiong ᵃ; Huddleston, Nancy ᵃ; Hunt, Melvyn ᵃ; Li, Jiangchuan ᵃ; Neeracher, Matthias ᵃ; Prahallad, Kishore ᵃ; Raitio, Tuomo ᵃ; Rasipuram, Ramya ᵃ; Townsend, Greg ᵃ; Williamson, Becci ᵃ; Winarsky, David ᵃ; Wu, Zhizheng ᵃ; Zhang, Hepeng ᵃ

ᵃ Apple Inc. (United States)

Author keywords

Hybrid; On-device; Recurrent mixture density network; Speech synthesis; Unit selection

Indexed keywords

DEEP LEARNING; MIXTURES; SPEECH; SPEECH SYNTHESIS;

DEVICE CAPABILITIES; HYBRID; LEARNING TECHNIQUES; MIXTURE DENSITY; TEXT-TO-SPEECH SYSTEM; UNIT SELECTION; UNIT-SELECTION SPEECH SYNTHESIS; VOICE BUILDING PROCESS;

SPEECH COMMUNICATION;

EID: 85039170210 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: 10.21437/Interspeech.2017-1798 Document Type: Conference Paper

Times cited : (75)

References (17)

1
- 67651002140
- Statistical parametric speech synthesis
- H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Communication, vol. 51, no. 11, pp. 1039-1064, 2009.
- (2009) Speech Communication , vol.51 , Issue.11 , pp. 1039-1064
- Zen, H.¹ Tokuda, K.² Black, A.W.³

2
- 0029765811
- Unit selection in a concatenative speech synthesis system using a large speech database
- A. J. Hunt and A. W. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 1, 1996, pp. 373-376.
- (1996) IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) , vol.1 , pp. 373-376
- Hunt, A.J.¹ Black, A.W.²

3
- 85015421291
- Unit size in unit selection speech synthesis
- S. P. Kishore and A. W. Black, "Unit size in unit selection speech synthesis," in Interspeech, 2003.
- (2003) Interspeech
- Kishore, S.P.¹ Black, A.W.²

4
- 67650816595
- The USTC and iFlytek speech synthesis systems for Blizzard Challenge 2007
- Z.-H. Ling, L. Qin, H. Lu, Y. Gao, L.-R. Dai, R.-H. Wang, Y. Jiang, Z.-W. Zhao, J.-H. Yang, J. Chen et al., "The USTC and iFlytek speech synthesis systems for Blizzard Challenge 2007," in Blizzard Challenge Workshop, 2007.
- (2007) Blizzard Challenge Workshop
- Ling, Z.-H.¹ Qin, L.² Lu, H.³ Gao, Y.⁴ Dai, L.-R.⁵ Wang, R.-H.⁶ Jiang, Y.⁷ Zhao, Z.-W.⁸ Yang, J.-H.⁹ Chen, J.¹⁰

5
- 84959124410
- Using deep bidirectional recurrent neural networks for prosodic-target prediction in a unit-selection text-to-speech system
- R. Fernandez, A. Rendel, B. Ramabhadran, and R. Hoory, "Using deep bidirectional recurrent neural networks for prosodic-target prediction in a unit-selection text-to-speech system," in Interspeech, 2015, pp. 1606-1610.
- (2015) Interspeech , pp. 1606-1610
- Fernandez, R.¹ Rendel, A.² Ramabhadran, B.³ Hoory, R.⁴

6
- 84973402504
- Deep neural network-guided unit selection synthesis
- T. Merritt, R. A. Clark, Z. Wu, J. Yamagishi, and S. King, "Deep neural network-guided unit selection synthesis," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016, pp. 5145-5149.
- (2016) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 5145-5149
- Merritt, T.¹ Clark, R.A.² Wu, Z.³ Yamagishi, J.⁴ King, S.⁵

7
- 85039149607
- The USTC system for Blizzard Challenge 2016
- L.-H. Chen, Y. Jiang, M. Zhou, Z.-H. Ling, and L.-R. Dai, "The USTC system for Blizzard Challenge 2016," in Blizzard Challenge Workshop, 2016.
- (2016) Blizzard Challenge Workshop
- Chen, L.-H.¹ Jiang, Y.² Zhou, M.³ Ling, Z.-H.⁴ Dai, L.-R.⁵

8
- 34047123652
- Multisyn: Open-domain unit selection for the Festival speech synthesis system
- R. A. Clark, K. Richmond, and S. King, "Multisyn: Open-domain unit selection for the Festival speech synthesis system," Speech Communication, vol. 49, no. 4, pp. 317-330, 2007.
- (2007) Speech Communication , vol.49 , Issue.4 , pp. 317-330
- Clark, R.A.¹ Richmond, K.² King, S.³

9
- 84994309294
- Recent advances in Google real-time HMM-driven unit selection synthesizer
- X. Gonzalvo, S. Tazari, C.-A. Chan, M. Becker, A. Gutkin, and H. Silen, "Recent advances in Google real-time HMM-driven unit selection synthesizer," in Interspeech, 2016, pp. 2238-2242.
- (2016) Interspeech , pp. 2238-2242
- Gonzalvo, X.¹ Tazari, S.² Chan, C.-A.³ Becker, M.⁴ Gutkin, A.⁵ Silen, H.⁶

10
- 0004113976
- Mixture density networks
- C. Bishop, "Mixture density networks," Tech. Rep. NCRG/94/004, Neural Computing Research Group, Aston University, 1994.
- (1994) Tech. Rep. NCRG/94/004, Neural Computing Research Group, Aston University
- Bishop, C.¹

11
- 84905262874
- Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis
- H. Zen and A. Senior, "Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis," in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2014, pp. 3872-3876.
- (2014) IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) , pp. 3872-3876
- Zen, H.¹ Senior, A.²

12
- 84939821078
- Empirical evaluation of gated recurrent neural networks on sequence modeling
- J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, "Empirical evaluation of gated recurrent neural networks on sequence modeling," arXiv preprint arXiv:1412.3555, 2014.
- (2014) arXiv preprint arXiv:1412.3555
- Chung, J.¹ Gulcehre, C.² Cho, K.³ Bengio, Y.⁴

13
- 84973355618
- Investigating gated recurrent networks for speech synthesis
- Z. Wu and S. King, "Investigating gated recurrent networks for speech synthesis," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2016, pp. 5140-5144.
- (2016) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE , pp. 5140-5144
- Wu, Z.¹ King, S.²

14
- 84946033275
- Deep neural networks employing multi-task learning and stacked bottleneck features for speech synthesis
- Z. Wu, C. Valentini-Botinhao, O. Watts, and S. King, "Deep neural networks employing multi-task learning and stacked bottleneck features for speech synthesis," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015, pp. 4460-4464.
- (2015) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 4460-4464
- Wu, Z.¹ Valentini-Botinhao, C.² Watts, O.³ King, S.⁴

15
- 34047268342
- Conversational speech synthesis and the need for some laughter
- N. Campbell, "Conversational speech synthesis and the need for some laughter," IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 4, pp. 1171-1178, 2006.
- (2006) IEEE Transactions on Audio, Speech, and Language Processing , vol.14 , Issue.4 , pp. 1171-1178
- Campbell, N.¹

16
- 84959121380
- Pruning redundant synthesis units based on static and delta unit appearance frequency
- H. Lu, W. Zhang, X. Shao, Q. Zhou, W. Lei, H. Zhou, and A. Breen, "Pruning redundant synthesis units based on static and delta unit appearance frequency," in Interspeech, 2015.
- (2015) Interspeech
- Lu, H.¹ Zhang, W.² Shao, X.³ Zhou, Q.⁴ Lei, W.⁵ Zhou, H.⁶ Breen, A.⁷

17
- 79551478696
- The Romanian speech synthesis (RSS) corpus: Building a high quality HMM-based speech synthesis system using a high sampling rate
- A. Stan, J. Yamagishi, S. King, and M. Aylett, "The Romanian speech synthesis (RSS) corpus: Building a high quality HMM-based speech synthesis system using a high sampling rate," Speech Communication, vol. 53, no. 3, pp. 442-450, 2011.
- (2011) Speech Communication , vol.53 , Issue.3 , pp. 442-450
- Stan, A.¹ Yamagishi, J.² King, S.³ Aylett, M.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.