-
1
-
-
84958264664
-
-
Abadi, Martín, Agarwal, Ashish, Barham, Paul, Brevdo, Eugene, Chen, Zhifeng, Citro, Craig, Corrado, Greg S., Davis, Andy, Dean, Jeffrey, Devin, Matthieu, Ghemawat, Sanjay, Goodfellow, Ian, Harp, Andrew, Irving, Geoffrey, Isard, Michael, Jia, Yangqing, Jozefowicz, Rafal, Kaiser, Lukasz, Kudlur, Manjunath, Levenberg, Josh, Mané, Dan, Monga, Rajat, Moore, Sherry, Murray, Derek, Olah, Chris, Schuster, Mike, Shlens, Jonathon, Steiner, Benoit, Sutskever, Ilya, Talwar, Kunal, Tucker, Paul, Vanhoucke, Vincent, Vasudevan, Vijay, Viégas, Fernanda, Vinyals, Oriol, Warden, Pete, Wattenberg, Martin, Wicke, Martin, Yu, Yuan, and Zheng, Xiaoqiang. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. URL http://tensorflow.org/. Software available from tensorflow.org.
-
(2015)
TensorFlow: Large-scale Machine Learning on Heterogeneous Systems
-
-
Abadi, M.1
Agarwal, A.2
Barham, P.3
Brevdo, E.4
Chen, Z.5
Citro, C.6
Corrado, G.S.7
Davis, A.8
Dean, J.9
Devin, M.10
Ghemawat, S.11
Goodfellow, I.12
Harp, A.13
Irving, G.14
Isard, M.15
Jia, Y.16
Jozefowicz, R.17
Kaiser, L.18
Kudlur, M.19
Levenberg, J.20
Mané, D.21
Monga, R.22
Moore, S.23
Murray, D.24
Olah, C.25
Schuster, M.26
Shlens, J.27
Steiner, B.28
Sutskever, I.29
Talwar, K.30
Tucker, P.31
Vanhoucke, V.32
Vasudevan, V.33
Viégas, F.34
Vinyals, O.35
Warden, P.36
Wattenberg, M.37
Wicke, M.38
Yu, Y.39
Zheng, X.40
-
2
-
-
84971463350
-
-
Amodei, Dario, Anubhai, Rishita, Battenberg, Eric, Case, Carl, Casper, Jared, Catanzaro, Bryan, Chen, Jingdong, Chrzanowski, Mike, Coates, Adam, Diamos, Greg, et al. Deep speech 2: End-to-end speech recognition in english and mandarin. arXiv preprint arXiv:1512.02595, 2015.
-
(2015)
Deep Speech 2: End-to-end Speech Recognition in English and Mandarin
-
-
Amodei, D.1
Anubhai, R.2
Battenberg, E.3
Case, C.4
Casper, J.5
Catanzaro, B.6
Chen, J.7
Chrzanowski, M.8
Coates, A.9
Diamos, G.10
-
3
-
-
4444257069
-
Praat, a system for doing phonetics by computer
-
Boersma, Paulus Petrus Gerardus et al. Praat, a system for doing phonetics by computer. Glot international, 5, 2002.
-
(2002)
Glot International
, vol.5
-
-
Boersma, P.P.G.1
-
4
-
-
85037362563
-
-
Bradbury, James, Merity, Stephen, Xiong, Caiming, and Socher, Richard. Quasi-recurrent neural networks. arXiv preprint arXiv:1611.01576, 2016.
-
(2016)
Quasi-recurrent Neural Networks
-
-
Bradbury, J.1
Merity, S.2
Xiong, C.3
Socher, R.4
-
5
-
-
84939821078
-
-
Chung, Junyoung, Gulcehre, Caglar, Cho, KyungHyun, and Bengio, Yoshua. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555, 2014.
-
(2014)
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
-
-
Chung, J.1
Gulcehre, C.2
Cho, K.3
Bengio, Y.4
-
6
-
-
85048676855
-
Persistent rnns: Stashing recurrent weights on-chip
-
Diamos, Greg, Sengupta, Shubho, Catanzaro, Bryan, Chrzanowski, Mike, Coates, Adam, Elsen, Erich, Engel, Jesse, Hannun, Awni, and Satheesh, Sanjeev. Persistent rnns: Stashing recurrent weights on-chip. In Proceedings of The 33rd International Conference on Machine Learning, pp. 2024-2033, 2016.
-
(2016)
Proceedings of the 33rd International Conference on Machine Learning
, pp. 2024-2033
-
-
Diamos, G.1
Sengupta, S.2
Catanzaro, B.3
Chrzanowski, M.4
Coates, A.5
Elsen, E.6
Engel, J.7
Hannun, A.8
Satheesh, S.9
-
8
-
-
34250704813
-
Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks
-
New York, NY, USA, ACM
-
Graves, Alex, Fernández, Santiago, Gomez, Faustino, and Schmidhuber, Jürgen. Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks. In Proceedings of the 23rd International Conference on Machine Learning, ICML'06, pp. 369-376, New York, NY, USA, 2006. ACM.
-
(2006)
Proceedings of the 23rd International Conference on Machine Learning, ICML'06
, pp. 369-376
-
-
Graves, A.1
Fernández, S.2
Gomez, F.3
Schmidhuber, J.4
-
10
-
-
85039166060
-
-
Mehri, Soroush, Kumar, Kundan, Gulrajani, Ishaan, Kumar, Rithesh, Jain, Shubham, Sotelo, Jose, Courville, Aaron, and Bengio, Yoshua. Samplernn: An unconditional end-to-end neural audio generation model. arXiv preprint arXiv:1612.07837, 2016.
-
(2016)
Samplernn: An Unconditional End-to-end Neural Audio Generation Model
-
-
Mehri, S.1
Kumar, K.2
Gulrajani, I.3
Kumar, R.4
Jain, S.5
Sotelo, J.6
Courville, A.7
Bengio, Y.8
-
11
-
-
84976902575
-
World: A vocoder-based high-quality speech synthesis system for real-time applications
-
Morise, Masanori, Yokomori, Fumiya, and Ozawa, Kenji. World: a vocoder-based high-quality speech synthesis system for real-time applications. IEICE Transactions on Information and Systems, 99(7):1877-1884, 2016.
-
(2016)
IEICE Transactions on Information and Systems
, vol.99
, Issue.7
, pp. 1877-1884
-
-
Morise, M.1
Yokomori, F.2
Ozawa, K.3
-
13
-
-
85039156182
-
-
Paine, Tom Le, Khorrami, Pooya, Chang, Shiyu, Zhang, Yang, Ramachandran, Prajit, Hasegawa-Johnson, Mark A, and Huang, Thomas S. Fast wavenet generation algorithm. arXiv preprint arXiv:1611.09482, 2016.
-
(2016)
Fast Wavenet Generation Algorithm
-
-
Paine, T.L.1
Khorrami, P.2
Chang, S.3
Zhang, Y.4
Ramachandran, P.5
Hasegawa-Johnson, M.A.6
Huang, T.S.7
-
14
-
-
85048678744
-
Multi-output rnn-lstm for multiple speaker speech synthesis with α-interpolation model
-
Pascual, Santiago and Bonafonte, Antonio. Multi-output rnn-lstm for multiple speaker speech synthesis with α-interpolation model. way, 1000:2, 2016.
-
(2016)
Way
, vol.1000
, pp. 2
-
-
Pascual, S.1
Bonafonte, A.2
-
15
-
-
84984782848
-
The blizzard challenge 2013 - indian language task
-
Prahallad, Kishore, Vadapalli, Anandaswarup, Elluru, Naresh, et al. The blizzard challenge 2013 - indian language task. In Blizzard Challenge Workshop 2013, 2013.
-
(2013)
Blizzard Challenge Workshop 2013
-
-
Prahallad, K.1
Vadapalli, A.2
Elluru, N.3
-
16
-
-
84946032010
-
Grapheme-to-phoneme conversion using long short-term memory recurrent neural networks
-
IEEE
-
Rao, Kanishka, Peng, Fuchun, Sak, Hasim, and Beaufays, Françoise. Grapheme-to-phoneme conversion using long short-term memory recurrent neural networks. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, pp. 4225-4229. IEEE, 2015.
-
(2015)
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
, pp. 4225-4229
-
-
Rao, K.1
Peng, F.2
Sak, H.3
Beaufays, F.4
-
17
-
-
80051607565
-
Crowdmos: An approach for crowdsourcing mean opinion score studies
-
IEEE
-
Ribeiro, Flávio, Florencio, Dinei, Zhang, Cha, and Seltzer, Michael. Crowdmos: An approach for crowdsourcing mean opinion score studies. In Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on, pp. 2416-2419. IEEE, 2011.
-
(2011)
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
, pp. 2416-2419
-
-
Ribeiro, F.1
Florencio, D.2
Zhang, C.3
Seltzer, M.4
-
18
-
-
84994213378
-
A template-based approach for speech synthesis intonation generation using lstms
-
Ronanki, Srikanth, Henter, Gustav Eje, Wu, Zhizheng, and King, Simon. A template-based approach for speech synthesis intonation generation using lstms. Interspeech 2016, pp. 2463-2467, 2016.
-
(2016)
Interspeech 2016
, pp. 2463-2467
-
-
Ronanki, S.1
Henter, G.E.2
Wu, Z.3
King, S.4
-
19
-
-
85122685393
-
Char2wav: End-to-end speech synthesis
-
Sotelo, Jose, Mehri, Soroush, Kumar, Kundan, Santos, Joao Felipe, Kastner, Kyle, Courville, Aaron, and Bengio, Yoshua. Char2wav: End-to-end speech synthesis. In ICLR 2017 workshop submission, 2017. URL https://openreview.net/forum?id=B1VWyySKx.
-
(2017)
ICLR 2017 Workshop Submission
-
-
Sotelo, J.1
Mehri, S.2
Kumar, K.3
Santos, J.F.4
Kastner, K.5
Courville, A.6
Bengio, Y.7
-
21
-
-
84925160976
-
-
Cambridge University Press, New York, NY, USA, 1st edition, 9780521899277
-
Taylor, Paul. Text-to-Speech Synthesis. Cambridge University Press, New York, NY, USA, 1st edition, 2009. ISBN 0521899273, 9780521899277.
-
(2009)
Text-to-speech Synthesis
-
-
Taylor, P.1
-
23
-
-
85017259342
-
Wavenet: A generative model for raw audio
-
1609.03499
-
van den Oord, Aäron, Dieleman, Sander, Zen, Heiga, Simonyan, Karen, Vinyals, Oriol, Graves, Alex, Kalchbrenner, Nal, Senior, Andrew, and Kavukcuoglu, Koray. Wavenet: A generative model for raw audio. CoRR abs/1609.03499, 2016.
-
(2016)
CoRR
-
-
Van Den Oord, A.1
Dieleman, S.2
Zen, H.3
Simonyan, K.4
Vinyals, O.5
Graves, A.6
Kalchbrenner, N.7
Senior, A.8
Kavukcuoglu, K.9
-
26
-
-
84946045510
-
Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis
-
IEEE
-
Zen, Heiga and Sak, Hasim. Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, pp. 4470-4474. IEEE, 2015.
-
(2015)
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
, pp. 4470-4474
-
-
Zen, H.1
Sak, H.2
-
27
-
-
84890490547
-
Statistical parametric speech synthesis using deep neural networks
-
Zen, Heiga, Senior, Andrew, and Schuster, Mike. Statistical parametric speech synthesis using deep neural networks. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 7962-7966, 2013.
-
(2013)
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
, pp. 7962-7966
-
-
Zen, H.1
Senior, A.2
Schuster, M.3