SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume 2016-May, 2016, Pages 5955-5959

Personalized speech recognition on mobile devices

(11) McGraw, Ian (a); Prabhavalkar, Rohit (a); Alvarez, Raziel (a); Arenas, Montse Gonzalez (a); Rao, Kanishka (a); Rybach, David (a); Alsharif, Ouais (a); Sak, Hasim (a); Gruenstein, Alexander (a); Beaufays, Francoise (a); Parada, Carolina (a)

a GOOGLE INC (United States)

Author keywords

CTC; embedded speech recognition; LSTM; model compression; quantization

EID: 84973402464
PISSN: 15206149
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2016.7472820
Document Type: Conference Paper

Times cited: 182

References (25)

1
- 84906251664
- Accurate and compact large vocabulary speech recognition on mobile devices
- ISCA
- Xin Lei, Andrew Senior, Alexander Gruenstein, and Jeffrey Sorensen, "Accurate and compact large vocabulary speech recognition on mobile devices," in INTERSPEECH, 2013, pp. 662-665, ISCA
- (2013) INTERSPEECH, pp. 662-665
- Lei, X.¹ Senior, A.² Gruenstein, A.³ Sorensen, J.⁴

2
- 34250704813
- Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks
- Alex Graves, Santiago Fernández, Faustino Gomez, and Jürgen Schmidhuber, "Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks," in ICML, 2006, pp. 369-376
- (2006) ICML, pp. 369-376
- Graves, A.¹ Fernández, S.² Gomez, F.³ Schmidhuber, J.⁴

3
- 70349213445
- Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
- IEEE
- Brian Kingsbury, "Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling," in ICASSP, 2009, pp. 3761-3764, IEEE
- (2009) ICASSP, pp. 3761-3764
- Kingsbury, B.¹

4
- 84906227589
- Restructuring of deep neural network acoustic models with singular value decomposition
- Jian Xue, Jinyu Li, and Yifan Gong, "Restructuring of deep neural network acoustic models with singular value decomposition," in INTERSPEECH, 2013, pp. 2365-2369
- (2013) INTERSPEECH, pp. 2365-2369
- Xue, J.¹ Li, J.² Gong, Y.³

5
- 84973402069
- On the compression of recurrent neural networks with an application to LVCSR acoustic modeling for embedded speech recognition
- IEEE
- Rohit Prabhavalkar, Ouais Alsharif, Antoine Bruguier, and Ian McGraw, "On the compression of recurrent neural networks with an application to LVCSR acoustic modeling for embedded speech recognition," in ICASSP, 2016, IEEE
- (2016) ICASSP
- Prabhavalkar, R.¹ Alsharif, O.² Bruguier, A.³ McGraw, I.⁴

6
- 84959152523
- Locally-connected and convolutional neural networks for small footprint speaker recognition
- ISCA
- Yu-hsin Chen, Ignacio Lopez-Moreno, Tara N. Sainath, Mirkó Visontai, Raziel Alvarez, and Carolina Parada, "Locally-connected and convolutional neural networks for small footprint speaker recognition," in INTERSPEECH, 2015, pp. 1136-1140, ISCA
- (2015) INTERSPEECH, pp. 1136-1140
- Chen, Y.¹ Lopez-Moreno, I.² Sainath, T.N.³ Visontai, M.⁴ Alvarez, R.⁵ Parada, C.⁶

7
- 84959104369
- Compressing deep neural networks using a rank-constrained topology
- ISCA
- Preetum Nakkiran, Raziel Alvarez, Rohit Prabhavalkar, and Carolina Parada, "Compressing deep neural networks using a rank-constrained topology," in INTERSPEECH, 2015, pp. 1473-1477, ISCA
- (2015) INTERSPEECH, pp. 1473-1477
- Nakkiran, P.¹ Alvarez, R.² Prabhavalkar, R.³ Parada, C.⁴

8
- 84965177696
- Structured transforms for small-footprint deep learning
- Vikas Sindhwani, Tara N. Sainath, and Sanjiv Kumar, "Structured transforms for small-footprint deep learning," in NIPS (to appear), 2015
- (2015) NIPS (To Appear)
- Sindhwani, V.¹ Sainath, T.N.² Kumar, S.³

9
- 84890454527
- Low-rank matrix factorization for deep neural network training with high-dimensional output targets
- IEEE
- Tara N. Sainath, Brian Kingsbury, Vikas Sindhwani, Ebru Arisoy, and Bhuvana Ramabhadran, "Low-rank matrix factorization for deep neural network training with high-dimensional output targets," in ICASSP, 2013, pp. 6655-6659, IEEE
- (2013) ICASSP, pp. 6655-6659
- Sainath, T.N.¹ Kingsbury, B.² Sindhwani, V.³ Arisoy, E.⁴ Ramabhadran, B.⁵

10
- 84946014836
- Small-footprint high-performance deep neural network-based speech recognition using split-VQ
- IEEE
- Yongqiang Wang, Jinyu Li, and Yifan Gong, "Small-footprint high-performance deep neural network-based speech recognition using split-VQ," in ICASSP, 2015, pp. 4984-4988, IEEE
- (2015) ICASSP, pp. 4984-4988
- Wang, Y.¹ Li, J.² Gong, Y.³

11
- 84959121080
- Transferring knowledge from a RNN to a DNN
- ISCA
- William Chan, Nan Rosemary Ke, and Ian Lane, "Transferring knowledge from a RNN to a DNN," in INTERSPEECH, 2015, ISCA
- (2015) INTERSPEECH
- Chan, W.¹ Rosemary Ke, N.² Lane, I.³

12
- 84946080066
- Improved recognition of contact names in voice commands
- Petar Aleksic, Cyril Allauzen, David Elson, Aleksandar Kracun, Diego Melendo Casado, and Pedro J. Moreno, "Improved recognition of contact names in voice commands," in ICASSP, 2015, pp. 5172-5175
- (2015) ICASSP, pp. 5172-5175
- Aleksic, P.¹ Allauzen, C.² Elson, D.³ Kracun, A.⁴ Melendo Casado, D.⁵ Moreno, P.J.⁶

13
- 84959101360
- Composition-based on-the-fly rescoring for salient n-gram biasing
- ISCA
- Keith Hall, Eunjoon Cho, Cyril Allauzen, Françoise Beaufays, Noah Coccaro, Kaisuke Nakajima, Michael Riley, Brian Roark, David Rybach, and Linda Zhang, "Composition-based on-the-fly rescoring for salient n-gram biasing," in INTERSPEECH, 2015, ISCA
- (2015) INTERSPEECH
- Hall, K.¹ Cho, E.² Allauzen, C.³ Beaufays, F.⁴ Coccaro, N.⁵ Nakajima, K.⁶ Riley, M.⁷ Roark, B.⁸ Rybach, D.⁹ Zhang, L.¹⁰

14
- 84910046405
- Long short-term memory recurrent neural network architectures for large scale acoustic modeling
- ISCA
- Haşim Sak, Andrew Senior, and Françoise Beaufays, "Long short-term memory recurrent neural network architectures for large scale acoustic modeling," in INTERSPEECH, 2014, pp. 338-342, ISCA
- (2014) INTERSPEECH, pp. 338-342
- Sak, H.¹ Senior, A.² Beaufays, F.³

15
- 84946084790
- Learning acoustic frame labeling for speech recognition with recurrent neural networks
- Haşim Sak, Andrew Senior, Kanishka Rao, Ozan Irsoy, Alex Graves, Françoise Beaufays, and Johan Schalkwyk, "Learning acoustic frame labeling for speech recognition with recurrent neural networks," in ICASSP, 2015, pp. 4280-4284
- (2015) ICASSP, pp. 4280-4284
- Sak, H.¹ Senior, A.² Rao, K.³ Irsoy, O.⁴ Graves, A.⁵ Beaufays, F.⁶ Schalkwyk, J.⁷

16
- 84959112739
- Fast and accurate recurrent neural network acoustic models for speech recognition
- ISCA
- Haşim Sak, Andrew Senior, Kanishka Rao, and Françoise Beaufays, "Fast and accurate recurrent neural network acoustic models for speech recognition," in INTERSPEECH, 2015, pp. 1468-1472, ISCA
- (2015) INTERSPEECH, pp. 1468-1472
- Sak, H.¹ Senior, A.² Rao, K.³ Beaufays, F.⁴

17
- 84865720088
- Unary data structures for language models
- ISCA
- Jeffrey Sorensen and Cyril Allauzen, "Unary data structures for language models," in INTERSPEECH, 2011, ISCA
- (2011) INTERSPEECH
- Sorensen, J.¹ Allauzen, C.²

18
- 84877760312
- Large scale distributed deep networks
- Jeffrey Dean, Greg S. Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc'Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, and Andrew Y. Ng, "Large scale distributed deep networks," in NIPS, 2012, pp. 1223-1231
- (2012) NIPS, pp. 1223-1231
- Dean, J.¹ Corrado, G.S.² Monga, R.³ Chen, K.⁴ Devin, M.⁵ Le, Q.V.⁶ Mao, M.Z.⁷ Ranzato, M.⁸ Senior, A.⁹ Tucker, P.¹⁰ Yang, K.¹¹ Ng, A.Y.¹²

19
- 84910072094
- Sequence discriminative distributed training of long short-term memory recurrent neural networks
- Haşim Sak, Oriol Vinyals, Georg Heigold, Andrew Senior, Erik McDermott, Rajat Monga, and Mark Mao, "Sequence discriminative distributed training of long short-term memory recurrent neural networks," in INTERSPEECH, 2014, pp. 1209-1213
- (2014) INTERSPEECH, pp. 1209-1213
- Sak, H.¹ Vinyals, O.² Heigold, G.³ Senior, A.⁴ McDermott, E.⁵ Monga, R.⁶ Mao, M.⁷

20
- 85013717482
- Academic Press, Inc., Orlando, FL, USA
- Alan C. Bovik, Handbook of Image and Video Processing (Communications, Networking and Multimedia), Academic Press, Inc., Orlando, FL, USA, 2005
- (2005) Handbook of Image and Video Processing (Communications, Networking and Multimedia)
- Bovik, A.C.¹

21
- 0003959189
- Springer US
- Allen Gersho and Robert M. Gray, Vector Quantization and Signal Compression, Springer US, 1992
- (1992) Vector Quantization and Signal Compression
- Gersho, A.¹ Gray, R.M.²

22
- 84877740429
- Improving the speed of neural networks on CPUs
- Vincent Vanhoucke, Andrew Senior, and Mark Mao, "Improving the speed of neural networks on CPUs," in Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011, 2011
- (2011) Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011
- Vanhoucke, V.¹ Senior, A.² Mao, M.³

23
- 84865772214
- Bayesian language model interpolation for mobile speech input
- Cyril Allauzen and Michael Riley, "Bayesian language model interpolation for mobile speech input," in INTERSPEECH, 2011, pp. 1429-1432
- (2011) INTERSPEECH, pp. 1429-1432
- Allauzen, C.¹ Riley, M.²

24
- 85075929453
- Speech recognition with weighted finite-state transducers
- Springer
- Mehryar Mohri, Fernando Pereira, and Michael Riley, "Speech recognition with weighted finite-state transducers," in Handbook of Speech Processing, Jacob Benesty, M. Sondhi, and Yiteng Huang, Eds., chapter 28, pp. 559-582, Springer, 2008
- (2008) Handbook of Speech Processing, pp. 559-582
- Mohri, M.¹ Pereira, F.² Riley, M.³

25
- 84946032010
- Grapheme-to-phoneme conversion using long short-term memory recurrent neural networks
- Kanishka Rao, Fuchun Peng, Haşim Sak, and Françoise Beaufays, "Grapheme-to-phoneme conversion using long short-term memory recurrent neural networks," in ICASSP, 2015
- (2015) ICASSP
- Rao, K.¹ Peng, F.² Sak, H.³ Beaufays, F.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.