SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume 2015-January, 2015, Pages 1101-1105

On speaker adaptation of long short-term memory recurrent neural networks

(2) Miao, Yajie (a); Metze, Florian (a)

(a) Carnegie Mellon University (United States)

Author keywords

Acoustic modeling; Long short term memory; Recurrent neural network; Speaker adaptation

Indexed keywords

BRAIN; SPEECH COMMUNICATION; SPEECH RECOGNITION;

ACOUSTIC MODEL; ADAPTATION TECHNIQUES; BENCHMARK DATASETS; LONG SHORT TERM MEMORY; RECURRENT NEURAL NETWORK (RNN); SPEAKER ADAPTATION; SPEAKER DEPENDENTS; TEMPORAL DYNAMICS;

RECURRENT NEURAL NETWORKS;

EID: 84938725974 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (47)

References (34)

1
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 30-42, 2012.
- (2012) Audio, Speech, and Language Processing, IEEE Transactions on , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

2
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82-97, 2012.
- (2012) Signal Processing Magazine, IEEE , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.-R.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰

3
- 84893701756
- Deep maxout networks for low-resource speech recognition
- Y. Miao, F. Metze, and S. Rawat, "Deep maxout networks for low-resource speech recognition," in Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on. IEEE, 2013, pp. 398-403.
- (2013) Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop On. IEEE , pp. 398-403
- Miao, Y.¹ Metze, F.² Rawat, S.³

4
- 84890525984
- Deep convolutional neural networks for lvcsr
- T. N. Sainath, A.-r. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep convolutional neural networks for lvcsr," in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 8614-8618.
- (2013) Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference On. IEEE , pp. 8614-8618
- Sainath, T.N.¹ Mohamed, A.-R.² Kingsbury, B.³ Ramabhadran, B.⁴

5
- 84911473441
- Convolutional neural networks for speech recognition
- O. Abdel-Hamid, A.-R. Mohamed, H. Jiang, L. Deng, G. Penn, and D. Yu, "Convolutional neural networks for speech recognition," IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), vol. 22, no. 10, pp. 1533-1545, 2014.
- (2014) IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP) , vol.22 , Issue.10 , pp. 1533-1545
- Abdel-Hamid, O.¹ Mohamed, A.-R.² Jiang, H.³ Deng, L.⁴ Penn, G.⁵ Yu, D.⁶

6
- 84910028405
- Improving language-universal feature extraction with deep maxout and convolutional neural networks
- Y. Miao and F. Metze, "Improving language-universal feature extraction with deep maxout and convolutional neural networks," in Fifteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA, 2014.
- (2014) Fifteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA
- Miao, Y.¹ Metze, F.²

7
- 84905265980
- Joint training of convolutional and non-convolutional neural networks
- H. Soltau, G. Saon, and T. N. Sainath, "Joint training of convolutional and non-convolutional neural networks," in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. IEEE, 2014, pp. 5572-5576.
- (2014) Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference On. IEEE , pp. 5572-5576
- Soltau, H.¹ Saon, G.² Sainath, T.N.³

8
- 84890543083
- Speech recognition with deep recurrent neural networks
- A. Graves, A.-R. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 6645-6649.
- (2013) Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE , pp. 6645-6649
- Graves, A.¹ Mohamed, A.-R.² Hinton, G.³

9
- 84878409063
- Recurrent neural networks for noise reduction in robust asr
- A. L. Maas, Q. V. Le, T. M. O'Neil, O. Vinyals, P. Nguyen, and A. Y. Ng, "Recurrent neural networks for noise reduction in robust asr," in Thirteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA, 2012.
- (2012) Thirteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA
- Maas, A.L.¹ Le, Q.V.² O'Neil, T.M.³ Vinyals, O.⁴ Nguyen, P.⁵ Ng, A.Y.⁶

10
- 0028392483
- Learning long-term dependencies with gradient descent is difficult
- Y. Bengio, P. Simard, and P. Frasconi, "Learning long-term dependencies with gradient descent is difficult," Neural Networks, IEEE Transactions on, vol. 5, no. 2, pp. 157-166, 1994.
- (1994) Neural Networks, IEEE Transactions on , vol.5 , Issue.2 , pp. 157-166
- Bengio, Y.¹ Simard, P.² Frasconi, P.³

11
- 0031573117
- Long short-term memory
- S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural computation, vol. 9, no. 8, pp. 1735-1780, 1997.
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

12
- 84893701254
- Hybrid speech recognition with deep bidirectional lstm
- A. Graves, N. Jaitly, and A.-R. Mohamed, "Hybrid speech recognition with deep bidirectional lstm," in Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on. IEEE, 2013, pp. 273-278.
- (2013) Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop On. IEEE , pp. 273-278
- Graves, A.¹ Jaitly, N.² Mohamed, A.-R.³

13
- 84910046405
- Long short-term memory recurrent neural network architectures for large scale acoustic modeling
- H. Sak, A. Senior, and F. Beaufays, "Long short-term memory recurrent neural network architectures for large scale acoustic modeling," in Fifteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA, 2014.
- (2014) Fifteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA
- Sak, H.¹ Senior, A.² Beaufays, F.³

14
- 84910072094
- Sequence discriminative distributed training of long short-term memory recurrent neural networks
- H. Sak, O. Vinyals, G. Heigold, A. Senior, E. McDermott, R. Monga, and M. Mao, "Sequence discriminative distributed training of long short-term memory recurrent neural networks," in Fifteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA, 2014.
- (2014) Fifteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA
- Sak, H.¹ Vinyals, O.² Heigold, G.³ Senior, A.⁴ McDermott, E.⁵ Monga, R.⁶ Mao, M.⁷

15
- 84959127999
- Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition
- X. Li and X. Wu, "Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition," arXiv preprint arXiv:1410.4281, 2014.
- (2014) arXiv preprint arXiv:1410.4281
- Li, X.¹ Wu, X.²

16
- 85083953021
- Feature learning in deep neural networks - studies on speech recognition tasks
- D. Yu, M. L. Seltzer, J. Li, J.-T. Huang, and F. Seide, "Feature learning in deep neural networks - studies on speech recognition tasks," arXiv preprint arXiv:1301.3605, 2013.
- (2013) arXiv preprint arXiv:1301.3605
- Yu, D.¹ Seltzer, M.L.² Li, J.³ Huang, J.-T.⁴ Seide, F.⁵

17
- 84910068044
- Distributed learning of multilingual dnn feature extractors using GPUs
- Y. Miao, H. Zhang, and F. Metze, "Distributed learning of multilingual dnn feature extractors using gpus," in Fifteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA, 2014.
- (2014) Fifteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA
- Miao, Y.¹ Zhang, H.² Metze, F.³

18
- 84890521103
- Speaker adaptation of context dependent deep neural networks
- H. Liao, "Speaker adaptation of context dependent deep neural networks," in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 7947-7951.
- (2013) Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference On. IEEE , pp. 7947-7951
- Liao, H.¹

19
- 84890542079
- Kl-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
- D. Yu, K. Yao, H. Su, G. Li, and F. Seide, "Kl-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition," in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 7893-7897.
- (2013) Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference On. IEEE , pp. 7893-7897
- Yu, D.¹ Yao, K.² Su, H.³ Li, G.⁴ Seide, F.⁵

20
- 79959849500
- Comparison of discriminative input and output transformations for speaker adaptation in the hybrid nn/hmm systems
- B. Li and K. C. Sim, "Comparison of discriminative input and output transformations for speaker adaptation in the hybrid nn/hmm systems," in Eleventh Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA, 2010.
- (2010) Eleventh Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA
- Li, B.¹ Sim, K.C.²

21
- 84874226579
- Adaptation of context-dependent deep neural networks for automatic speech recognition
- K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong, "Adaptation of context-dependent deep neural networks for automatic speech recognition," in 2012 IEEE Spoken Language Technology Workshop (SLT). IEEE, 2012.
- (2012) 2012 IEEE Spoken Language Technology Workshop (SLT). IEEE
- Yao, K.¹ Yu, D.² Seide, F.³ Su, H.⁴ Deng, L.⁵ Gong, Y.⁶

22
- 84858976070
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on. IEEE, 2011, pp. 24-29.
- (2011) Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop On. IEEE , pp. 24-29
- Seide, F.¹ Li, G.² Chen, X.³ Yu, D.⁴

23
- 84906241049
- Improved feature processing for deep neural networks
- S. P. Rath, D. Povey, K. Vesely, and J. Cernocky, "Improved feature processing for deep neural networks," in Fourteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA, 2013, pp. 109-113.
- (2013) Fourteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA , pp. 109-113
- Rath, S.P.¹ Povey, D.² Vesely, K.³ Cernocky, J.⁴

24
- 84946046160
- Regularizing dnn acoustic models with Gaussian stochastic neurons
- H. Zhang, Y. Miao, and F. Metze, "Regularizing dnn acoustic models with Gaussian stochastic neurons," in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 2015, pp. 4964-4968.
- (2015) Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE , pp. 4964-4968
- Zhang, H.¹ Miao, Y.² Metze, F.³

25
- 84893691530
- Speaker adaptation of neural network acoustic models using i-vectors
- G. Saon, H. Soltau, D. Nahamoo, and M. Picheny, "Speaker adaptation of neural network acoustic models using i-vectors," in Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on. IEEE, 2013, pp. 55-59.
- (2013) Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop On. IEEE , pp. 55-59
- Saon, G.¹ Soltau, H.² Nahamoo, D.³ Picheny, M.⁴

26
- 84905259138
- Improving dnn speaker independence with i-vector inputs
- A. Senior and I. Lopez-Moreno, "Improving dnn speaker independence with i-vector inputs," in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. IEEE, 2014, pp. 225-229.
- (2014) Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference On. IEEE , pp. 225-229
- Senior, A.¹ Lopez-Moreno, I.²

27
- 84910031119
- Towards speaker adaptive training of deep neural network acoustic models
- Y. Miao, H. Zhang, and F. Metze, "Towards speaker adaptive training of deep neural network acoustic models," in Fifteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA, 2014.
- (2014) Fifteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA
- Miao, Y.¹ Zhang, H.² Metze, F.³

28
- 84946685505
- Improvements to speaker adaptive training of deep neural networks
- Y. Miao, L. Jiang, H. Zhang, and F. Metze, "Improvements to speaker adaptive training of deep neural networks," in 2014 IEEE Spoken Language Technology Workshop (SLT). IEEE, 2014.
- (2014) 2014 IEEE Spoken Language Technology Workshop (SLT). IEEE
- Miao, Y.¹ Jiang, L.² Zhang, H.³ Metze, F.⁴

29
- 0041965934
- Learning precise timing with lstm recurrent networks
- F. A. Gers, N. N. Schraudolph, and J. Schmidhuber, "Learning precise timing with lstm recurrent networks," The Journal of Machine Learning Research, vol. 3, pp. 115-143, 2003.
- (2003) The Journal of Machine Learning Research , vol.3 , pp. 115-143
- Gers, F.A.¹ Schraudolph, N.N.² Schmidhuber, J.³

30
- 84858953642
- The kaldi speech recognition toolkit
- D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlíček, Y. Qian, P. Schwarz, J. Silovský, G. Stemmer, and K. Veselý, "The kaldi speech recognition toolkit," in Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on. IEEE, 2011, pp. 1-4.
- (2011) Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop On. IEEE , pp. 1-4
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlíček, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰ Silovský, J.¹¹ Stemmer, G.¹² Veselý, K.¹³

31
- 0032050110
- Maximum likelihood linear transformations for hmm-based speech recognition
- M. J. Gales, "Maximum likelihood linear transformations for hmm-based speech recognition," Computer speech & language, vol. 12, no. 2, pp. 75-98, 1998.
- (1998) Computer Speech & Language , vol.12 , Issue.2 , pp. 75-98
- Gales, M.J.¹

32
- 33745805403
- A fast learning algorithm for deep belief nets
- G. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural computation, vol. 18, no. 7, pp. 1527-1554, 2006.
- (2006) Neural Computation , vol.18 , Issue.7 , pp. 1527-1554
- Hinton, G.¹ Osindero, S.² Teh, Y.-W.³

33
- 84906283232
- Using conversational word bursts in spoken term detection
- J. Chiu and A. Rudnicky, "Using conversational word bursts in spoken term detection," in Fourteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA, 2013.
- (2013) Fourteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA
- Chiu, J.¹ Rudnicky, A.²

34
- 84910068915
- Combination of fst and cn search in spoken term detection
- J. Chiu, Y. Wang, J. Trmal, D. Povey, G. Chen, and A. Rudnicky, "Combination of fst and cn search in spoken term detection," in Fifteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA, 2014.
- (2014) Fifteenth Annual Conference of the International Speech Communication Association (INTERSPEECH). ISCA
- Chiu, J.¹ Wang, Y.² Trmal, J.³ Povey, D.⁴ Chen, G.⁵ Rudnicky, A.⁶

* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.