SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume 2016-May, 2016, Pages 5280-5284

Speaker-aware training of LSTM-RNNs for acoustic modelling

(8) Tan, Tian a; Qian, Yanmin a,b; Yu, Dong c; Kundu, Souvik d; Lu, Liang e; Sim, Khe Chai d; Xiao, Xiong f; Zhang, Yu g

a SHANGHAI JIAO TONG UNIVERSITY (China)

b UNIVERSITY OF CAMBRIDGE (United Kingdom)

c MICROSOFT RESEARCH (United States)

d NATIONAL UNIVERSITY OF SINGAPORE (Singapore)

e UNIVERSITY OF EDINBURGH (United Kingdom)

f NANYANG TECHNOLOGICAL UNIVERSITY (Singapore)

g MASSACHUSETTS INSTITUTE OF TECHNOLOGY (United States)

Author keywords

i-vector; LSTM-RNNs; speaker adaptation; speaker-aware training; speaking rate

EID: 84973380342
PISSN: 15206149
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2016.7472685
Document Type: Conference Paper

Times cited: 49

References (31)

1
- 0009296228
- Connectionist speech recognition: A hybrid approach
- Herve A Bourlard and Nelson Morgan, Connectionist speech recognition: a hybrid approach, vol. 247, Springer Science & Business Media, 2012.
- (2012) Springer Science & Business Media , vol.247
- Bourlard, H.A.¹ Morgan, N.²

2
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- Geoffrey Hinton, Li Deng, Dong Yu, George E Dahl, Abdelrahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara N Sainath, et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82-97, 2012.
- (2012) Signal Processing Magazine, IEEE , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰

3
- 0028392483
- Learning long-term dependencies with gradient descent is difficult
- Yoshua Bengio, Patrice Simard, and Paolo Frasconi, "Learning long-term dependencies with gradient descent is difficult," Neural Networks, IEEE Transactions on, vol. 5, no. 2, pp. 157-166, 1994.
- (1994) Neural Networks, IEEE Transactions on , vol.5 , Issue.2 , pp. 157-166
- Bengio, Y.¹ Simard, P.² Frasconi, P.³

4
- 0031573117
- Long short-term memory
- Sepp Hochreiter and Jürgen Schmidhuber, "Long short-term memory," Neural computation, vol. 9, no. 8, pp. 1735-1780, 1997.
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

5
- 84910046405
- Long short-term memory recurrent neural network architectures for large scale acoustic modeling
- Hasim Sak, Andrew Senior, and Françoise Beaufays, "Long short-term memory recurrent neural network architectures for large scale acoustic modeling," in INTERSPEECH, 2014.
- (2014) INTERSPEECH
- Sak, H.¹ Senior, A.² Beaufays, F.³

6
- 84946083498
- Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition
- Xiangang Li and Xihong Wu, "Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition," in ICASSP. IEEE, 2015, pp. 4520-4524.
- (2015) ICASSP. IEEE , pp. 4520-4524
- Li, X.¹ Wu, X.²

7
- 84910027886
- A comparative analytic study on the Gaussian mixture and context dependent deep neural network hidden Markov models
- Yan Huang, Dong Yu, Chaojun Liu, and Yifan Gong, "A comparative analytic study on the Gaussian mixture and context dependent deep neural network hidden Markov models," in INTERSPEECH, 2014.
- (2014) INTERSPEECH
- Huang, Y.¹ Yu, D.² Liu, C.³ Gong, Y.⁴

8
- 84923929378
- Automatic speech recognition: A deep learning approach
- Dong Yu and Li Deng, Automatic speech recognition: A deep learning approach, Springer.
- Springer
- Yu, D.¹ Deng, L.²

9
- 84858976070
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- Frank Seide, Gang Li, Xie Chen, and Dong Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in ASRU. IEEE, 2011, pp. 24-29.
- (2011) ASRU. IEEE , pp. 24-29
- Seide, F.¹ Li, G.² Chen, X.³ Yu, D.⁴

10
- 84874226579
- Adaptation of context-dependent deep neural networks for automatic speech recognition
- Kaisheng Yao, Dong Yu, Frank Seide, Hang Su, Li Deng, and Yifan Gong, "Adaptation of context-dependent deep neural networks for automatic speech recognition," in SLT, 2012, pp. 366-369.
- (2012) SLT , pp. 366-369
- Yao, K.¹ Yu, D.² Seide, F.³ Su, H.⁴ Deng, L.⁵ Gong, Y.⁶

11
- 84890542079
- KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
- Dong Yu, Kaisheng Yao, Hang Su, Gang Li, and Frank Seide, "KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition," in ICASSP. IEEE, 2013, pp. 7893-7897.
- (2013) ICASSP. IEEE , pp. 7893-7897
- Yu, D.¹ Yao, K.² Su, H.³ Li, G.⁴ Seide, F.⁵

12
- 84890452886
- Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code
- Ossama Abdel-Hamid and Hui Jiang, "Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code," in ICASSP. IEEE, 2013, pp. 7942-7946.
- (2013) ICASSP. IEEE , pp. 7942-7946
- Abdel-Hamid, O.¹ Jiang, H.²

13
- 84893691530
- Speaker adaptation of neural network acoustic models using i-vectors
- George Saon, Hagen Soltau, David Nahamoo, and Michael Picheny, "Speaker adaptation of neural network acoustic models using i-vectors," in ASRU. IEEE, 2013, pp. 55-59.
- (2013) ASRU. IEEE , pp. 55-59
- Saon, G.¹ Soltau, H.² Nahamoo, D.³ Picheny, M.⁴

14
- 84946692024
- Vocal tract length normalisation approaches to DNN-based children's and adults' speech recognition
- Romain Serizel and Diego Giuliani, "Vocal tract length normalisation approaches to DNN-based children's and adults' speech recognition," in Spoken Language Technology Workshop (SLT), 2014 IEEE. IEEE, 2014, pp. 135-140.
- (2014) Spoken Language Technology Workshop (SLT), 2014 IEEE. IEEE , pp. 135-140
- Serizel, R.¹ Giuliani, D.²

15
- 84938725974
- On speaker adaptation of long short-term memory recurrent neural networks
- Yajie Miao and Florian Metze, "On speaker adaptation of long short-term memory recurrent neural networks," in INTERSPEECH, 2015.
- (2015) INTERSPEECH
- Miao, Y.¹ Metze, F.²

16
- 79951609039
- Front-end factor analysis for speaker verification
- Najim Dehak, Patrick Kenny, Réda Dehak, Pierre Dumouchel, and Pierre Ouellet, "Front-end factor analysis for speaker verification," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 19, no. 4, pp. 788-798, 2011.
- (2011) Audio, Speech, and Language Processing, IEEE Transactions on , vol.19 , Issue.4 , pp. 788-798
- Dehak, N.¹ Kenny, P.² Dehak, R.³ Dumouchel, P.⁴ Ouellet, P.⁵

17
- 84946035423
- An investigation of augmenting speaker representations to improve speaker normalisation for DNN-based speech recognition
- Hengguan Huang and Khe Chai Sim, "An investigation of augmenting speaker representations to improve speaker normalisation for DNN-based speech recognition," in ICASSP. IEEE, 2015, pp. 4610-4613.
- (2015) ICASSP. IEEE , pp. 4610-4613
- Huang, H.¹ Sim, K.C.²

18
- 84959155988
- Recurrent neural network language model adaptation for multi-genre broadcast speech recognition
- Xie Chen, Tian Tan, Xunying Liu, Pierre Lanchantin, Moquan Wan, Mark Gales, and Phil Woodland, "Recurrent neural network language model adaptation for multi-genre broadcast speech recognition," in INTERSPEECH, 2015.
- (2015) INTERSPEECH
- Chen, X.¹ Tan, T.² Liu, X.³ Lanchantin, P.⁴ Wan, M.⁵ Gales, M.⁶ Woodland, P.⁷

19
- 84874235486
- Context dependent recurrent neural network language model
- Tomas Mikolov and Geoffrey Zweig, "Context dependent recurrent neural network language model," in SLT, 2012, pp. 234-239.
- (2012) SLT , pp. 234-239
- Mikolov, T.¹ Zweig, G.²

20
- 84910031119
- Towards speaker adaptive training of deep neural network acoustic models
- Yajie Miao, Hao Zhang, and Florian Metze, "Towards speaker adaptive training of deep neural network acoustic models," in INTERSPEECH, 2014.
- (2014) INTERSPEECH
- Miao, Y.¹ Zhang, H.² Metze, F.³

21
- 84905252132
- A novel scheme for speaker recognition using a phonetically-aware deep neural network
- Yun Lei, Nicolas Scheffer, Luciana Ferrer, and Moray McLaren, "A novel scheme for speaker recognition using a phonetically-aware deep neural network," in ICASSP. IEEE, 2014, pp. 1695-1699.
- (2014) ICASSP. IEEE , pp. 1695-1699
- Lei, Y.¹ Scheffer, N.² Ferrer, L.³ McLaren, M.⁴

22
- 84905283791
- Joint acoustic modeling of triphones and trigraphemes by multi-task learning deep neural networks for low-resource speech recognition
- Dongpeng Chen, Brian Mak, Cheung-Chi Leung, and Sunil Sivadas, "Joint acoustic modeling of triphones and trigraphemes by multi-task learning deep neural networks for low-resource speech recognition," in ICASSP. IEEE, 2014, pp. 5592-5596.
- (2014) ICASSP. IEEE , pp. 5592-5596
- Chen, D.¹ Mak, B.² Leung, C.³ Sivadas, S.⁴

23
- 84959157262
- Multi-task learning for text-dependent speaker verification
- Nanxin Chen, Yanmin Qian, and Kai Yu, "Multi-task learning for text-dependent speaker verification," in INTERSPEECH, 2015.
- (2015) INTERSPEECH
- Chen, N.¹ Qian, Y.² Yu, K.³

24
- 0028996973
- On the effects of speech rate in large vocabulary speech recognition systems
- Matthew Siegler, Richard M Stern, et al., "On the effects of speech rate in large vocabulary speech recognition systems," in ICASSP. IEEE, 1995, vol. 1, pp. 612-615.
- (1995) ICASSP. IEEE , vol.1 , pp. 612-615
- Siegler, M.¹ Stern, R.M.²

25
- 85135173867
- Speech recognition using on-line estimation of speaking rate
- Nelson Morgan, Eric Fosler-Lussier, and Nikki Mirghafori, "Speech recognition using on-line estimation of speaking rate," in Eurospeech, 1997, vol. 97, pp. 2079-2082.
- (1997) Eurospeech , vol.97 , pp. 2079-2082
- Morgan, N.¹ Fosler-Lussier, E.² Mirghafori, N.³

26
- 78049388975
- Speaking rate adaptation using continuous frame rate normalization
- Stephen M Chu and Daniel Povey, "Speaking rate adaptation using continuous frame rate normalization," in ICASSP. IEEE, 2010, pp. 4306-4309.
- (2010) ICASSP. IEEE , pp. 4306-4309
- Chu, S.M.¹ Povey, D.²

27
- 84959148444
- Learning speech rate in speech recognition
- Xiangyu Zeng, Shi Yin, and Dong Wang, "Learning speech rate in speech recognition," in INTERSPEECH, 2015, pp. 528-532.
- (2015) INTERSPEECH , pp. 528-532
- Zeng, X.¹ Yin, S.² Wang, D.³

28
- 85008520364
- Transcribing meetings with the AMIDA systems
- Thomas Hain, Lukáš Burget, John Dines, Philip N Garner, Frantisek Grézl, Asmaa El Hannani, Marijn Huijbregts, Martin Karafiat, Mike Lincoln, and Vincent Wan, "Transcribing meetings with the AMIDA systems," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 2, pp. 486-498, 2012.
- (2012) Audio, Speech, and Language Processing, IEEE Transactions on , vol.20 , Issue.2 , pp. 486-498
- Hain, T.¹ Burget, L.² Dines, J.³ Garner, P.N.⁴ Grézl, F.⁵ El Hannani, A.⁶ Huijbregts, M.⁷ Karafiat, M.⁸ Lincoln, M.⁹ Wan, V.¹⁰

29
- 84893704659
- Hybrid acoustic models for distant and multichannel large vocabulary speech recognition
- Pawel Swietojanski, Arnab Ghoshal, and Steve Renals, "Hybrid acoustic models for distant and multichannel large vocabulary speech recognition," in ASRU. IEEE, 2013, pp. 285-290.
- (2013) ASRU. IEEE , pp. 285-290
- Swietojanski, P.¹ Ghoshal, A.² Renals, S.³

30
- 84893696682
- The Kaldi speech recognition toolkit
- Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukáš Burget, Ondrej Glembek, Nagendra Goel, Mirko Hannemann, Petr Motlíček, Yanmin Qian, Petr Schwarz, et al., "The Kaldi speech recognition toolkit," 2011.
- (2011) The Kaldi Speech Recognition Toolkit
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlíček, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰

31
- 84928146953
- An introduction to computational networks and the computational network toolkit
- Dong Yu, Adam Eversole, Mike Seltzer, Kaisheng Yao, Zhiheng Huang, Brian Guenter, Oleksii Kuchaiev, Yu Zhang, Frank Seide, Huaming Wang, et al., "An introduction to computational networks and the computational network toolkit," Tech. Rep., Microsoft Research, 2014.
- (2014) Tech. Rep., Microsoft Research
- Yu, D.¹ Eversole, A.² Seltzer, M.³ Yao, K.⁴ Huang, Z.⁵ Guenter, B.⁶ Kuchaiev, O.⁷ Zhang, Y.⁸ Seide, F.⁹ Wang, H.¹⁰

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.