Volume 2016-May, 2016, Pages 5280-5284

Speaker-aware training of LSTM-RNNs for acoustic modelling

Author keywords

i-vector; LSTM RNNs; speaker adaptation; speaker aware training; speaking rate


EID: 84973380342     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2016.7472685     Document Type: Conference Paper
Times cited: 49

References (31)
  • 3
    • 0028392483
    • Learning long-term dependencies with gradient descent is difficult
    • Yoshua Bengio, Patrice Simard, and Paolo Frasconi, "Learning long-term dependencies with gradient descent is difficult," Neural Networks, IEEE Transactions on, vol. 5, no. 2, pp. 157-166, 1994.
    • (1994) Neural Networks, IEEE Transactions on , vol.5 , Issue.2 , pp. 157-166
    • Bengio, Y.1    Simard, P.2    Frasconi, P.3
  • 4
    • 0031573117
    • Long short-term memory
    • Sepp Hochreiter and Jürgen Schmidhuber, "Long short-term memory," Neural computation, vol. 9, no. 8, pp. 1735-1780, 1997.
    • (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
    • Hochreiter, S.1    Schmidhuber, J.2
  • 5
    • 84910046405
    • Long short-term memory recurrent neural network architectures for large scale acoustic modeling
    • Hasim Sak, Andrew Senior, and Françoise Beaufays, "Long short-term memory recurrent neural network architectures for large scale acoustic modeling," in INTERSPEECH, 2014.
    • (2014) INTERSPEECH
    • Sak, H.1    Senior, A.2    Beaufays, F.3
  • 6
    • 84946083498
    • Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition
    • Xiangang Li and Xihong Wu, "Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition," in ICASSP. IEEE, 2015, pp. 4520-4524.
    • (2015) ICASSP. IEEE , pp. 4520-4524
    • Li, X.1    Wu, X.2
  • 7
    • 84910027886
    • A comparative analytic study on the Gaussian mixture and context dependent deep neural network hidden Markov models
    • Yan Huang, Dong Yu, Chaojun Liu, and Yifan Gong, "A comparative analytic study on the Gaussian mixture and context dependent deep neural network hidden Markov models," in INTERSPEECH, 2014.
    • (2014) INTERSPEECH
    • Huang, Y.1    Yu, D.2    Liu, C.3    Gong, Y.4
  • 9
    • 84858976070
    • Feature engineering in context-dependent deep neural networks for conversational speech transcription
    • Frank Seide, Gang Li, Xie Chen, and Dong Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in ASRU. IEEE, 2011, pp. 24-29.
    • (2011) ASRU. IEEE , pp. 24-29
    • Seide, F.1    Li, G.2    Chen, X.3    Yu, D.4
  • 10
    • 84874226579
    • Adaptation of context-dependent deep neural networks for automatic speech recognition
    • Kaisheng Yao, Dong Yu, Frank Seide, Hang Su, Li Deng, and Yifan Gong, "Adaptation of context-dependent deep neural networks for automatic speech recognition," in SLT, 2012, pp. 366-369.
    • (2012) SLT , pp. 366-369
    • Yao, K.1    Yu, D.2    Seide, F.3    Su, H.4    Deng, L.5    Gong, Y.6
  • 11
    • 84890542079
    • KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
    • Dong Yu, Kaisheng Yao, Hang Su, Gang Li, and Frank Seide, "KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition," in ICASSP. IEEE, 2013, pp. 7893-7897.
    • (2013) ICASSP. IEEE , pp. 7893-7897
    • Yu, D.1    Yao, K.2    Su, H.3    Li, G.4    Seide, F.5
  • 12
    • 84890452886
    • Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code
    • Ossama Abdel-Hamid and Hui Jiang, "Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code," in ICASSP. IEEE, 2013, pp. 7942-7946.
    • (2013) ICASSP. IEEE , pp. 7942-7946
    • Abdel-Hamid, O.1    Jiang, H.2
  • 13
    • 84893691530
    • Speaker adaptation of neural network acoustic models using i-vectors
    • George Saon, Hagen Soltau, David Nahamoo, and Michael Picheny, "Speaker adaptation of neural network acoustic models using i-vectors," in ASRU. IEEE, 2013, pp. 55-59.
    • (2013) ASRU. IEEE , pp. 55-59
    • Saon, G.1    Soltau, H.2    Nahamoo, D.3    Picheny, M.4
  • 14
    • 84946692024
    • Vocal tract length normalisation approaches to DNN-based children's and adults' speech recognition
    • Romain Serizel and Diego Giuliani, "Vocal tract length normalisation approaches to DNN-based children's and adults' speech recognition," in Spoken Language Technology Workshop (SLT), 2014 IEEE. IEEE, 2014, pp. 135-140.
    • (2014) Spoken Language Technology Workshop (SLT), 2014 IEEE. IEEE , pp. 135-140
    • Serizel, R.1    Giuliani, D.2
  • 15
    • 84938725974
    • On speaker adaptation of long short-term memory recurrent neural networks
    • Yajie Miao and Florian Metze, "On speaker adaptation of long short-term memory recurrent neural networks," in INTERSPEECH, 2015.
    • (2015) INTERSPEECH
    • Miao, Y.1    Metze, F.2
  • 17
    • 84946035423
    • An investigation of augmenting speaker representations to improve speaker normalisation for DNN-based speech recognition
    • Hengguan Huang and Khe Chai Sim, "An investigation of augmenting speaker representations to improve speaker normalisation for DNN-based speech recognition," in ICASSP. IEEE, 2015, pp. 4610-4613.
    • (2015) ICASSP. IEEE , pp. 4610-4613
    • Huang, H.1    Sim, K.C.2
  • 18
    • 84959155988
    • Recurrent neural network language model adaptation for multi-genre broadcast speech recognition
    • Xie Chen, Tian Tan, Xunying Liu, Pierre Lanchantin, Moquan Wan, Mark Gales, and Phil Woodland, "Recurrent neural network language model adaptation for multi-genre broadcast speech recognition," in INTERSPEECH, 2015.
    • (2015) INTERSPEECH
    • Chen, X.1    Tan, T.2    Liu, X.3    Lanchantin, P.4    Wan, M.5    Gales, M.6    Woodland, P.7
  • 19
    • 84874235486
    • Context dependent recurrent neural network language model
    • Tomas Mikolov and Geoffrey Zweig, "Context dependent recurrent neural network language model," in SLT, 2012, pp. 234-239.
    • (2012) SLT , pp. 234-239
    • Mikolov, T.1    Zweig, G.2
  • 20
    • 84910031119
    • Towards speaker adaptive training of deep neural network acoustic models
    • Yajie Miao, Hao Zhang, and Florian Metze, "Towards speaker adaptive training of deep neural network acoustic models," in INTERSPEECH, 2014.
    • (2014) INTERSPEECH
    • Miao, Y.1    Zhang, H.2    Metze, F.3
  • 21
    • 84905252132
    • A novel scheme for speaker recognition using a phonetically-aware deep neural network
    • Yun Lei, Nicolas Scheffer, Luciana Ferrer, and Moray McLaren, "A novel scheme for speaker recognition using a phonetically-aware deep neural network," in ICASSP. IEEE, 2014, pp. 1695-1699.
    • (2014) ICASSP. IEEE , pp. 1695-1699
    • Lei, Y.1    Scheffer, N.2    Ferrer, L.3    McLaren, M.4
  • 22
    • 84905283791
    • Joint acoustic modeling of triphones and trigraphemes by multi-task learning deep neural networks for low-resource speech recognition
    • Dongpeng Chen, Brian Mak, Cheung-Chi Leung, and Sunil Sivadas, "Joint acoustic modeling of triphones and trigraphemes by multi-task learning deep neural networks for low-resource speech recognition," in ICASSP. IEEE, 2014, pp. 5592-5596.
    • (2014) ICASSP. IEEE , pp. 5592-5596
    • Chen, D.1    Mak, B.2    Leung, C.3    Sivadas, S.4
  • 23
    • 84959157262
    • Multi-task learning for text-dependent speaker verification
    • Nanxin Chen, Yanmin Qian, and Kai Yu, "Multi-task learning for text-dependent speaker verification," in INTERSPEECH, 2015.
    • (2015) INTERSPEECH
    • Chen, N.1    Qian, Y.2    Yu, K.3
  • 24
    • 0028996973
    • On the effects of speech rate in large vocabulary speech recognition systems
    • Matthew Siegler, Richard M Stern, et al., "On the effects of speech rate in large vocabulary speech recognition systems," in ICASSP. IEEE, 1995, vol. 1, pp. 612-615.
    • (1995) ICASSP. IEEE , vol.1 , pp. 612-615
    • Siegler, M.1    Stern, R.M.2
  • 25
    • 85135173867
    • Speech recognition using on-line estimation of speaking rate
    • Nelson Morgan, Eric Fosler-Lussier, and Nikki Mirghafori, "Speech recognition using on-line estimation of speaking rate," in Eurospeech, 1997, vol. 97, pp. 2079-2082.
    • (1997) Eurospeech , vol.97 , pp. 2079-2082
    • Morgan, N.1    Fosler-Lussier, E.2    Mirghafori, N.3
  • 26
    • 78049388975
    • Speaking rate adaptation using continuous frame rate normalization
    • Stephen M Chu and Daniel Povey, "Speaking rate adaptation using continuous frame rate normalization," in ICASSP. IEEE, 2010, pp. 4306-4309.
    • (2010) ICASSP. IEEE , pp. 4306-4309
    • Chu, S.M.1    Povey, D.2
  • 27
    • 84959148444
    • Learning speech rate in speech recognition
    • Xiangyu Zeng, Shi Yin, and Dong Wang, "Learning speech rate in speech recognition," in INTERSPEECH, 2015, pp. 528-532.
    • (2015) INTERSPEECH , pp. 528-532
    • Zeng, X.1    Yin, S.2    Wang, D.3
  • 29
    • 84893704659
    • Hybrid acoustic models for distant and multichannel large vocabulary speech recognition
    • Pawel Swietojanski, Arnab Ghoshal, and Steve Renals, "Hybrid acoustic models for distant and multichannel large vocabulary speech recognition," in ASRU. IEEE, 2013, pp. 285-290.
    • (2013) ASRU. IEEE , pp. 285-290
    • Swietojanski, P.1    Ghoshal, A.2    Renals, S.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.