SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume 2015-January, 2015, Pages 3630-3634

FMLLR based feature-space speaker adaptation of DNN acoustic models

(6) Parthasarathi, Sree Hari Krishnan (a); Hoffmeister, Bjorn (a); Matsoukas, Spyros (a); Mandal, Arindam (a); Strom, Nikko (a); Garimella, Sri (a)

(a) AMAZON (United States)

Author keywords

DNN acoustic models; Feature space speaker adaptation; Speech recognition

Indexed keywords

MATHEMATICAL TRANSFORMATIONS; MAXIMUM LIKELIHOOD; MAXIMUM LIKELIHOOD ESTIMATION; SPEECH COMMUNICATION;

ACOUSTIC MODEL; EARLY FUSION; FEATURE SPACE; MAXIMUM LIKELIHOOD LINEAR REGRESSION; SIDE INFORMATION; SPEAKER ADAPTATION; TARGET MODEL; UNSUPERVISED ADAPTATION;

SPEECH RECOGNITION;

EID: 84959134917 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (55)

References (21)

1
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- G. Hinton, L. Deng, D. Yu, G. Dahl, A. R. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Processing Magazine, vol. 29, pp. 82-97, 2012.
- (2012) IEEE Signal Processing Magazine , vol.29 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.R.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰ Kingsbury, B.¹¹

2
- 0003573244
- Connectionist speech recognition: A hybrid approach
- H. Bourlard and N. Morgan, Connectionist Speech Recognition: A Hybrid Approach. Kluwer Academic Publishers, 1994.
- (1994) Connectionist Speech Recognition: A Hybrid Approach
- Bourlard, H.¹ Morgan, N.²

3
- 0033709098
- Tandem connectionist feature extraction for conventional HMM systems
- H. Hermansky, D. Ellis, and S. Sharma, "Tandem connectionist feature extraction for conventional HMM systems," in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 2000.
- (2000) Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing
- Hermansky, H.¹ Ellis, D.² Sharma, S.³

4
- 84858976070
- Feature engineering in context dependent deep neural networks for conversational speech transcription
- F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context dependent deep neural networks for conversational speech transcription," in Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2011.
- (2011) Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
- Seide, F.¹ Li, G.² Chen, X.³ Yu, D.⁴

5
- 84893691530
- Speaker adaptation of neural network acoustic models using i-vectors
- G. Saon, H. Soltau, D. Nahamoo, and M. Picheny, "Speaker adaptation of neural network acoustic models using i-vectors," in Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2013.
- (2013) Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
- Saon, G.¹ Soltau, H.² Nahamoo, D.³ Picheny, M.⁴

6
- 84886829539
- Optimization techniques to improve training speed of deep neural networks for large speech tasks
- T. Sainath, B. Kingsbury, H. Soltau, and B. Ramabhadran, "Optimization techniques to improve training speed of deep neural networks for large speech tasks," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, pp. 2267-2276, 2013.
- (2013) IEEE Transactions on Audio, Speech, and Language Processing , vol.21 , pp. 2267-2276
- Sainath, T.¹ Kingsbury, B.² Soltau, H.³ Ramabhadran, B.⁴

7
- 84946692024
- Vocal tract length normalisation approaches to DNN-based children's and adults' speech recognition
- R. Serizel and D. Giuliani, "Vocal tract length normalisation approaches to DNN-based children's and adults' speech recognition," in Proceedings of IEEE Spoken Language Technology (SLT) Workshop, 2014.
- (2014) Proceedings of IEEE Spoken Language Technology (SLT) Workshop
- Serizel, R.¹ Giuliani, D.²

8
- 84921731072
- Fast adaptation of deep neural network based on discriminant codes for speech recognition
- S. Xue, O. Abdel-Hamid, H. Jiang, L. Dai, and Q. Liu, "Fast adaptation of deep neural network based on discriminant codes for speech recognition," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, pp. 1713-1725, 2014.
- (2014) IEEE/ACM Transactions on Audio, Speech, and Language Processing , vol.22 , pp. 1713-1725
- Xue, S.¹ Abdel-Hamid, O.² Jiang, H.³ Dai, L.⁴ Liu, Q.⁵

9
- 84874226579
- Adaptation of context-dependent deep neural networks for automatic speech recognition
- K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong, "Adaptation of context-dependent deep neural networks for automatic speech recognition," in Proceedings of IEEE Spoken Language Technology (SLT) Workshop, 2012.
- (2012) Proceedings of IEEE Spoken Language Technology (SLT) Workshop
- Yao, K.¹ Yu, D.² Seide, F.³ Su, H.⁴ Deng, L.⁵ Gong, Y.⁶

10
- 84890521103
- Speaker adaptation of context dependent deep neural networks
- H. Liao, "Speaker adaptation of context dependent deep neural networks," in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 2013.
- (2013) Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing
- Liao, H.¹

11
- 84983119674
- Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models
- P. Swietojanski and S. Renals, "Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models," in Proceedings of IEEE Spoken Language Technology (SLT) Workshop, 2014.
- (2014) Proceedings of IEEE Spoken Language Technology (SLT) Workshop
- Swietojanski, P.¹ Renals, S.²

12
- 84959142471
- Robust i-vector based adaptation of DNN acoustic model for speech recognition
- S. Garimella, A. Mandal, N. Strom, B. Hoffmeister, S. Matsoukas, and S. H. K. Parthasarathi, "Robust i-vector based adaptation of DNN acoustic model for speech recognition," in Proceedings of Interspeech, 2015.
- (2015) Proceedings of Interspeech
- Garimella, S.¹ Mandal, A.² Strom, N.³ Hoffmeister, B.⁴ Matsoukas, S.⁵ Parthasarathi, S.H.K.⁶

13
- 84905259138
- Improving DNN speaker independence with i-vector inputs
- A. Senior and I. Lopez-Moreno, "Improving DNN speaker independence with i-vector inputs," in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 2014.
- (2014) Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing
- Senior, A.¹ Lopez-Moreno, I.²

14
- 84937508363
- How transferable are features in deep neural networks?
- J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, "How transferable are features in deep neural networks?" in Proceedings of Advances in Neural Information Processing Systems 27, 2014.
- (2014) Proceedings of Advances in Neural Information Processing Systems , vol.27
- Yosinski, J.¹ Clune, J.² Bengio, Y.³ Lipson, H.⁴

15
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden markov models
- C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden markov models," Computer Speech and Language, no. 9, 1995.
- (1995) Computer Speech and Language , Issue.9
- Leggetter, C.J.¹ Woodland, P.C.²

16
- 0030263447
- Mean and variance adaptation within the MLLR framework
- M. J. F. Gales and P. C. Woodland, "Mean and variance adaptation within the MLLR framework," Computer Speech and Language, vol. 10, pp. 249-264, 1996.
- (1996) Computer Speech and Language , vol.10 , pp. 249-264
- Gales, M.J.F.¹ Woodland, P.C.²

17
- 0031647824
- A frequency warping approach to speaker normalization
- L. Lee and R. Rose, "A frequency warping approach to speaker normalization," IEEE Transactions on Speech and Audio Processing, vol. 6, pp. 49-60, 1998.
- (1998) IEEE Transactions on Speech and Audio Processing , vol.6 , pp. 49-60
- Lee, L.¹ Rose, R.²

18
- 0029764708
- Speaker normalization on conversational telephone speech
- S. Wegmann, D. McAllaster, J. Orloff, and B. Peskin, "Speaker normalization on conversational telephone speech," in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 1996.
- (1996) Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing
- Wegmann, S.¹ McAllaster, D.² Orloff, J.³ Peskin, B.⁴

19
- 84890465724
- The blame game in meeting room ASR: An analysis of feature versus model errors in noisy and mismatched conditions
- S. H. K. Parthasarathi, S. Y. Chang, J. Cohen, N. Morgan, and S. Wegmann, "The blame game in meeting room ASR: An analysis of feature versus model errors in noisy and mismatched conditions," in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 2013.
- (2013) Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing
- Parthasarathi, S.H.K.¹ Chang, S.Y.² Cohen, J.³ Morgan, N.⁴ Wegmann, S.⁵

20
- 33646759965
- Adaptive training using simple target models
- G. Stemmer, F. Brugnara, and D. Giuliani, "Adaptive training using simple target models," in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
- (2005) Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing
- Stemmer, G.¹ Brugnara, F.² Giuliani, D.³

21
- 0032021555
- On combining classifiers
- J. Kittler, M. Hatef, R. P. W. Duin, and J. Matas, "On combining classifiers," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, pp. 226-239, 1998.
- (1998) IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.20 , pp. 226-239
- Kittler, J.¹ Hatef, M.² Duin, R.P.W.³ Matas, J.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.