Volume 2015-August, 2015, Pages 4310-4314

Investigating online low-footprint speaker adaptation using generalized linear regression and click-through data

Author keywords

automatic speech recognition; deep neural network; low footprint; speaker adaptation

Indexed keywords

AUDIO SIGNAL PROCESSING; CHEMICAL ACTIVATION; DIGITAL STORAGE; LINEAR TRANSFORMATIONS; MATHEMATICAL TRANSFORMATIONS; SINGULAR VALUE DECOMPOSITION; SPEECH COMMUNICATION; SPEECH RECOGNITION

EID: 84946061232     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2015.7178784     Document Type: Conference Paper
Times cited: 32

References (27)
  • 1
    • 84055222005
    • Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
    • G. E. Dahl, D. Yu, L. Deng, and A. Acero, Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition, IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 1, pp. 30-42, 2012
    • (2012) IEEE Trans. Audio, Speech, Lang. Process , vol.20 , Issue.1 , pp. 30-42
    • Dahl, G.E.1    Yu, D.2    Deng, L.3    Acero, A.4
  • 2
    • 84878539964
    • Application of pretrained deep neural networks to large vocabulary speech recognition
    • N. Jaitly, P. Nguyen, A. Senior, and V. Vanhoucke, Application of pretrained deep neural networks to large vocabulary speech recognition, in Proc. Interspeech, 2012, pp. 2578-2581
    • (2012) Proc. Interspeech , pp. 2578-2581
    • Jaitly, N.1    Nguyen, P.2    Senior, A.3    Vanhoucke, V.4
  • 3
    • 84890491198
    • Recent advances in deep learning for speech research at Microsoft
    • L. Deng, J. Li, J.-T. Huang, et al., Recent advances in deep learning for speech research at Microsoft, in Proc. ICASSP, 2013, pp. 8604-8608
    • (2013) Proc. ICASSP , pp. 8604-8608
    • Deng, L.1    Li, J.2    Huang, J.-T.3
  • 4
    • 84858972572
    • Making deep belief networks effective for large vocabulary continuous speech recognition
    • T. N. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak, and A. Mohamed, Making deep belief networks effective for large vocabulary continuous speech recognition, in Proc. ASRU, 2011, pp. 30-35
    • (2011) Proc. ASRU , pp. 30-35
    • Sainath, T.N.1    Kingsbury, B.2    Ramabhadran, B.3    Fousek, P.4    Novak, P.5    Mohamed, A.6
  • 5
    • 84937880519
    • Connectionist speaker normalization and adaptation
    • V. Abrash, H. Franco, A. Sankar, and M. Cohen, Connectionist speaker normalization and adaptation, in Proc. Eurospeech, 1995, pp. 2183-2186
    • (1995) Proc. Eurospeech , pp. 2183-2186
    • Abrash, V.1    Franco, H.2    Sankar, A.3    Cohen, M.4
  • 7
    • 79959849500
    • Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems
    • B. Li and K. C. Sim, Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems, in Proc. Interspeech, 2010, pp. 526-529
    • (2010) Proc. Interspeech , pp. 526-529
    • Li, B.1    Sim, K.C.2
  • 8
    • 84865740155
    • Improving LVCSR system combination using neural network language model cross adaptation
    • X. Liu, M. J. F. Gales, and P. C. Woodland, Improving LVCSR system combination using neural network language model cross adaptation, in Proc. Interspeech, 2011, pp. 2857-2860
    • (2011) Proc. Interspeech , pp. 2857-2860
    • Liu, X.1    Gales, M.J.F.2    Woodland, P.C.3
  • 9
    • 84858976070
    • Feature engineering in context-dependent deep neural networks for conversational speech transcription
    • F. Seide, G. Li, X. Chen, and D. Yu, Feature engineering in context-dependent deep neural networks for conversational speech transcription, in Proc. ASRU, 2011, pp. 24-29
    • (2011) Proc. ASRU , pp. 24-29
    • Seide, F.1    Li, G.2    Chen, X.3    Yu, D.4
  • 10
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • M. J. F. Gales, Maximum likelihood linear transformations for HMM-based speech recognition, Comput. Speech Lang., vol. 12, no. 2, pp. 75-98, Jan. 1998
    • (1998) Comput. Speech Lang , vol.12 , Issue.2 , pp. 75-98
    • Gales, M.J.F.1
  • 11
    • 34548012893
    • Linear hidden transformations for adaptation of hybrid ANN/HMM models
    • R. Gemello, F. Mana, S. Scanzio, P. Laface, and R. De Mori, Linear hidden transformations for adaptation of hybrid ANN/HMM models, Speech Commun., vol. 49, no. 10, pp. 827-835, 2007
    • (2007) Speech Commun , vol.49 , Issue.10 , pp. 827-835
    • Gemello, R.1    Mana, F.2    Scanzio, S.3    Laface, P.4    De Mori, R.5
  • 12
    • 84874226579
    • Adaptation of context-dependent deep neural networks for automatic speech recognition
    • K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong, Adaptation of context-dependent deep neural networks for automatic speech recognition, in Proc. SLT, 2012, pp. 366-369
    • (2012) Proc. SLT , pp. 366-369
    • Yao, K.1    Yu, D.2    Seide, F.3    Su, H.4    Deng, L.5    Gong, Y.6
  • 13
    • 84905262902
    • Factorized adaptation for deep neural network
    • J. Li, J.-T. Huang, and Y. Gong, Factorized adaptation for deep neural network, in Proc. ICASSP, 2014, pp. 5574-5578
    • (2014) Proc. ICASSP , pp. 5574-5578
    • Li, J.1    Huang, J.-T.2    Gong, Y.3
  • 14
    • 84905229915
    • Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network
    • J. Xue, J. Li, D. Yu, M. Seltzer, and Y. Gong, Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network, in Proc. ICASSP, 2014, pp. 6409-6413
    • (2014) Proc. ICASSP , pp. 6409-6413
    • Xue, J.1    Li, J.2    Yu, D.3    Seltzer, M.4    Gong, Y.5
  • 15
    • 84878606732
    • Hermitian based hidden activation functions for adaptation of hybrid HMM/ANN models
    • S. M. Siniscalchi, J. Li, and C.-H. Lee, Hermitian based hidden activation functions for adaptation of hybrid HMM/ANN models, in Proc. INTERSPEECH, 2012, pp. 2590-2593
    • (2012) Proc. INTERSPEECH , pp. 2590-2593
    • Siniscalchi, S.M.1    Li, J.2    Lee, C.-H.3
  • 16
    • 84881054791
    • Hermitian polynomial for speaker adaptation of connectionist speech recognition systems
    • S. M. Siniscalchi, J. Li, and C.-H. Lee, Hermitian polynomial for speaker adaptation of connectionist speech recognition systems, IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 10, pp. 2152-2161, 2013
    • (2013) IEEE Trans. Audio, Speech, Lang. Process , vol.21 , Issue.10 , pp. 2152-2161
    • Siniscalchi, S.M.1    Li, J.2    Lee, C.-H.3
  • 17
    • 84983119674
    • Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models
    • P. Swietojanski and S. Renals, Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models, in Proc. SLT, 2014, pp. 171-176
    • (2014) Proc. SLT , pp. 171-176
    • Swietojanski, P.1    Renals, S.2
  • 18
    • 84906227589
    • Restructuring of deep neural network acoustic models with singular value decomposition
    • J. Xue, J. Li, and Y. Gong, Restructuring of deep neural network acoustic models with singular value decomposition, in Proc. Interspeech, 2013, pp. 2365-2369
    • (2013) Proc. Interspeech , pp. 2365-2369
    • Xue, J.1    Li, J.2    Gong, Y.3
  • 19
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • C. J. Leggetter and P. C. Woodland, Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models, Comput. Speech Lang., vol. 9, pp. 171-186, 1995
    • (1995) Comput. Speech Lang , vol.9 , pp. 171-186
    • Leggetter, C.J.1    Woodland, P.C.2
  • 20
    • 84921817164
    • Learning representations by back-propagating errors
    • MIT Press, Cambridge, MA
    • D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Learning representations by back-propagating errors, in Cognitive modeling. MIT Press, Cambridge, MA, 1988
    • (1988) Cognitive Modeling
    • Rumelhart, D.E.1    Hinton, G.E.2    Williams, R.J.3
  • 21
    • 84890542079
    • KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
    • D. Yu, K. Yao, H. Su, G. Li, and F. Seide, KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition, in Proc. ICASSP, 2013, pp. 7893-7897
    • (2013) Proc. ICASSP , pp. 7893-7897
    • Yu, D.1    Yao, K.2    Su, H.3    Li, G.4    Seide, F.5
  • 22
    • 4544253838
    • Improving broadcast news transcription by lightly supervised discriminative training
    • H. Y. Chan and P. C. Woodland, Improving broadcast news transcription by lightly supervised discriminative training, in Proc. ICASSP, 2004, pp. 737-740
    • (2004) Proc. ICASSP , pp. 737-740
    • Chan, H.Y.1    Woodland, P.C.2
  • 23
    • 77950063604
    • Active learning and semi-supervised learning for speech recognition: A unified framework using the global entropy reduction maximization criterion
    • D. Yu, B. Varadarajan, L. Deng, and A. Acero, Active learning and semi-supervised learning for speech recognition: A unified framework using the global entropy reduction maximization criterion, Comput. Speech Lang., vol. 24, no. 3, pp. 433-444, 2010
    • (2010) Comput. Speech Lang , vol.24 , Issue.3 , pp. 433-444
    • Yu, D.1    Varadarajan, B.2    Deng, L.3    Acero, A.4
  • 24
    • 84893650076
    • Semi-supervised training of deep neural networks
    • K. Vesely, M. Hannemann, and L. Burget, Semi-supervised training of deep neural networks, in Proc. ASRU, 2013, pp. 267-272
    • (2013) Proc. ASRU , pp. 267-272
    • Vesely, K.1    Hannemann, M.2    Burget, L.3
  • 25
    • 84906218045
    • Semi-supervised GMM and DNN acoustic model training with multi-system combination and confidence re-calibration
    • Y. Huang, D. Yu, Y. Gong, and C. Liu, Semi-supervised GMM and DNN acoustic model training with multi-system combination and confidence re-calibration, in Proc. Interspeech, 2013, pp. 2360-2364
    • (2013) Proc. Interspeech , pp. 2360-2364
    • Huang, Y.1    Yu, D.2    Gong, Y.3    Liu, C.4
  • 26
    • 84905251985
    • Training data selection based on context-dependent state matching
    • O. Siohan, Training data selection based on context-dependent state matching, in Proc. ICASSP, 2014, pp. 3316-3319
    • (2014) Proc. ICASSP , pp. 3316-3319
    • Siohan, O.1
  • 27
    • 84905259138
    • Improving DNN speaker independence with i-vector inputs
    • A. Senior and I. Lopez-Moreno, Improving DNN speaker independence with i-vector inputs, in Proc. ICASSP, 2014, pp. 225-229
    • (2014) Proc. ICASSP , pp. 225-229
    • Senior, A.1    Lopez-Moreno, I.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.