2016, Pages 145-152

Learning factorized feature transforms for speaker normalization

Author keywords

Automatic speech recognition; deep neural networks; speaker normalization

Indexed keywords

MAXIMUM LIKELIHOOD; MAXIMUM LIKELIHOOD ESTIMATION

EID: 84964489805     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ASRU.2015.7404787     Document Type: Conference Paper
Times cited: 7

References (31)
  • 2
    • 0028419019
    • Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
    • J. Gauvain and C.H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Transactions on Speech and Audio Processing, vol. 2, no. 2, pp. 291-298, 1994
    • (1994) IEEE Transactions on Speech and Audio Processing , vol.2 , Issue.2 , pp. 291-298
    • Gauvain, J.1    Lee, C.H.2
  • 3
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Computer Speech and Language, vol. 9, no. 2, pp. 171-186, 1995
    • (1995) Computer Speech and Language , vol.9 , Issue.2 , pp. 171-186
    • Leggetter, C.J.1    Woodland, P.C.2
  • 4
    • 0033709098
    • Tandem connectionist feature extraction for conventional HMM systems
    • H. Hermansky, D.P.W. Ellis, and S. Sharma, "Tandem connectionist feature extraction for conventional HMM systems," in Proc. ICASSP, 2000, pp. 1635-1638
    • (2000) Proc. ICASSP , pp. 1635-1638
    • Hermansky, H.1    Ellis, D.P.W.2    Sharma, S.3
  • 5
    • 84890537527
    • Multi-level adaptive networks in tandem and hybrid ASR systems
    • P. Bell, P. Swietojanski, and S. Renals, "Multi-level adaptive networks in tandem and hybrid ASR systems," in Proc. ICASSP, 2013, pp. 6975-6979
    • (2013) Proc. ICASSP , pp. 6975-6979
    • Bell, P.1    Swietojanski, P.2    Renals, S.3
  • 6
    • 0025659256
    • Continuous speech recognition using multilayer perceptrons with hidden Markov models
    • N. Morgan and H. Bourlard, "Continuous speech recognition using multilayer perceptrons with hidden Markov models," in Proc. ICASSP, 1990, pp. 413-416
    • (1990) Proc. ICASSP , pp. 413-416
    • Morgan, N.1    Bourlard, H.2
  • 7
    • 84890542079
    • KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
    • D. Yu, K. Yao, H. Su, G. Li, and F. Seide, "KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition," in Proc. ICASSP, 2013, pp. 7893-7897
    • (2013) Proc. ICASSP , pp. 7893-7897
    • Yu, D.1    Yao, K.2    Su, H.3    Li, G.4    Seide, F.5
  • 8
    • 84905259145
    • I-vector-based speaker adaptation of deep neural networks for French broadcast audio transcription
    • V. Gupta, P. Kenny, P. Ouellet, and T. Stafylakis, "I-vector-based speaker adaptation of deep neural networks for French broadcast audio transcription," in Proc. ICASSP, 2014, pp. 6334-6338
    • (2014) Proc. ICASSP , pp. 6334-6338
    • Gupta, V.1    Kenny, P.2    Ouellet, P.3    Stafylakis, T.4
  • 9
    • 84893691530
    • Speaker adaptation of neural network acoustic models using i-vectors
    • G. Saon, H. Soltau, D. Nahamoo, and M. Picheny, "Speaker adaptation of neural network acoustic models using i-vectors," in Proc. ASRU, 2013, pp. 55-59
    • (2013) Proc. ASRU , pp. 55-59
    • Saon, G.1    Soltau, H.2    Nahamoo, D.3    Picheny, M.4
  • 10
    • 84890452886
    • Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code
    • O. Abdel-Hamid and H. Jiang, "Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code," in Proc. ICASSP, 2013, pp. 7942-7946
    • (2013) Proc. ICASSP , pp. 7942-7946
    • Abdel-Hamid, O.1    Jiang, H.2
  • 13
    • 84937880519
    • Connectionist speaker normalization and adaptation
    • V. Abrash, H. Franco, A. Sankar, and M. Cohen, "Connectionist speaker normalization and adaptation," in Proc. Eurospeech, 1995, pp. 2183-2186
    • (1995) Proc. Eurospeech , pp. 2183-2186
    • Abrash, V.1    Franco, H.2    Sankar, A.3    Cohen, M.4
  • 14
    • 79959849500
    • Comparison of discriminative input and output transformation for speaker adaptation in the hybrid NN/HMM systems
    • B. Li and K. C. Sim, "Comparison of discriminative input and output transformation for speaker adaptation in the hybrid NN/HMM systems," in Proc. Interspeech, 2010, pp. 526-529
    • (2010) Proc. Interspeech , pp. 526-529
    • Li, B.1    Sim, K.C.2
  • 15
    • 84858976070
    • Feature engineering in context-dependent deep neural networks for conversational speech transcription
    • F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in Proc. ASRU, 2011, pp. 24-29
    • (2011) Proc. ASRU , pp. 24-29
    • Seide, F.1    Li, G.2    Chen, X.3    Yu, D.4
  • 16
    • 33947703156
    • Adaptation of hybrid ANN/HMM models using linear hidden transformations and conservative training
    • R. Gemello, F. Mana, S. Scanzio, P. Laface, and R. D. Mori, "Adaptation of hybrid ANN/HMM models using linear hidden transformations and conservative training," in Proc. ICASSP, 2006, pp. 1189-1192
    • (2006) Proc. ICASSP , pp. 1189-1192
    • Gemello, R.1    Mana, F.2    Scanzio, S.3    Laface, P.4    Mori, R.D.5
  • 17
    • 33947635130
    • Regularized adaptation of discriminative classifiers
    • X. Li and J. Bilmes, "Regularized adaptation of discriminative classifiers," in Proc. ICASSP, 2006, vol. 1, pp. 1-1
    • (2006) Proc. ICASSP , vol.1 , pp. 1
    • Li, X.1    Bilmes, J.2
  • 18
    • 84905229915
    • Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network
    • J. Xue, J. Li, D. Yu, M. Seltzer, and Y. Gong, "Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network," in Proc. ICASSP, 2014, pp. 6359-6363
    • (2014) Proc. ICASSP , pp. 6359-6363
    • Xue, J.1    Li, J.2    Yu, D.3    Seltzer, M.4    Gong, Y.5
  • 19
    • 33646794050
    • Two-stage speaker adaptation of hybrid tied-posterior acoustic models
    • J. Stadermann and G. Rigoll, "Two-stage speaker adaptation of hybrid tied-posterior acoustic models," in Proc. ICASSP, 2005, pp. 977-980
    • (2005) Proc. ICASSP , pp. 977-980
    • Stadermann, J.1    Rigoll, G.2
  • 20
    • 84874226579
    • Adaptation of context-dependent deep neural networks for automatic speech recognition
    • K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong, "Adaptation of context-dependent deep neural networks for automatic speech recognition," in Proc. SLT, 2012, pp. 366-369
    • (2012) Proc. SLT , pp. 366-369
    • Yao, K.1    Yu, D.2    Seide, F.3    Su, H.4    Deng, L.5    Gong, Y.6
  • 21
    • 84983119674
    • Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models
    • P. Swietojanski and S. Renals, "Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models," in Proc. SLT, 2014, pp. 171-176
    • (2014) Proc. SLT , pp. 171-176
    • Swietojanski, P.1    Renals, S.2
  • 22
    • 84946032695
    • Differentiable pooling for unsupervised speaker adaptation
    • P. Swietojanski and S. Renals, "Differentiable pooling for unsupervised speaker adaptation," in Proc. ICASSP, 2015, pp. 4305-4309
    • (2015) Proc. ICASSP , pp. 4305-4309
    • Swietojanski, P.1    Renals, S.2
  • 23
    • 0033677005
    • Fast speaker adaptation of artificial neural networks for automatic speech recognition
    • S. Dupont and L. Cheboub, "Fast speaker adaptation of artificial neural networks for automatic speech recognition," in Proc. ICASSP, 2000, pp. 1795-1798
    • (2000) Proc. ICASSP , pp. 1795-1798
    • Dupont, S.1    Cheboub, L.2
  • 24
    • 84946083667
    • Cluster adaptive training for deep neural network
    • T. Tian, Q. Yanmin, Y. Maofan, Z. Yimeng, and K. Yu, "Cluster adaptive training for deep neural network," in Proc. ICASSP, 2015, pp. 4325-4329
    • (2015) Proc. ICASSP , pp. 4325-4329
    • Tian, T.1    Yanmin, Q.2    Maofan, Y.3    Yimeng, Z.4    Yu, K.5
  • 25
    • 84946054484
    • Multi-basis adaptive neural network for rapid adaptation in speech recognition
    • C. Wu and M. J. F. Gales, "Multi-basis adaptive neural network for rapid adaptation in speech recognition," in Proc. ICASSP, 2015, pp. 4315-4319
    • (2015) Proc. ICASSP , pp. 4315-4319
    • Wu, C.1    Gales, M.J.F.2
  • 26
    • 84905259138
    • Improving DNN speaker independence with i-vector inputs
    • A. Senior and I. Lopez-Moreno, "Improving DNN speaker independence with i-vector inputs," in Proc. ICASSP, 2014, pp. 225-229
    • (2014) Proc. ICASSP , pp. 225-229
    • Senior, A.1    Lopez-Moreno, I.2
  • 27
    • 84946035423
    • An investigation of augmenting speaker representations to improve speaker normalization for DNN-based speech recognition
    • H. Huang and K. C. Sim, "An investigation of augmenting speaker representations to improve speaker normalization for DNN-based speech recognition," in Proc. ICASSP, 2015, pp. 4610-4613
    • (2015) Proc. ICASSP , pp. 4610-4613
    • Huang, H.1    Sim, K.C.2
  • 28
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • M.J.F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Computer Speech & Language, vol. 12, no. 2, pp. 75-98, 1998
    • (1998) Computer Speech & Language , vol.12 , Issue.2 , pp. 75-98
    • Gales, M.J.F.1
  • 29
    • 0032638856
    • Semi-tied covariance matrices for hidden Markov models
    • M.J.F. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Transactions on Speech and Audio Processing, vol. 7, no. 3, pp. 272-281, 1999
    • (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , Issue.3 , pp. 272-281
    • Gales, M.J.F.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.