Volume 2016-May, 2016, Pages 5275-5279

On combining i-vectors and discriminative adaptation methods for unsupervised speaker normalization in DNN acoustic models

Author keywords

Automatic speech recognition; deep neural networks; speaker normalization

EID: 84973352080     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2016.7472684     Document Type: Conference Paper
Times cited : (20)

References (30)
  • 2
    • 0028419019
    • Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
    • J. Gauvain and C. H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Transactions on Speech and Audio Processing, vol. 2, no. 2, pp. 291-298, 1994.
    • (1994) IEEE Transactions on Speech and Audio Processing , vol.2 , Issue.2 , pp. 291-298
    • Gauvain, J.1    Lee, C.H.2
  • 3
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Computer Speech & Language, vol. 9, no. 2, pp. 171-185, 1995.
    • (1995) Computer Speech & Language , vol.9 , Issue.2 , pp. 171-185
    • Leggetter, C.J.1    Woodland, P.C.2
  • 4
    • 0033709098
    • Tandem connectionist feature extraction for conventional HMM systems
    • H. Hermansky, D. P. W. Ellis, and S. Sharma, "Tandem connectionist feature extraction for conventional HMM systems," in ICASSP. IEEE, 2000, pp. 1635-1638.
    • (2000) ICASSP. IEEE , pp. 1635-1638
    • Hermansky, H.1    Ellis, D.P.W.2    Sharma, S.3
  • 5
    • 84890537527
    • Multi-level adaptive networks in tandem and hybrid ASR systems
    • P. Bell, P. Swietojanski, and S. Renals, "Multi-level adaptive networks in tandem and hybrid ASR systems," in ICASSP. IEEE, 2013, pp. 6975-6979.
    • (2013) ICASSP. IEEE , pp. 6975-6979
    • Bell, P.1    Swietojanski, P.2    Renals, S.3
  • 6
    • 84973343516
    • Learning factorized transforms for speaker normalization
    • L. T. Samarakoon and K. C. Sim, "Learning factorized transforms for speaker normalization," in ASRU. IEEE, 2015.
    • (2015) ASRU. IEEE
    • Samarakoon, L.T.1    Sim, K.C.2
  • 7
    • 84890542079
    • KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
    • D. Yu, K. Yao, H. Su, G. Li, and F. Seide, "KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition," in ICASSP. IEEE, 2013, pp. 7893-7897.
    • (2013) ICASSP. IEEE , pp. 7893-7897
    • Yu, D.1    Yao, K.2    Su, H.3    Li, G.4    Seide, F.5
  • 8
    • 84905259145
    • I-vector-based speaker adaptation of deep neural networks for French broadcast audio transcription
    • V. Gupta, P. Kenny, P. Ouellet, and T. Stafylakis, "I-vector-based speaker adaptation of deep neural networks for French broadcast audio transcription," in ICASSP. IEEE, 2014, pp. 6334-6338.
    • (2014) ICASSP. IEEE , pp. 6334-6338
    • Gupta, V.1    Kenny, P.2    Ouellet, P.3    Stafylakis, T.4
  • 9
    • 84893691530
    • Speaker adaptation of neural network acoustic models using i-vectors
    • G. Saon, H. Soltau, D. Nahamoo, and M. Picheny, "Speaker adaptation of neural network acoustic models using i-vectors," in ASRU. IEEE, 2013, pp. 55-59.
    • (2013) ASRU. IEEE , pp. 55-59
    • Saon, G.1    Soltau, H.2    Nahamoo, D.3    Picheny, M.4
  • 10
    • 84890452886
    • Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code
    • O. Abdel-Hamid and H. Jiang, "Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code," in ICASSP. IEEE, 2013, pp. 7942-7946.
    • (2013) ICASSP. IEEE , pp. 7942-7946
    • Abdel-Hamid, O.1    Jiang, H.2
  • 11
    • 84983119674
    • Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models
    • P. Swietojanski and S. Renals, "Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models," in SLT. IEEE, 2014, pp. 171-176.
    • (2014) SLT. IEEE , pp. 171-176
    • Swietojanski, P.1    Renals, S.2
  • 12
    • 84937880519
    • Connectionist speaker normalization and adaptation
    • V. Abrash, H. Franco, A. Sankar, and M. Cohen, "Connectionist speaker normalization and adaptation," in Eurospeech. ISCA, 1995, pp. 2183-2186.
    • (1995) Eurospeech. ISCA , pp. 2183-2186
    • Abrash, V.1    Franco, H.2    Sankar, A.3    Cohen, M.4
  • 13
    • 79959849500
    • Comparison of discriminative input and output transformation for speaker adaptation in the hybrid NN/HMM systems
    • B. Li and K. C. Sim, "Comparison of discriminative input and output transformation for speaker adaptation in the hybrid NN/HMM systems," in INTERSPEECH. ISCA, 2010, pp. 526-529.
    • (2010) Interspeech. ISCA , pp. 526-529
    • Li, B.1    Sim, K.C.2
  • 14
    • 33947703156
    • Adaptation of hybrid ANN/HMM models using linear hidden transformations and conservative training
    • R. Gemello, F. Mana, S. Scanzio, P. Laface, and R. De Mori, "Adaptation of hybrid ANN/HMM models using linear hidden transformations and conservative training," in ICASSP. IEEE, 2006, pp. 1189-1192.
    • (2006) ICASSP. IEEE , pp. 1189-1192
    • Gemello, R.1    Mana, F.2    Scanzio, S.3    Laface, P.4    De Mori, R.5
  • 15
    • 44849101939
    • Regularized adaptation of discriminative classifiers
    • X. Li and J. Bilmes, "Regularized adaptation of discriminative classifiers," in ICASSP. IEEE, 2006, vol. 1, pp. I-I.
    • (2006) ICASSP. IEEE , vol.1 , pp. I-I
    • Li, X.1    Bilmes, J.2
  • 16
    • 33646794050
    • Two-stage speaker adaptation of hybrid tied-posterior acoustic models
    • J. Stadermann and G. Rigoll, "Two-stage speaker adaptation of hybrid tied-posterior acoustic models," in ICASSP. IEEE, 2005, pp. 977-980.
    • (2005) ICASSP. IEEE , pp. 977-980
    • Stadermann, J.1    Rigoll, G.2
  • 17
    • 84874226579
    • Adaptation of context-dependent deep neural networks for automatic speech recognition
    • K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong, "Adaptation of context-dependent deep neural networks for automatic speech recognition," in SLT. IEEE, 2012, pp. 366-369.
    • (2012) SLT. IEEE , pp. 366-369
    • Yao, K.1    Yu, D.2    Seide, F.3    Su, H.4    Deng, L.5    Gong, Y.6
  • 18
    • 84946032695
    • Differentiable pooling for unsupervised speaker adaptation
    • P. Swietojanski and S. Renals, "Differentiable pooling for unsupervised speaker adaptation," in ICASSP. IEEE, 2015, pp. 4305-4309.
    • (2015) ICASSP. IEEE , pp. 4305-4309
    • Swietojanski, P.1    Renals, S.2
  • 19
    • 84905259138
    • Improving DNN speaker independence with i-vector inputs
    • A. Senior and I. Lopez-Moreno, "Improving DNN speaker independence with i-vector inputs," in ICASSP. IEEE, 2014, pp. 225-229.
    • (2014) ICASSP. IEEE , pp. 225-229
    • Senior, A.1    Lopez-Moreno, I.2
  • 20
    • 84946035423
    • An investigation of augmenting speaker representations to improve speaker normalization for DNN-based speech recognition
    • H. Huang and K. C. Sim, "An investigation of augmenting speaker representations to improve speaker normalization for DNN-based speech recognition," in ICASSP. IEEE, 2015, pp. 4610-4613.
    • (2015) ICASSP. IEEE , pp. 4610-4613
    • Huang, H.1    Sim, K.C.2
  • 21
    • 84946083667
    • Cluster adaptive training for deep neural network
    • T. Tan, Y. Qian, M. Yin, Y. Zhuang, and K. Yu, "Cluster adaptive training for deep neural network," in ICASSP. IEEE, 2015, pp. 4325-4329.
    • (2015) ICASSP. IEEE , pp. 4325-4329
    • Tan, T.1    Qian, Y.2    Yin, M.3    Zhuang, Y.4    Yu, K.5
  • 22
    • 84946054484
    • Multi-basis adaptive neural network for rapid adaptation in speech recognition
    • C. Wu and M. J. F. Gales, "Multi-basis adaptive neural network for rapid adaptation in speech recognition," in ICASSP. IEEE, 2015, pp. 4315-4319.
    • (2015) ICASSP. IEEE , pp. 4315-4319
    • Wu, C.1    Gales, M.J.F.2
  • 23
    • 84973382376
    • Towards speaker adaptive training of deep neural network acoustic models
    • Y. Miao, H. Zhang, and F. Metze, "Towards speaker adaptive training of deep neural network acoustic models," in INTERSPEECH. ISCA, 2014.
    • (2014) Interspeech. ISCA
    • Miao, Y.1    Zhang, H.2    Metze, F.3
  • 25
    • 84946076428
    • TED-LIUM: An automatic speech recognition dedicated corpus
    • A. Rousseau, P. Deléglise, and Y. Esteve, "TED-LIUM: An automatic speech recognition dedicated corpus," in LREC. ELRA, 2012, pp. 125-129.
    • (2012) LREC. ELRA , pp. 125-129
    • Rousseau, A.1    Deléglise, P.2    Esteve, Y.3
  • 26
    • 0032638856
    • Semi-tied covariance matrices for hidden Markov models
    • M. J. F. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Transactions on Speech and Audio Processing, vol. 7, no. 3, pp. 272-281, 1999.
    • (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , Issue.3 , pp. 272-281
    • Gales, M.J.F.1


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.