



Volume 2015-August, 2015, Pages 4610-4613

An investigation of augmenting speaker representations to improve speaker normalisation for DNN-based speech recognition

Author keywords

augmented speaker representation; deep neural network; speaker normalisation; speech recognition

Indexed keywords

AUDIO SIGNAL PROCESSING; DEEP NEURAL NETWORKS; LOUDSPEAKERS; SPEECH; SPEECH COMMUNICATION; VECTOR SPACES

EID: 84946035423     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2015.7178844     Document Type: Conference Paper
Times cited: 53

References (12)
  • 2
    • 84893691530
    • Speaker adaptation of neural network acoustic models using i-vectors
    • G. Saon, H. Soltau, D. Nahamoo, and M. Picheny, Speaker adaptation of neural network acoustic models using i-vectors, in ASRU, 2013
    • (2013) ASRU
    • Saon, G.; Soltau, H.; Nahamoo, D.; Picheny, M.
  • 3
    • 84910031119
    • Towards speaker adaptive training of deep neural network acoustic models
    • Y. Miao, H. Zhang, and F. Metze, Towards speaker adaptive training of deep neural network acoustic models, in Proc. Interspeech, 2014
    • (2014) Proc. Interspeech
    • Miao, Y.; Zhang, H.; Metze, F.
  • 5
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • C. J. Leggetter and P. C. Woodland, Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models, Computer Speech & Language, vol. 9, no. 2, 1995
    • (1995) Computer Speech & Language, vol. 9, no. 2
    • Leggetter, C.J.; Woodland, P.C.
  • 6
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • M.J.F. Gales, Maximum likelihood linear transformations for HMM-based speech recognition, Computer Speech & Language, vol. 12, 1998
    • (1998) Computer Speech & Language, vol. 12
    • Gales, M.J.F.
  • 7
    • 84858976070
    • Feature engineering in context-dependent deep neural networks for conversational speech transcription
    • Frank Seide, Gang Li, Xie Chen, and Dong Yu, Feature engineering in context-dependent deep neural networks for conversational speech transcription, in ASRU, 2011, pp. 24-29
    • (2011) ASRU, pp. 24-29
    • Seide, F.; Li, G.; Chen, X.; Yu, D.
  • 10
    • 33745220290
    • Modeling intra-speaker variability for speaker recognition
    • Hagai Aronowitz, Dror Irony, and David Burstein, Modeling intra-speaker variability for speaker recognition, in Proc. Interspeech, 2005
    • (2005) Proc. Interspeech
    • Aronowitz, H.; Irony, D.; Burstein, D.
  • 11
    • 84901456587
    • Bottleneck features for speaker recognition
    • Sibel Yaman, Jason Pelecanos, and Ruhi Sarikaya, Bottleneck features for speaker recognition, in Proc. Odyssey, 2012, vol. 12
    • (2012) Proc. Odyssey, vol. 12
    • Yaman, S.; Pelecanos, J.; Sarikaya, R.
  • 12
    • 85079095310
    • The design for the Wall Street Journal-based CSR corpus
    • D. B. Paul and J. M. Baker, The design for the Wall Street Journal-based CSR corpus, in Proc. ICSLP, 1992
    • (1992) Proc. ICSLP
    • Paul, D.B.; Baker, J.M.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.