SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2014, Pages 6329-6333

Deep neural network trained with speaker representation for speaker normalization

(4) Tang, Yun (a); Mohan, Aanchan (b); Rose, Richard C. (b); Ma, Chengyuan (a)

a NUANCE COMMUNICATIONS (United States)

b MCGILL UNIVERSITY (Canada)

Author keywords

Neural networks; speaker adaptation; speaker normalization

Indexed keywords

FEATURE EXTRACTION; HIDDEN MARKOV MODELS; NEURAL NETWORKS; SPEECH RECOGNITION;

AUTOMATIC SPEECH RECOGNITION; DEEP NEURAL NETWORKS; DISCRIMINATIVE FEATURES; INCORPORATING PRIOR KNOWLEDGE; SPEAKER ADAPTATION; SPEAKER NORMALIZATION; SPEAKER VARIABILITY; SPECTRAL MAGNITUDES;

SIGNAL PROCESSING;

EID: 84905265988 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2014.6854822 Document Type: Conference Paper

Times cited : (3)

References (18)

1
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- G. Hinton, L. Deng, D. Yu, et al., "Deep neural networks for acoustic modeling in speech recognition," IEEE Signal Processing Magazine, vol. 29, pp. 82-97, November 2012.
- (2012) IEEE Signal Processing Magazine , vol.29 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³

2
- 0033709098
- Tandem connectionist feature extraction for conventional HMM systems
- Daniel P. W. Ellis, Hynek Hermansky, and Sangita Sharma, "Tandem connectionist feature extraction for conventional HMM systems," in ICASSP, 2000, pp. 1635-1638.
- (2000) ICASSP , pp. 1635-1638
- Ellis, D.P.W.¹ Hermansky, H.² Sharma, S.³

3
- 84867593213
- Auto-encoder bottleneck features using deep belief networks
- T. Sainath, B. Kingsbury, and B. Ramabhadran, "Auto-encoder bottleneck features using deep belief networks," in ICASSP, 2012, pp. 4153-4156.
- (2012) ICASSP , pp. 4153-4156
- Sainath, T.¹ Kingsbury, B.² Ramabhadran, B.³

4
- 0009623939
- Flexible speaker adaptation using maximum likelihood linear regression
- C. Leggetter and P. Woodland, "Flexible speaker adaptation using maximum likelihood linear regression," in the ARPA Spoken Language Technology Workshop, 1995, pp. 104-109.
- (1995) The ARPA Spoken Language Technology Workshop , pp. 104-109
- Leggetter, C.¹ Woodland, P.²

5
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- M. J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Computer Speech and Language, vol. 12, pp. 75-98, 1998.
- (1998) Computer Speech and Language , vol.12 , pp. 75-98
- Gales, M.J.F.¹

6
- 84890509526
- MLP-based factor analysis for tandem speech recognition
- M. Ferras and H. Bourlard, "MLP-based factor analysis for tandem speech recognition," in ICASSP, 2013.
- (2013) ICASSP
- Ferras, M.¹ Bourlard, H.²

7
- 84890452886
- Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code
- O. Abdel-Hamid and H. Jiang, "Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code," in ICASSP, 2013.
- (2013) ICASSP
- Abdel-Hamid, O.¹ Jiang, H.²

8
- 84906225505
- Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition
- O. Abdel-Hamid and H. Jiang, "Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition," in INTERSPEECH, 2013.
- (2013) INTERSPEECH
- Abdel-Hamid, O.¹ Jiang, H.²

9
- 84876231242
- Imagenet classification with deep convolutional neural networks
- A. Krizhevsky, I. Sutskever, and G. Hinton, "Imagenet classification with deep convolutional neural networks," in NIPS, 2012.
- (2012) NIPS
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.³

10
- 84890527827
- Improving deep neural network for LVCSR using rectified linear units and dropout
- G. Dahl, T. Sainath, and G. Hinton, "Improving deep neural network for LVCSR using rectified linear units and dropout," in ICASSP, 2013.
- (2013) ICASSP
- Dahl, G.¹ Sainath, T.² Hinton, G.³

11
- 84890471125
- On rectified linear units for speech processing
- M. Zeiler, M. Ranzato, R. Monga, M. Mao, K. Yang, Q. Le, P. Nguyen, A. Senior, V. Vanhoucke, J. Dean, and G. Hinton, "On rectified linear units for speech processing," in ICASSP, 2013.
- (2013) ICASSP
- Zeiler, M.¹ Ranzato, M.² Monga, R.³ Mao, M.⁴ Yang, K.⁵ Le, Q.⁶ Nguyen, P.⁷ Senior, A.⁸ Vanhoucke, V.⁹ Dean, J.¹⁰ Hinton, G.¹¹

12
- 84937854847
- Speaker adaptation for hybrid HMM-ANN continuous speech recognition system
- J. Neto, L. Almeida, M. Hochberg, C. Martins, L. Nunes, S. Renals, and T. Robinson, "Speaker adaptation for hybrid HMM-ANN continuous speech recognition system," in Eurospeech, 1995, pp. 2171-2174.
- (1995) Eurospeech , pp. 2171-2174
- Neto, J.¹ Almeida, L.² Hochberg, M.³ Martins, C.⁴ Nunes, L.⁵ Renals, S.⁶ Robinson, T.⁷

13
- 34548012893
- Linear hidden transformations for adaptation of hybrid ANN/HMM models
- R. Gemello, F. Mana, S. Scanzio, P. Laface, and R. de Mori, "Linear hidden transformations for adaptation of hybrid ANN/HMM models," Speech Communication, vol. 49, no. 10-11, pp. 827-835, 2007.
- (2007) Speech Communication , vol.49 , Issue.10-11 , pp. 827-835
- Gemello, R.¹ Mana, F.² Scanzio, S.³ Laface, P.⁴ De Mori, R.⁵

14
- 0030362995
- A compact model for speaker-adaptive training
- T. Anastasakos, J. McDonough, R. Schwartz, and J. Makhoul, "A compact model for speaker-adaptive training," in ICSLP, 1996, vol. 2, pp. 1137-1140.
- (1996) ICSLP , vol.2 , pp. 1137-1140
- Anastasakos, T.¹ McDonough, J.² Schwartz, R.³ Makhoul, J.⁴

15
- 80053455455
- Tech. Rep., University of Toronto, Department of Computer Science
- Tijmen Tieleman, "Gnumpy: an easy way to use GPU boards in python," Tech. Rep., University of Toronto, Department of Computer Science, 2010.
- (2010) Gnumpy: An Easy Way to Use GPU Boards in Python
- Tieleman, T.¹

16
- 78149337911
- Tech. Rep., University of Toronto, Department of Computer Science
- Volodymyr Mnih, "Cudamat: a CUDA-based matrix class for python," Tech. Rep., University of Toronto, Department of Computer Science, 2009.
- (2009) Cudamat: A CUDA-based Matrix Class for Python
- Mnih, V.¹

17
- 84865742011
- A study on speaker normalized MLP features in LVCSR
- Z. Tuske, C. Plahl, and R. Schluter, "A study on speaker normalized MLP features in LVCSR," in Interspeech, August 2011, pp. 1089-1092.
- (2011) Interspeech, August , pp. 1089-1092
- Tuske, Z.¹ Plahl, C.² Schluter, R.³

18
- 84890519798
- Tandem system adaptation using multiple linear feature transforms
- Y. Wang and M. J. F. Gales, "Tandem system adaptation using multiple linear feature transforms," in ICASSP, 2013.
- (2013) ICASSP
- Wang, Y.¹ Gales, M.J.F.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.