SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2013, Pages 6935-6939

Effectiveness of discriminative training and feature transformation for reverberated and noisy speech

(3) Tachioka, Yuuki (a); Watanabe, Shinji (b); Hershey, John R. (b)

a MITSUBISHI ELECTRIC CORPORATION (Japan)

b MITSUBISHI ELECTRIC RESEARCH LABORATORIES (United States)

Author keywords

Augmented discriminative feature transformation; CHiME challenge; Discriminative training; Feature transformation; Kaldi

Indexed keywords

CHIME CHALLENGE; DISCRIMINATIVE FEATURES; DISCRIMINATIVE TRAINING; FEATURE TRANSFORMATIONS; KALDI; REVERBERATION; SIGNAL PROCESSING; SPEECH RECOGNITION

EID: 84890503970 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2013.6639006 Document Type: Conference Paper

Times cited : (13)

References (26)

1
- 85032751593
- Research developments and directions in speech recognition and understanding part 1
- May
- J.M. Baker, L. Deng, J. Glass, S. Khudanpur, C.H. Lee, N. Morgan, and D. O'Shaughnessy, "Research developments and directions in speech recognition and understanding part 1," IEEE Signal Processing Magazine, vol. 26, pp. 75-80, May 2009
- (2009) IEEE Signal Processing Magazine , vol.26 , pp. 75-80
- Baker, J.M.¹ Deng, L.² Glass, J.³ Khudanpur, S.⁴ Lee, C.H.⁵ Morgan, N.⁶ O'Shaughnessy, D.⁷

2
- 85032750905
- Discriminative learning in sequential pattern recognition
- September
- X. He, L. Deng, and W. Chou, "Discriminative learning in sequential pattern recognition," IEEE Signal Processing Magazine, vol. 25, pp. 14-36, September 2008
- (2008) IEEE Signal Processing Magazine , vol.25 , pp. 14-36
- He, X.¹ Deng, L.² Chou, W.³

3
- 0022890536
- Maximum mutual information estimation of hidden Markov model parameters for speech recognition
- L. Bahl, P. Brown, P. de Souza, and R. Mercer, "Maximum mutual information estimation of hidden Markov model parameters for speech recognition," in Proceedings ICASSP. IEEE, 1986, pp. 49-52
- (1986) Proceedings ICASSP. IEEE , pp. 49-52
- Bahl, L.¹ Brown, P.² De Souza, P.³ Mercer, R.⁴

4
- 0036296863
- Minimum phone error and I-smoothing for improved discriminative training
- D. Povey and P.C. Woodland, "Minimum phone error and I-smoothing for improved discriminative training," in Proceedings ICASSP. IEEE, 2002, pp. 105-108
- (2002) Proceedings ICASSP. IEEE , pp. 105-108
- Povey, D.¹ Woodland, P.C.²

5
- 34547522070
- Discriminative training for large-vocabulary speech recognition using minimum classification error
- January
- E. McDermott, T.J. Hazen, J. Le Roux, A. Nakamura, and S. Katagiri, "Discriminative training for large-vocabulary speech recognition using minimum classification error," IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, pp. 203-223, January 2007
- (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , pp. 203-223
- McDermott, E.¹ Hazen, T.J.² Le Roux, J.³ Nakamura, A.⁴ Katagiri, S.⁵

6
- 78049409757
- Discriminative training based on an integrated view of MPE and MMI in margin and error space
- E. McDermott, S. Watanabe, and A. Nakamura, "Discriminative training based on an integrated view of MPE and MMI in margin and error space," in Proceedings ICASSP. IEEE, 2010, pp. 4894-4897
- (2010) Proceedings ICASSP. IEEE , pp. 4894-4897
- McDermott, E.¹ Watanabe, S.² Nakamura, A.³

7
- 85017287487
- Linear discriminant analysis for improved large vocabulary continuous speech recognition
- R. Haeb-Umbach and H. Ney, "Linear discriminant analysis for improved large vocabulary continuous speech recognition," in Proceedings ICASSP. IEEE, 1992, pp. 13-16
- (1992) Proceedings ICASSP. IEEE , pp. 13-16
- Haeb-Umbach, R.¹ Ney, H.²

8
- 84892187452
- Maximum likelihood modeling with Gaussian distributions for classification
- R.A. Gopinath, "Maximum likelihood modeling with Gaussian distributions for classification," in Proceedings ICASSP. IEEE, 1998, pp. 661-664
- (1998) Proceedings ICASSP. IEEE , pp. 661-664
- Gopinath, R.A.¹

9
- 0032638856
- Semi-tied covariance matrices for hidden Markov models
- July
- M.J.F. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Transactions on Speech and Audio Processing, vol. 7, pp. 272-281, July 1999
- (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , pp. 272-281
- Gales, M.J.F.¹

10
- 0030362995
- A compact model for speaker-adaptive training
- ISCA
- T. Anastasakos, J. McDonough, R. Schwartz, and J. Makhoul, "A Compact Model for Speaker-Adaptive Training," in Proceedings ICSLP, ISCA, 1996, pp. 1137-1140
- (1996) Proceedings ICSLP , pp. 1137-1140
- Anastasakos, T.¹ McDonough, J.² Schwartz, R.³ Makhoul, J.⁴

11
- 33646788786
- fMPE: Discriminatively trained features for speech recognition
- D. Povey, B. Kingsbury, L. Mangu, G. Saon, H. Soltau, and G. Zweig, "fMPE: Discriminatively trained features for speech recognition," in Proceedings ICASSP. IEEE, 2005, pp. 961-964
- (2005) Proceedings ICASSP. IEEE , pp. 961-964
- Povey, D.¹ Kingsbury, B.² Mangu, L.³ Saon, G.⁴ Soltau, H.⁵ Zweig, G.⁶

12
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- November
- G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition," IEEE Signal Processing Magazine, vol. 29, pp. 82-97, November 2012
- (2012) IEEE Signal Processing Magazine , vol.29 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰ Kingsbury, B.¹¹

13
- 33745211419
- Improvements to fMPE for discriminative training of features
- D. Povey, "Improvements to fMPE for discriminative training of features," in Proceedings INTERSPEECH. ISCA, 2005, pp. 2977-2980
- (2005) Proceedings INTERSPEECH. ISCA , pp. 2977-2980
- Povey, D.¹

14
- 33745216251
- Maximum mutual information SPLICE transform for seen and unseen conditions
- J. Droppo and A. Acero, "Maximum mutual information SPLICE transform for seen and unseen conditions," in Proceedings INTERSPEECH. ISCA, 2005, pp. 989-992
- (2005) Proceedings INTERSPEECH. ISCA , pp. 989-992
- Droppo, J.¹ Acero, A.²

15
- 44949102463
- Recent progress on the discriminative region-dependent transform for speech feature extraction
- B. Zhang, S. Matsoukas, and R. Schwartz, "Recent progress on the discriminative region-dependent transform for speech feature extraction," in Proceedings INTERSPEECH. ISCA, 2006, pp. 1573-1576
- (2006) Proceedings INTERSPEECH. ISCA , pp. 1573-1576
- Zhang, B.¹ Matsoukas, S.² Schwartz, R.³

16
- 84867593229
- Discriminative feature transforms using differenced maximum mutual information
- M. Delcroix, A. Ogawa, S. Watanabe, T. Nakatani, and A. Nakamura, "Discriminative feature transforms using differenced maximum mutual information," in Proceedings ICASSP. IEEE, 2012, pp. 4753-4756
- (2012) Proceedings ICASSP. IEEE , pp. 4753-4756
- Delcroix, M.¹ Ogawa, A.² Watanabe, S.³ Nakatani, T.⁴ Nakamura, A.⁵

17
- 33745219155
- Regularizing linear discriminant analysis for speech recognition
- H. Erdoğan, "Regularizing linear discriminant analysis for speech recognition," in Proceedings INTERSPEECH. ISCA, 2005, pp. 3021-3024
- (2005) Proceedings INTERSPEECH. ISCA , pp. 3021-3024
- Erdoğan, H.¹

18
- 44849090969
- Recognition and understanding of meetings: The AMI and AMIDA projects
- S. Renals, T. Hain, and H. Bourlard, "Recognition and understanding of meetings: The AMI and AMIDA projects," in Proceedings ASRU. IEEE, 2007, pp. 238-247
- (2007) Proceedings ASRU. IEEE , pp. 238-247
- Renals, S.¹ Hain, T.² Bourlard, H.³

19
- 84890479839
- The second 'CHiME' speech separation and recognition challenge: Datasets, tasks and baselines
- to appear
- E. Vincent, J. Barker, S. Watanabe, J. Le Roux, F. Nesta, and M. Matassoni, "The second 'CHiME' speech separation and recognition challenge: Datasets, tasks and baselines," in Proceedings ICASSP. IEEE, 2012, to appear
- (2012) Proceedings ICASSP. IEEE
- Vincent, E.¹ Barker, J.² Watanabe, S.³ Le Roux, J.⁴ Nesta, F.⁵ Matassoni, M.⁶

20
- 84858953642
- The Kaldi speech recognition toolkit
- D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz, J. Silovsky, G. Stemmer, and K. Vesely, "The Kaldi speech recognition toolkit," in Proceedings ASRU. IEEE, 2011, pp. 1-4
- (2011) Proceedings ASRU. IEEE , pp. 1-4
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlicek, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰ Silovsky, J.¹¹ Stemmer, G.¹² Vesely, K.¹³

21
- 0003571976
- March
- S. Young, G. Evermann, M. Gales, T. Hain, D. Kershaw, X. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, "The HTK Book (for HTK Version 3.4.1)," http://htk.eng.cam.ac.uk, March 2009
- (2009) The HTK Book (For HTK Version 3.4.1)
- Young, S.¹ Evermann, G.² Gales, M.³ Hain, T.⁴ Kershaw, D.⁵ Liu, X.⁶ Moore, G.⁷ Odell, J.⁸ Ollason, D.⁹ Povey, D.¹⁰ Valtchev, V.¹¹ Woodland, P.¹²

22
- 51449120120
- Boosted MMI for model and feature-space discriminative training
- D. Povey, D. Kanevsky, B. Kingsbury, B. Ramabhadran, G. Saon, and K. Visweswariah, "Boosted MMI for model and feature-space discriminative training," in Proceedings ICASSP. IEEE, 2008, pp. 4057-4060
- (2008) Proceedings ICASSP. IEEE , pp. 4057-4060
- Povey, D.¹ Kanevsky, D.² Kingsbury, B.³ Ramabhadran, B.⁴ Saon, G.⁵ Visweswariah, K.⁶

23
- 0034848926
- Tandem acoustic modeling in large-vocabulary recognition
- D. Ellis, R. Singh, and S. Sivadas, "Tandem acoustic modeling in large-vocabulary recognition," in Proceedings ICASSP. IEEE, 2001, pp. 517-520
- (2001) Proceedings ICASSP. IEEE , pp. 517-520
- Ellis, D.¹ Singh, R.² Sivadas, S.³

24
- 33645712892
- Compressed sensing
- April
- D.L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, pp. 1289-1306, April 2006
- (2006) IEEE Transactions on Information Theory , vol.52 , pp. 1289-1306
- Donoho, D.L.¹

25
- 0032762471
- A statistical model-based voice activity detection
- January
- J. Sohn, N.S. Kim, and W. Sung, "A statistical model-based voice activity detection," IEEE Signal Processing Letters, vol. 6, pp. 1-3, January 1999
- (1999) IEEE Signal Processing Letters , vol.6 , pp. 1-3
- Sohn, J.¹ Kim, N.S.² Sung, W.³

26
- 78650016939
- Underdetermined convolutive blind source separation via frequency bin-wise clustering and permutation alignment
- March
- H. Sawada, S. Araki, and S. Makino, "Underdetermined convolutive blind source separation via frequency bin-wise clustering and permutation alignment," IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, pp. 516-527, March 2011.
- (2011) IEEE Transactions on Audio, Speech, and Language Processing , vol.19 , pp. 516-527
- Sawada, H.¹ Araki, S.² Makino, S.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.