SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume , Issue , 2013, Pages 3512-3516

Reverberant speech recognition based on denoising autoencoder

(5) Ishii, Takaaki (a); Komiyama, Hiroki (a); Shinozaki, Takahiro (b); Horiuchi, Yasuo (a); Kuroiwa, Shingo (a)

Author keywords

CENSREC 4; Denoising autoencoder; Distant talking speech recognition; Restricted Boltzmann machine; Reverberant speech recognition

Indexed keywords

EXPERIMENTS; IMPULSE RESPONSE; LEARNING SYSTEMS; SPEECH RECOGNITION; STATISTICAL TESTS;

AUTO ENCODERS; CENSREC-4; CONVENTIONAL METHODS; EVALUATION FRAMEWORK; LARGE VOCABULARY SPEECH RECOGNITION; MULTI-CONDITION TRAININGS; RECOGNITION ACCURACY; RESTRICTED BOLTZMANN MACHINE;

REVERBERATION;

EID: 84906237188 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (93)

References (15)

1
- 85032750883
- Microphone array processing for distant speech recognition: From close-talking microphones to far-field sensors
- K. Kumatani, J. McDonough, and B. Raj, "Microphone array processing for distant speech recognition: From close-talking microphones to far-field sensors," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 127-140, 2012.
- (2012) Signal Processing Magazine , vol.29 , Issue.6 , pp. 127-140
- Kumatani, K.¹ McDonough, J.² Raj, B.³

2
- 65249167097
- Suppression of late reverberation effect on speech signal using long-term multiple-step linear prediction
- K. Kinoshita, M. Delcroix, T. Nakatani, and M. Miyoshi, "Suppression of late reverberation effect on speech signal using long-term multiple-step linear prediction," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 17, no. 4, pp. 534-545, May 2009.
- (2009) Audio, Speech, and Language Processing, IEEE Transactions on , vol.17 , Issue.4 , pp. 534-545
- Kinoshita, K.¹ Delcroix, M.² Nakatani, T.³ Miyoshi, M.⁴

3
- 77956752049
- Dynamic features in the linear-logarithmic hybrid domain for automatic speech recognition in a reverberant environment
- O. Ichikawa, T. Fukuda, and M. Nishimura, "Dynamic features in the linear-logarithmic hybrid domain for automatic speech recognition in a reverberant environment," Selected Topics in Signal Processing, IEEE Journal of, vol. 4, pp. 816-823, 2010.
- (2010) Selected Topics in Signal Processing, IEEE Journal of , vol.4 , pp. 816-823
- Ichikawa, O.¹ Fukuda, T.² Nishimura, M.³

4
- 33746600649
- Reducing the dimensionality of data with neural networks
- G. Hinton and R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504-507, 2006.
- (2006) Science , vol.313 , Issue.5786 , pp. 504-507
- Hinton, G.¹ Salakhutdinov, R.²

5
- 79959842828
- Binary coding of speech spectrograms using a deep autoencoder
- L. Deng, M. Seltzer, D. Yu, A. Acero, A. Mohamed, and G. Hinton, "Binary coding of speech spectrograms using a deep autoencoder," in Proc. Interspeech, 2010, pp. 1692-1695.
- (2010) Proc. Interspeech , pp. 1692-1695
- Deng, L.¹ Seltzer, M.² Yu, D.³ Acero, A.⁴ Mohamed, A.⁵ Hinton, G.⁶

6
- 79551480483
- Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion
- P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P. A. Manzagol, "Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion," J. Mach. Learn. Res., vol. 11, pp. 3371-3408, Dec. 2010.
- (2010) J. Mach. Learn. Res. , vol.11 , pp. 3371-3408
- Vincent, P.¹ Larochelle, H.² Lajoie, I.³ Bengio, Y.⁴ Manzagol, P.A.⁵

7
- 84878409063
- Recurrent neural networks for noise reduction in robust ASR
- A. Maas, Q. Le, T. O'Neil, O. Vinyals, P. Nguyen, and A. Ng, "Recurrent neural networks for noise reduction in robust ASR," in Proceedings of INTERSPEECH, 2012.
- (2012) Proceedings of INTERSPEECH
- Maas, A.¹ Le, Q.² O'Neil, T.³ Vinyals, O.⁴ Nguyen, P.⁵ Ng, A.⁶

8
- 84878421481
- Speech restoration based on deep learning autoencoder with layer-wised pretraining
- X. Lu and H. K. S. Matsuda, C. Hori, "Speech restoration based on deep learning autoencoder with layer-wised pretraining," in the International Speech Communication Association, 2012.
- (2012) The International Speech Communication Association
- Lu, X.¹ Matsuda, H.K.S.² Hori, C.³

9
- 85019794732
- Evaluation framework for distant-talking speech recognition under reverberant environments - newest part of the CENSREC series
- T. Nishiura, M. Nakayama, Y. Denda, N. Kitaoka, K. Yamamoto, T. Yamada, S. Tsuge, C. Miyajima, M. Fujimoto, T. Takiguchi, S. Tamura, S. Kuroiwa, K. Takeda, and S. Nakamura, "Evaluation framework for distant-talking speech recognition under reverberant environments - newest part of the CENSREC series -," Proc. LREC'08, May 2008.
- (2008) Proc. LREC'08
- Nishiura, T.¹ Nakayama, M.² Denda, Y.³ Kitaoka, N.⁴ Yamamoto, K.⁵ Yamada, T.⁶ Tsuge, S.⁷ Miyajima, C.⁸ Fujimoto, M.⁹ Takiguchi, T.¹⁰ Tamura, S.¹¹ Kuroiwa, S.¹² Takeda, K.¹³ Nakamura, S.¹⁴

10
- 0032644224
- JNAS: Japanese speech corpus for large vocabulary continuous speech recognition research
- K. Itou, M. Yamamoto, K. Takeda, T. Takezawa, T. Matsuoka, T. Kobayashi, K. Shikano, and S. Itahashi, "JNAS: Japanese speech corpus for large vocabulary continuous speech recognition research," Acoust Soc Jpn E, vol. 20, no. 3, pp. 199-206, 1999.
- (1999) Acoust Soc Jpn E , vol.20 , Issue.3 , pp. 199-206
- Itou, K.¹ Yamamoto, M.² Takeda, K.³ Takezawa, T.⁴ Matsuoka, T.⁵ Kobayashi, T.⁶ Shikano, K.⁷ Itahashi, S.⁸

11
- 68049138790
- Training products of experts by minimizing contrastive divergence
- G. Hinton, "Training products of experts by minimizing contrastive divergence," Neural Computation, vol. 14, no. 8, pp. 1771-1800, 2002.
- (2002) Neural Computation , vol.14 , Issue.8 , pp. 1771-1800
- Hinton, G.¹

12
- 84055211743
- Acoustic modeling using deep belief networks
- A. Mohamed, G. E. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 14-22, 2012.
- (2012) Audio, Speech, and Language Processing, IEEE Transactions on , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.E.² Hinton, G.³

13
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- S. B. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 28, no. 4, pp. 357-366, 1980.
- (1980) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.28 , Issue.4 , pp. 357-366
- Davis, S.B.¹ Mermelstein, P.²

14
- 0000135303
- Methods of conjugate gradients for solving linear systems
- M. R. Hestenes and E. Stiefel, "Methods of Conjugate Gradients for Solving Linear Systems," Journal of Research of the National Bureau of Standards, vol. 49, no. 6, pp. 409-436, Dec. 1952.
- (1952) Journal of Research of the National Bureau of Standards , vol.49 , Issue.6 , pp. 409-436
- Hestenes, M.R.¹ Stiefel, E.²

15
- 44849131087
- The Titech large vocabulary WFST speech recognition system
- P. R. Dixon, D. A. Caseiro, T. Oonishi, and S. Furui, "The Titech large vocabulary WFST speech recognition system," in Proc. IEEE ASRU, 2007, pp. 443-448.
- (2007) Proc. IEEE ASRU , pp. 443-448
- Dixon, P.R.¹ Caseiro, D.A.² Oonishi, T.³ Furui, S.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.