ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

2014, Pages 5527-5531

Impact of single-microphone dereverberation on DNN-based meeting transcription systems

(3) Yoshioka, Takuya (a, b); Chen, Xie (a); Gales, Mark J. F. (a)

a UNIVERSITY OF CAMBRIDGE (United Kingdom)

b NTT Communication Science Laboratories (Japan)

Author keywords

deep neural network; environmental robustness; meeting transcription; reverberation; single distant microphone

Indexed keywords

MICROPHONES; REVERBERATION; SPEECH RECOGNITION; TRANSCRIPTION;

ACOUSTIC MODEL; AUTOMATIC SPEECH RECOGNITION SYSTEM; DEEP NEURAL NETWORKS; DEREVERBERATION; ENVIRONMENTAL ROBUSTNESS; FEATURE VECTORS; STATE-OF-THE-ART SYSTEM;

SIGNAL PROCESSING;

EID: 84905247922 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2014.6854660 Document Type: Conference Paper

Times cited: 22

References (21)

1
- 0028194709
- Connectionist probability estimators in HMM speech recognition
- S. Renals, N. Morgan, H. Bourlard, M. Cohen, and H. Franco, "Connectionist probability estimators in HMM speech recognition," IEEE Trans. Speech, Audio Process., vol. 2, no. 1, pp. 161-174, 1994.
- (1994) IEEE Trans. Speech, Audio Process , vol.2 , Issue.1 , pp. 161-174
- Renals, S.¹ Morgan, N.² Bourlard, H.³ Cohen, M.⁴ Franco, H.⁵

2
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Trans. Audio, Speech, Language Process., vol. 20, no. 1, pp. 30-42, 2012.
- (2012) IEEE Trans. Audio, Speech, Language Process , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

3
- 80051608940
- Robust speech recognition using dynamic noise adaptation
- S. Rennie, P. Dognin, and P. Fousek, "Robust speech recognition using dynamic noise adaptation," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2011, pp. 4592-4595.
- (2011) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 4592-4595
- Rennie, S.¹ Dognin, P.² Fousek, P.³

4
- 33947677142
- Dynamic noise adaptation
- S. Rennie, T. Kristjansson, P. Olsen, and R. Gopinath, "Dynamic noise adaptation," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2006, pp. 1197-1200.
- (2006) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 1197-1200
- Rennie, S.¹ Kristjansson, T.² Olsen, P.³ Gopinath, R.⁴

5
- 33646788786
- FMPE: Discriminatively trained features for speech recognition
- D. Povey, B. Kingsbury, L. Mangu, G. Saon, H. Soltau, and G. Zweig, "FMPE: Discriminatively trained features for speech recognition," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2005, pp. 961-964.
- (2005) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 961-964
- Povey, D.¹ Kingsbury, B.² Mangu, L.³ Saon, G.⁴ Soltau, H.⁵ Zweig, G.⁶

6
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- M. J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Comp. Speech, Language, vol. 12, no. 2, pp. 75-98, 1998.
- (1998) Comp. Speech, Language , vol.12 , Issue.2 , pp. 75-98
- Gales, M.J.F.¹

7
- 84890492030
- An investigation of deep neural networks for noise robust speech recognition
- M. L. Seltzer, D. Yu, and Y. Wang, "An investigation of deep neural networks for noise robust speech recognition," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2013, pp. 7398-7402.
- (2013) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 7398-7402
- Seltzer, M.L.¹ Yu, D.² Wang, Y.³

8
- 66149101303
- Robust speech recognition using a cepstral minimum-mean-square-error-motivated noise suppressor
- D. Yu, L. Deng, J. Droppo, J. Wu, Y. Gong, and A. Acero, "Robust speech recognition using a cepstral minimum-mean-square-error-motivated noise suppressor," IEEE Trans. Audio, Speech, Language Process., vol. 16, no. 5, pp. 1061-1070, 2008.
- (2008) IEEE Trans. Audio, Speech, Language Process , vol.16 , Issue.5 , pp. 1061-1070
- Yu, D.¹ Deng, L.² Droppo, J.³ Wu, J.⁴ Gong, Y.⁵ Acero, A.⁶

9
- 84890532503
- Noise adaptive front-end normalization based on vector Taylor series for deep neural networks in robust speech recognition
- B. Li and K. C. Sim, "Noise adaptive front-end normalization based on vector Taylor series for deep neural networks in robust speech recognition," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2013, pp. 7408-7412.
- (2013) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 7408-7412
- Li, B.¹ Sim, K.C.²

10
- 33745530242
- The AMI meeting corpus: A pre-announcement
- J. Carletta, S. Ashby, S. Bourban, M. Flynn, M. Guillemot, T. Hain, J. Kadlec, V. Karaiskos, W. Kraaij, M. Kronenthal, G. Lathoud, M. Lincoln, A. Lisowska, I. McCowan, W. Post, D. Reidsma, and P. Wellner, "The AMI meeting corpus: A pre-announcement," in Proceedings of Int. Worksh. Machine Learning for Multimodal Interaction, 2006, pp. 28-39.
- (2006) Proceedings of Int. Worksh. Machine Learning for Multimodal Interaction , pp. 28-39
- Carletta, J.¹ Ashby, S.² Bourban, S.³ Flynn, M.⁴ Guillemot, M.⁵ Hain, T.⁶ Kadlec, J.⁷ Karaiskos, V.⁸ Kraaij, W.⁹ Kronenthal, M.¹⁰ Lathoud, G.¹¹ Lincoln, M.¹² Lisowska, A.¹³ McCowan, I.¹⁴ Post, W.¹⁵ Reidsma, D.¹⁶ Wellner, P.¹⁷

11
- 85032751613
- Making machines understand us in reverberant rooms: Robustness against reverberation for automatic speech recognition
- T. Yoshioka, A. Sehr, M. Delcroix, K. Kinoshita, R. Maas, T. Nakatani, and W. Kellermann, "Making machines understand us in reverberant rooms: Robustness against reverberation for automatic speech recognition," IEEE Signal Process. Mag., vol. 29, no. 6, pp. 114-126, 2012.
- (2012) IEEE Signal Process. Mag , vol.29 , Issue.6 , pp. 114-126
- Yoshioka, T.¹ Sehr, A.² Delcroix, M.³ Kinoshita, K.⁴ Maas, R.⁵ Nakatani, T.⁶ Kellermann, W.⁷

12
- 51449121832
- Blind speech dereverberation with multi-channel linear prediction based on short time Fourier transform representation
- T. Nakatani, T. Yoshioka, K. Kinoshita, M. Miyoshi, and B.-H. Juang, "Blind speech dereverberation with multi-channel linear prediction based on short time Fourier transform representation," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2008, pp. 85-88.
- (2008) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 85-88
- Nakatani, T.¹ Yoshioka, T.² Kinoshita, K.³ Miyoshi, M.⁴ Juang, B.-H.⁵

13
- 84865727926
- Integrated online speaker clustering and adaptation
- C. Breslin, K. K. Chen, M. J. F. Gales, and K. Knill, "Integrated online speaker clustering and adaptation," in Proc. Interspeech, 2011, pp. 1085-1088.
- (2011) Proc. Interspeech , pp. 1085-1088
- Breslin, C.¹ Chen, K.K.² Gales, M.J.F.³ Knill, K.⁴

14
- 84867693894
- Generalization of multi-channel linear prediction methods for blind MIMO impulse response shortening
- T. Yoshioka and T. Nakatani, "Generalization of multi-channel linear prediction methods for blind MIMO impulse response shortening," IEEE Trans. Audio, Speech, Language Process., vol. 20, no. 10, pp. 2707-2720, 2012.
- (2012) IEEE Trans. Audio, Speech, Language Process , vol.20 , Issue.10 , pp. 2707-2720
- Yoshioka, T.¹ Nakatani, T.²

15
- 50449086237
- Acoustic beamforming for speaker diarization of meetings
- X. Anguera, C. Wooters, and J. Hernando, "Acoustic beamforming for speaker diarization of meetings," IEEE Trans. Audio, Speech, Language Process., vol. 15, no. 7, pp. 2011-2022, 2007.
- (2007) IEEE Trans. Audio, Speech, Language Process , vol.15 , Issue.7 , pp. 2011-2022
- Anguera, X.¹ Wooters, C.² Hernando, J.³

16
- 34547548235
- Probabilistic and bottle-neck features for LVCSR of meetings
- F. Grezl, M. Karafiat, S. Kontar, and J. Cernocky, "Probabilistic and bottle-neck features for LVCSR of meetings," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2007, pp. IV-757-IV-760.
- (2007) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. IV-757-IV-760
- Grezl, F.¹ Karafiat, M.² Kontar, S.³ Cernocky, J.⁴

17
- 0032638856
- Semi-tied covariance matrices for hidden Markov models
- M. J. F. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Trans. Speech, Audio Process., vol. 7, no. 3, pp. 272-281, 1999.
- (1999) IEEE Trans. Speech, Audio Process , vol.7 , Issue.3 , pp. 272-281
- Gales, M.J.F.¹

18
- 0032289099
- Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition
- N. Kumar and A. G. Andreou, "Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition," Speech Commun., vol. 26, no. 4, pp. 283-297, 1998.
- (1998) Speech Commun , vol.26 , Issue.4 , pp. 283-297
- Kumar, N.¹ Andreou, A.G.²

19
- 84858976070
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in Proc. Workshop. Automat. Speech Recognition, Understanding, 2011, pp. 24-29.
- (2011) Proc. Workshop. Automat. Speech Recognition, Understanding , pp. 24-29
- Seide, F.¹ Li, G.² Chen, X.³ Yu, D.⁴

20
- 84867585919
- Understanding how deep belief networks perform acoustic modelling
- A. Mohamed, G. Hinton, and G. Penn, "Understanding how deep belief networks perform acoustic modelling," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2012, pp. 4273-4276.
- (2012) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 4273-4276
- Mohamed, A.¹ Hinton, G.² Penn, G.³

21
- 84893675167
- Model-based approaches to handling uncertainty
- M. J. F. Gales, "Model-based approaches to handling uncertainty," in Robust Speech Recognition of Uncertain or Missing Data, D. Kolossa and R. Haeb-Umbach, Eds., pp. 101-125. Springer, 2011.
- (2011) Robust Speech Recognition of Uncertain or Missing Data , pp. 101-125
- Gales, M.J.F.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.