SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume , Issue , 2014, Pages 2670-2674

Dynamic noise aware training for speech enhancement based on deep neural networks

Authors (4): Xu, Yong (a); Du, Jun (a); Dai, Li Rong (a); Lee, Chin Hui (b)

(a) UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA (China)

(b) GEORGIA INSTITUTE OF TECHNOLOGY (United States)

Author keywords

Deep neural networks; Ideal binary mask; Noise aware training; Non stationary noise; Speech enhancement

Indexed keywords

ALGORITHMS; MEAN SQUARE ERROR; QUALITY CONTROL; SIGNAL TO NOISE RATIO; SPEECH; SPEECH COMMUNICATION;

DEEP NEURAL NETWORKS; GENERALIZATION CAPABILITY; IDEAL BINARY MASK; MINIMUM MEAN SQUARED ERROR; NONSTATIONARY NOISE; PERCEPTUAL EVALUATION OF SPEECH QUALITIES; SPEECH ENHANCEMENT METHODS; STATE-OF-THE-ART TECHNIQUES;

SPEECH ENHANCEMENT;

EID: 84910038203
PISSN: 2308457X
EISSN: 19909772
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited : (100)

References (33)

1
- 33745146930
- Springer
- J. Benesty, S. Makino, and J. D. Chen, Speech Enhancement, Springer, 2005.
- (2005) Speech Enhancement
- Benesty, J.¹ Makino, S.² Chen, J.D.³

2
- 0018455310
- Suppression of acoustic noise in speech using spectral subtraction
- S. Boll, "Suppression of acoustic noise in speech using spectral subtraction, " IEEE Trans. on Acoustics, Speech and Signal Processing, Vol. 27, No. 2, pp. 113-120, 1979.
- (1979) IEEE Trans. on Acoustics, Speech and Signal Processing , vol.27 , Issue.2 , pp. 113-120
- Boll, S.¹

3
- 0018642851
- Enhancement and bandwidth compression of noisy speech
- J. S. Lim and A. V. Oppenheim, "Enhancement and bandwidth compression of noisy speech, " in Proc. IEEE, Vol. 67, No. 12, pp. 1586-1604, 1979.
- (1979) Proc. IEEE , vol.67 , Issue.12 , pp. 1586-1604
- Lim, J.S.¹ Oppenheim, A.V.²

4
- 0021645331
- Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator
- Y. Ephraim and D. Malah, "Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator, " IEEE Trans. on Acoustics, Speech and Signal Processing, Vol. 32, No. 6, pp. 1109-1121, 1984.
- (1984) IEEE Trans. on Acoustics, Speech and Signal Processing , vol.32 , Issue.6 , pp. 1109-1121
- Ephraim, Y.¹ Malah, D.²

5
- 0021892216
- Speech enhancement using minimum mean square log spectral amplitude estimator
- Y. Ephraim and D. Malah, "Speech enhancement using minimum mean square log spectral amplitude estimator, " IEEE Trans. on Acoustics, Speech and Signal Processing, Vol. 33, No. 2, pp. 443-445, 1985.
- (1985) IEEE Trans. on Acoustics, Speech and Signal Processing , vol.33 , Issue.2 , pp. 443-445
- Ephraim, Y.¹ Malah, D.²

6
- 0035500783
- Speech enhancement for non-stationary noise environments
- I. Cohen and B. Berdugo, "Speech enhancement for non-stationary noise environments, " Signal Processing, Vol. 81, No. 11, pp. 2403-2418, 2001.
- (2001) Signal Processing , vol.81 , Issue.11 , pp. 2403-2418
- Cohen, I.¹ Berdugo, B.²

7
- 0041360463
- Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging
- I. Cohen, "Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging, " IEEE Trans. on Speech and Audio Processing, Vol. 11, No. 5, pp. 466-475, 2003.
- (2003) IEEE Trans. on Speech and Audio Processing , vol.11 , Issue.5 , pp. 466-475
- Cohen, I.¹

8
- 0009766947
- Networks for speech enhancement
- Edited by Shigeru Katagiri, Artech House, Boston
- E. A. Wan and A. T. Nelson, "Networks for speech enhancement, " in Handbook of Neural Networks for Speech Processing, Edited by Shigeru Katagiri, Artech House, Boston, 1998.
- (1998) Handbook of Neural Networks for Speech Processing
- Wan, E.A.¹ Nelson, A.T.²

9
- 69349090197
- Learning deep architectures for AI
- Y. Bengio, "Learning deep architectures for AI, " Foundations and Trends in Machine Learning, Vol. 2, No. 1, pp. 1-127, 2009.
- (2009) Foundations and Trends in Machine Learning , vol.2 , Issue.1 , pp. 1-127
- Bengio, Y.¹

10
- 33746600649
- Reducing the dimensionality of data with neural networks
- G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks, " Science, Vol. 313, No. 5786, pp. 504-507, 2006.
- (2006) Science , vol.313 , Issue.5786 , pp. 504-507
- Hinton, G.E.¹ Salakhutdinov, R.R.²

11
- 84900542109
- Recurrent neural network feature enhancement: The 2nd CHiME challenge
- A. L. Maas, T. M. O'Neil, A. Y. Hannun and A. Y. Ng, "Recurrent neural network feature enhancement: The 2nd CHiME challenge, " In Proceedings The 2nd CHiME Workshop on Machine Listening in Multisource Environments held in conjunction with ICASSP, pp. 79-80, 2013.
- (2013) Proceedings the 2nd CHiME Workshop on Machine Listening in Multisource Environments Held in Conjunction with ICASSP , pp. 79-80
- Maas, A.L.¹ O'Neil, T.M.² Hannun, A.Y.³ Ng, A.Y.⁴

12
- 84878409063
- Recurrent neural networks for noise reduction in robust ASR
- A. L. Maas, Q. V. Le, T. M. O'Neil, O. Vinyals, P. Nguyen and A. Y. Ng, "Recurrent Neural Networks for Noise Reduction in Robust ASR, " Proc. Interspeech, pp. 22-25, 2012.
- (2012) Proc. Interspeech , pp. 22-25
- Maas, A.L.¹ Le, Q.V.² O'Neil, T.M.³ Vinyals, O.⁴ Nguyen, P.⁵ Ng, A.Y.⁶

13
- 84889257121
- An experimental study on speech enhancement based on deep neural networks
- Y. Xu, J. Du, L.-R. Dai and C.-H. Lee, "An experimental study on speech enhancement based on deep neural networks, " IEEE Signal Processing Letters, Vol. 21, No. 1, pp. 65-68, 2014.
- (2014) IEEE Signal Processing Letters , vol.21 , Issue.1 , pp. 65-68
- Xu, Y.¹ Du, J.² Dai, L.-R.³ Lee, C.-H.⁴

14
- 84896537574
- Wiener filtering based speech enhancement with weighted denoising auto-encoder and noise classification
- B.-Y. Xia and C.-C. Bao, "Wiener filtering based speech enhancement with weighted denoising auto-encoder and noise classification, " Speech Communication, Vol. 60, pp. 13-29, 2014.
- (2014) Speech Communication , vol.60 , pp. 13-29
- Xia, B.-Y.¹ Bao, C.-C.²

15
- 84906279378
- Speech enhancement with weighted denoising auto-encoder
- B.-Y. Xia and C.-C. Bao, "Speech enhancement with weighted denoising Auto-Encoder, " Proc. Interspeech, pp. 3444-3448, 2013.
- (2013) Proc. Interspeech , pp. 3444-3448
- Xia, B.-Y.¹ Bao, C.-C.²

16
- 84906262433
- Speech enhancement based on deep denoising auto-encoder
- X.-G. Lu, Y. Tsao, S. Matsuda and C. Hori, "Speech enhancement based on deep denoising Auto-Encoder, " Proc. Interspeech, pp. 436-440, 2013.
- (2013) Proc. Interspeech , pp. 436-440
- Lu, X.-G.¹ Tsao, Y.² Matsuda, S.³ Hori, C.⁴

17
- 84875678689
- Towards scaling up classification-based speech separation
- Y. X. Wang and D. L. Wang, "Towards scaling up classification-based speech separation, " IEEE Trans. on Audio, Speech and Language Processing, Vol. 21, No. 7, pp. 1381-1390, 2013.
- (2013) IEEE Trans. on Audio, Speech and Language Processing , vol.21 , Issue.7 , pp. 1381-1390
- Wang, Y.X.¹ Wang, D.L.²

18
- 82255178542
- Hoboken, NJ, USA: Wiley-IEEE Press
- D. L. Wang and G. Brown, Eds., Computational Auditory Scene Analysis: Principles, Algorithms and Applications. Hoboken, NJ, USA: Wiley-IEEE Press, 2006.
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms and Applications
- Wang, D.L.¹ Brown, G.²

19
- 29444448046
- A noise-estimation algorithm for highly non-stationary environments
- S. Rangachari and P. C. Loizou, "A noise-estimation algorithm for highly non-stationary environments, " Speech Communication, Vol. 48, No. 2, pp. 220-231, 2006.
- (2006) Speech Communication , vol.48 , Issue.2 , pp. 220-231
- Rangachari, S.¹ Loizou, P.C.²

20
- 84890492030
- An investigation of deep neural networks for noise robust speech recognition
- M. Seltzer, D. Yu and Y. Wang, "An investigation of deep neural networks for noise robust speech recognition, " Proc. ICASSP, pp. 7398-7402, 2013.
- (2013) Proc. ICASSP , pp. 7398-7402
- Seltzer, M.¹ Yu, D.² Wang, Y.³

21
- 84890452886
- Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code
- O. Abdel-Hamid and H. Jiang, "Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code, " Proc. ICASSP, pp. 7942-7946, 2013.
- (2013) Proc. ICASSP , pp. 7942-7946
- Abdel-Hamid, O.¹ Jiang, H.²

22
- 84857498666
- Unbiased MMSE-based noise power estimation with low complexity and low tracking delay
- T. Gerkmann and R. C. Hendriks, "Unbiased MMSE-based noise power estimation with low complexity and low tracking delay, " IEEE Trans. on Audio, Speech, and Language Processing, Vol. 20, No. 4, pp. 1383-1393, 2012.
- (2012) IEEE Trans. on Audio, Speech, and Language Processing , vol.20 , Issue.4 , pp. 1383-1393
- Gerkmann, T.¹ Hendriks, R.C.²

23
- 78049364397
- MMSE based noise PSD tracking with low complexity
- R. C. Hendriks, R. Heusdens, and J. Jensen, "MMSE based noise PSD tracking with low complexity, " Proc. ICASSP, pp. 4266-4269, 2010.
- (2010) Proc. ICASSP , pp. 4266-4269
- Hendriks, R.C.¹ Heusdens, R.² Jensen, J.³

24
- 47949104834
- Speech enhancement based on generalized minimum mean square error estimators and masking properties of the auditory system
- J. H. Hansen, V. Radhakrishnan and K. H. Arehart, "Speech enhancement based on generalized minimum mean square error estimators and masking properties of the auditory system, " IEEE Trans. on Audio, Speech and Language Processing, Vol. 14, No. 6, pp. 2049-2063, 2006.
- (2006) IEEE Trans. on Audio, Speech and Language Processing , vol.14 , Issue.6 , pp. 2049-2063
- Hansen, J.H.¹ Radhakrishnan, V.² Arehart, K.H.³

25
- 0027623210
- Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems
- A. Varga and H. J. M. Steeneken, "Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems, " Speech Communication, Vol. 12, No. 3, pp. 247-251, 1993.
- (1993) Speech Communication , vol.12 , Issue.3 , pp. 247-251
- Varga, A.¹ Steeneken, H.J.M.²

26
- 84867202951
- A speech enhancement approach using piecewise linear approximation of an explicit model of environmental distortions
- J. Du and Q. Huo, "A speech enhancement approach using piecewise linear approximation of an explicit model of environmental distortions, " Proc. Interspeech, pp. 569-572, 2008.
- (2008) Proc. Interspeech , pp. 569-572
- Du, J.¹ Huo, Q.²

27
- 79959842828
- Binary coding of speech spectrograms using a deep auto-encoder
- L. Deng, M. L. Seltzer, D. Yu, A. Acero, A. R. Mohamed and G. E. Hinton, "Binary coding of speech spectrograms using a deep auto-encoder, " Proc. Interspeech, pp. 1692-1695, 2010.
- (2010) Proc. Interspeech , pp. 1692-1695
- Deng, L.¹ Seltzer, M.L.² Yu, D.³ Acero, A.⁴ Mohamed, A.R.⁵ Hinton, G.E.⁶

28
- 84910077907
- On training targets for supervised speech separation
- The Ohio State University
- Y. X. Wang, A. Narayanan and D. L. Wang, "On training targets for supervised speech separation, " Technical Report OSU-CISRC-2/14-TR05, Department of Computer Science and Engineering, The Ohio State University, 2014.
- (2014) Technical Report Department of Computer Science and Engineering
- Wang, Y.X.¹ Narayanan, A.² Wang, D.L.³

29
- 84870477511
- Exploring monaural features for classification-based speech segregation
- Y. X. Wang, K. Han and D. L. Wang, "Exploring monaural features for classification-based speech segregation, " IEEE Trans. on Audio, Speech, and Language Processing, Vol. 21, No. 2, pp. 270-279, 2013.
- (2013) IEEE Trans. on Audio, Speech, and Language Processing , vol.21 , Issue.2 , pp. 270-279
- Wang, Y.X.¹ Han, K.² Wang, D.L.³

30
- 0038669544
- The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions
- H. G. Hirsch and D. Pearce, "The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions, " Proc. ISCA ITRW ASR, pp. 181-188, 2000.
- (2000) Proc. ISCA ITRW ASR , pp. 181-188
- Hirsch, H.G.¹ Pearce, D.²

31
- 0003419545
- Getting started with the DARPA TIMIT CD-ROM: An acoustic phonetic continuous speech database
- J. S. Garofolo, Getting started with the DARPA TIMIT CD-ROM: An acoustic phonetic continuous speech database, NIST Tech Report, 1988.
- (1988) NIST Tech Report
- Garofolo, J.S.¹

32
- 85014384841
- Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs
- Recommendation
- ITU-T, Recommendation P.862, "Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs, " International Telecommunication Union-Telecommunication Standardisation Sector, 2001.
- (2001) International Telecommunication Union-Telecommunication Standardisation Sector
- ITU-T¹

33
- 84890467815
- G. Hu, 100 nonspeech environmental sounds, 2004. http://www.cse.ohio-state.edu/pnl/corpus/HuCorpus.html.
- (2004) 100 Nonspeech Environmental Sounds
- Hu, G.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.