SCOPUS Information Search Platform

IEEE/ACM Transactions on Audio Speech and Language Processing

Volume 24, Issue 3, 2016, Pages 483-492

Complex ratio masking for monaural speech separation

Authors (3): Williamson, Donald S. (a); Wang, Yuxuan (a,b); Wang, DeLiang (a)

a The Ohio State University (United States)

b GOOGLE INC (United States)

Author keywords

Complex ideal ratio mask; Deep neural networks; Speech quality; Speech separation

Indexed keywords

COMPLEX NETWORKS; QUALITY CONTROL; SEPARATION; SOURCE SEPARATION; SPEECH ANALYSIS; SPEECH ENHANCEMENT;

DEEP NEURAL NETWORKS; IDEAL RATIO MASK; MAGNITUDE SPECTRUM; PERCEPTUAL EVALUATION OF SPEECH QUALITIES; PERCEPTUAL QUALITY; SHORT TIME FOURIER TRANSFORMS; SPEECH QUALITY; SPEECH SEPARATION;

SPEECH;

EID: 84962808663 PISSN: 23299290 EISSN: None Source Type: Journal
DOI: 10.1109/TASLP.2015.2512042 Document Type: Article

Times cited : (871)

References (35)

1
- 0020167383
- The unimportance of phase in speech enhancement
- Aug
- D. L. Wang, and J. S. Lim, "The unimportance of phase in speech enhancement," IEEE Trans. Acoust. Speech Signal Process., ASSP-30, no. 4, pp. 679-681, Aug. 1982.
- (1982) IEEE Trans. Acoust. Speech Signal Process , vol.ASSP-30 , Issue.4 , pp. 679-681
- Wang, D.L.¹ Lim, J.S.²

2
- 0021645331
- Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator
- Dec
- Y. Ephraim, and D. Malah, "Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator," IEEE Trans. Acoust. Speech Signal Process., ASSP-32, no. 6, pp. 1109-1121, Dec. 1984.
- (1984) IEEE Trans. Acoust. Speech Signal Process , vol.ASSP-32 , Issue.6 , pp. 1109-1121
- Ephraim, Y.¹ Malah, D.²

3
- 0019569248
- The importance of phase in signals
- May
- A. V. Oppenheim, J. S. Lim, "The importance of phase in signals," Proc. IEEE, vol. 69, no. 5, pp. 529-541, May 1981.
- (1981) Proc. IEEE , vol.69 , Issue.5 , pp. 529-541
- Oppenheim, A.V.¹ Lim, J.S.²

4
- 79952363352
- The importance of phase in speech enhancement
- K. Paliwal, K. Wójcicki, and B. Shannon, "The importance of phase in speech enhancement," Speech Commun., vol. 53, pp. 465-494, 2010.
- (2010) Speech Commun , vol.53 , pp. 465-494
- Paliwal, K.¹ Wójcicki, K.² Shannon, B.³

5
- 77949635098
- Iterative phase estimation for the synthesis of separated sources from single-channel mixtures
- May
- D. Gunawan, and D. Sen, "Iterative phase estimation for the synthesis of separated sources from single-channel mixtures," IEEE Signal Process. Lett., vol. 17, no. 5, pp. 421-424, May 2010.
- (2010) IEEE Signal Process. Lett , vol.17 , Issue.5 , pp. 421-424
- Gunawan, D.¹ Sen, D.²

6
- 84887294721
- Phase estimation for signal reconstruction in single-channel speech separation
- P. Mowlaee, R. Saeidi, and R. Martin, "Phase estimation for signal reconstruction in single-channel speech separation," Proc. Interspeech, 2012, pp. 1-4.
- (2012) Proc. Interspeech , pp. 1-4
- Mowlaee, P.¹ Saeidi, R.² Martin, R.³

7
- 84921800494
- STFT phase reconstruction in voiced speech for an improved single-channel speech enhancement
- Dec
- M. Krawczyk, and T. Gerkmann, "STFT phase reconstruction in voiced speech for an improved single-channel speech enhancement," IEEE/ACM Trans. Audio Speech Lang Process., vol. 22, no. 12, pp. 1931-1940, Dec. 2014.
- (2014) IEEE/ACM Trans. Audio Speech Lang Process , vol.22 , Issue.12 , pp. 1931-1940
- Krawczyk, M.¹ Gerkmann, T.²

8
- 70349093614
- An algorithm that improves speech intelligibility in noise for normal-hearing listeners
- G. Kim, Y. Lu, Y. Hu, and P. Loizou, "An algorithm that improves speech intelligibility in noise for normal-hearing listeners," J. Acoust. Soc. Amer., vol. 126, pp. 1486-1494, 2009.
- (2009) J. Acoust. Soc. Amer , vol.126 , pp. 1486-1494
- Kim, G.¹ Lu, Y.² Hu, Y.³ Loizou, P.⁴

9
- 84885412715
- An algorithm to improve speech recognition in noise for hearing-impaired listeners
- E. W. Healy, S. E. Yoho, Y. Wang, and D. L. Wang, "An algorithm to improve speech recognition in noise for hearing-impaired listeners," J. Acoust. Soc. Amer., vol. 134, pp. 3029-3038, 2013.
- (2013) J. Acoust. Soc. Amer , vol.134 , pp. 3029-3038
- Healy, E.W.¹ Yoho, S.E.² Wang, Y.³ Wang, D.L.⁴

10
- 84890503044
- Phase randomization - A new paradigm for single-channel signal enhancement
- K. Sugiyama, and R. Miyahara, "Phase randomization - A new paradigm for single-channel signal enhancement," Proc. ICASSP, 2013, pp. 7487-7491.
- (2013) Proc. ICASSP , pp. 7487-7491
- Sugiyama, K.¹ Miyahara, R.²

11
- 84921740463
- On training targets for supervised speech separation
- Dec
- Y. Wang, A. Narayanan, and D. L. Wang, "On training targets for supervised speech separation," IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 22, no. 12, pp. 1849-1858, Dec. 2014.
- (2014) IEEE/ACM Trans. Audio, Speech, Lang. Process , vol.22 , Issue.12 , pp. 1849-1858
- Wang, Y.¹ Narayanan, A.² Wang, D.L.³

12
- 84946080850
- Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks
- H. Erdogan, J. R. Hershey, S. Watanabe, and J. L. Roux, "Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks," Proc. ICASSP, 2015, pp. 708-712.
- (2015) Proc. ICASSP , pp. 708-712
- Erdogan, H.¹ Hershey, J.R.² Watanabe, S.³ Roux, J.L.⁴

13
- 84946014781
- A deep neural network for time-domain signal reconstruction
- Y. Wang, and D. L. Wang, "A deep neural network for time-domain signal reconstruction," Proc. ICASSP, 2015, pp. 4390-4394.
- (2015) Proc. ICASSP , pp. 4390-4394
- Wang, Y.¹ Wang, D.L.²

14
- 34447100796
- Boca Raton, FL, USA: CRC
- P. C. Loizou, Speech Enhancement: Theory and Practice, Boca Raton, FL, USA: CRC, 2007.
- (2007) Speech Enhancement: Theory and Practice
- Loizou, P.C.¹

15
- 84941336645
- Estimating nonnegative matrix model activations with deep neural networks to increase perceptual speech quality
- D. S. Williamson, Y. Wang, and D. L. Wang, "Estimating nonnegative matrix model activations with deep neural networks to increase perceptual speech quality," J. Acoust. Soc. Amer., vol. 138, pp. 1399-1407, 2015.
- (2015) J. Acoust. Soc. Amer , vol.138 , pp. 1399-1407
- Williamson, D.S.¹ Wang, Y.² Wang, D.L.³

16
- 84870477511
- Exploring monaural features for classification-based speech segregation
- Feb
- Y. Wang, K. Han, and D. L. Wang, "Exploring monaural features for classification-based speech segregation," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 2, pp. 270-279, Feb. 2013.
- (2013) IEEE Trans. Audio, Speech, Lang. Process , vol.21 , Issue.2 , pp. 270-279
- Wang, Y.¹ Han, K.² Wang, D.L.³

17
- 84910097441
- Boosted deep neural networks and multi-resolution cochleagram features for voice activity detection
- X.-L. Zhang, and D. L. Wang, "Boosted deep neural networks and multi-resolution cochleagram features for voice activity detection," Proc. Interspeech, 2014, pp. 1534-1538.
- (2014) Proc. Interspeech , pp. 1534-1538
- Zhang, X.-L.¹ Wang, D.L.²

18
- 0031189914
- Multitask learning
- R. Caruana, "Multitask learning," Mach. Learn., vol. 28, pp. 41-75, 1997.
- (1997) Mach. Learn , vol.28 , pp. 41-75
- Caruana, R.¹

19
- 84862294866
- Deep sparse rectifier neural networks
- X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier neural networks," Proc. AISTATS, 2011, vol. 15, pp. 315-323.
- (2011) Proc. AISTATS , vol.15 , pp. 315-323
- Glorot, X.¹ Bordes, A.² Bengio, Y.³

20
- 80052250414
- Adaptive subgradient methods for online learning and stochastic optimization
- J. Duchi, E. Hazan, and Y. Singer, "Adaptive subgradient methods for online learning and stochastic optimization," J. Mach. Learn. Res., vol. 12, pp. 2121-2159, 2010.
- (2010) J. Mach. Learn. Res , vol.12 , pp. 2121-2159
- Duchi, J.¹ Hazan, E.² Singer, Y.³

21
- 0014568991
- IEEE recommended practice for speech quality measurements
- AE-17
- "IEEE recommended practice for speech quality measurements," IEEE Trans. Audio Electroacoust., AE-17, pp. 225-246, 1969.
- (1969) IEEE Trans. Audio Electroacoust , pp. 225-246

22
- 0003548585
- J. S. Garofolo, L. F. Lamel, W. M. Fisher, J. G. Fiscus, D. S. Pallett, and N. L. Dahlgren, "DARPA TIMIT acoustic phonetic continuous speech corpus," 1993, http://www.ldc.upenn.edu/Catalog/LDC93S1.html.
- (1993) DARPA TIMIT Acoustic Phonetic Continuous Speech Corpus
- Garofolo, J.S.¹ Lamel, L.F.² Fisher, W.M.³ Fiscus, J.G.⁴ Pallett, D.S.⁵ Dahlgren, N.L.⁶

23
- 42549139762
- MVA processing of speech features
- Jan
- C. Chen, and J. A. Bilmes, "MVA processing of speech features," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 1, pp. 257-270, Jan. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , Issue.1 , pp. 257-270
- Chen, C.¹ Bilmes, J.A.²

24
- 84921769616
- A feature study for classification-based speech separation at low signal-to-noise ratios
- Dec
- J. Chen, Y. Wang, and D. Wang, "A feature study for classification-based speech separation at low signal-to-noise ratios," IEEE/ACM Trans. Audio, Speech, Lang Process., vol. 22, no. 12, pp. 2112-2121, Dec. 2014.
- (2014) IEEE/ACM Trans. Audio, Speech, Lang Process , vol.22 , Issue.12 , pp. 2112-2121
- Chen, J.¹ Wang, Y.² Wang, D.³

25
- 34547515100
- Incorporating phase information for source separation via spectrogram factorization
- R. M. Parry, and I. Essa, "Incorporating phase information for source separation via spectrogram factorization," Proc. ICASSP, 2007, pp. 661-664.
- (2007) Proc. ICASSP , pp. 661-664
- Parry, R.M.¹ Essa, I.²

26
- 70349380277
- Complex NMF: A new sparse representation for acoustic signals
- H. Kameoka, N. Ono, K. Kashino, and S. Sagayama, "Complex NMF: A new sparse representation for acoustic signals," Proc. ICASSP, 2009, pp. 3437-3440.
- (2009) Proc. ICASSP , pp. 3437-3440
- Kameoka, H.¹ Ono, N.² Kashino, K.³ Sagayama, S.⁴

27
- 80053599734
- Single-channel source separation using complex matrix factorization
- Nov
- B. King, and L. Atlas, "Single-channel source separation using complex matrix factorization," IEEE Trans. Audio Speech Lang. Process., vol. 19, no. 8, pp. 2591-2597, Nov. 2011.
- (2011) IEEE Trans. Audio Speech Lang. Process , vol.19 , Issue.8 , pp. 2591-2597
- King, B.¹ Atlas, L.²

28
- 0021407831
- Signal estimation from modified short-time Fourier transform
- ASSP-32, Apr
- D. W. Griffin, and J. S. Lim, "Signal estimation from modified short-time Fourier transform," IEEE Trans. Acoust. Speech Signal Process., ASSP-32, no. 2, pp. 236-243, Apr. 1984.
- (1984) IEEE Trans. Acoust. Speech Signal Process , Issue.2 , pp. 236-243
- Griffin, D.W.¹ Lim, J.S.²

29
- 0003639435
- Perceptual evaluation of speech quality (PESQ)
- ITU-R 862
- Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrowband telephone networks and speech codecs, ITU-R 862, 2001.
- (2001) An Objective Method for End-to-end Speech Quality Assessment of Narrowband Telephone Networks and Speech Codecs

30
- 79960916745
- An algorithm for intelligibility prediction of time frequency weighted noisy speech
- Sep
- C. H. Taal, R. C. Hendriks, R. Heusdens, and J. Jensen, "An algorithm for intelligibility prediction of time frequency weighted noisy speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 7, pp. 2125-2136, Sep. 2011.
- (2011) IEEE Trans. Audio, Speech, Lang. Process , vol.19 , Issue.7 , pp. 2125-2136
- Taal, C.H.¹ Hendriks, R.C.² Heusdens, R.³ Jensen, J.⁴

31
- 44149106061
- Evaluation of objective quality measures for speech enhancement
- Jan
- Y. Hu, and P. C. Loizou, "Evaluation of objective quality measures for speech enhancement," IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 1, pp. 229-238, Jan. 2008.
- (2008) IEEE Trans. Audio, Speech, Lang. Process , vol.16 , Issue.1 , pp. 229-238
- Hu, Y.¹ Loizou, P.C.²

32
- 34547645591
- Effects of noise and distortion on speech quality judgments in normal-hearing and hearing-impaired listeners
- K. H. Arehart, J. M. Kates, M. C. Anderson, and L. O. Harvey, "Effects of noise and distortion on speech quality judgments in normal-hearing and hearing-impaired listeners," J. Acoust. Soc. Amer., vol. 122, pp. 1150-1164, 2007.
- (2007) J. Acoust. Soc. Amer , vol.122 , pp. 1150-1164
- Arehart, K.H.¹ Kates, J.M.² Anderson, M.C.³ Harvey, L.O.⁴

33
- 84919905473
- Ideal time-frequency masking algorithms lead to different speech intelligibility and quality in normal-hearing and cochlear implant listeners
- Jan
- R. Koning, N. Madhu, and J. Wouters, "Ideal time-frequency masking algorithms lead to different speech intelligibility and quality in normal-hearing and cochlear implant listeners," IEEE Trans. Biomed. Eng., vol. 62, no. 1, pp. 331-341, Jan. 2015.
- (2015) IEEE Trans. Biomed. Eng , vol.62 , Issue.1 , pp. 331-341
- Koning, R.¹ Madhu, N.² Wouters, J.³

34
- 84905693981
- Reconstruction techniques for improving the perceptual quality of binary masked speech
- D. S. Williamson, Y. Wang, and D. L. Wang, "Reconstruction techniques for improving the perceptual quality of binary masked speech," J. Acoust. Soc. Amer., vol. 136, pp. 892-902, 2014.
- (2014) J. Acoust. Soc. Amer , vol.136 , pp. 892-902
- Williamson, D.S.¹ Wang, Y.² Wang, D.L.³

35
- 84933069322
- On speech quality estimation of phase-aware single-channel speech enhancement
- A. Gaich, and P. Mowlaee, "On speech quality estimation of phase-aware single-channel speech enhancement," Proc. ICASSP, 2015, pp. 216-220.
- (2015) Proc. ICASSP , pp. 216-220
- Gaich, A.¹ Mowlaee, P.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.