Volume 24, Issue 3, 2016, Pages 483-492

Complex ratio masking for monaural speech separation

Author keywords

Complex ideal ratio mask; Deep neural networks; Speech quality; Speech separation

Indexed keywords

COMPLEX NETWORKS; QUALITY CONTROL; SEPARATION; SOURCE SEPARATION; SPEECH ANALYSIS; SPEECH ENHANCEMENT;

EID: 84962808663     PISSN: 23299290     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASLP.2015.2512042     Document Type: Article
Times cited : (871)

References (35)
  • 1
    • 0020167383
    • The unimportance of phase in speech enhancement
    • Aug
    • D. L. Wang and J. S. Lim, "The unimportance of phase in speech enhancement," IEEE Trans. Acoust. Speech Signal Process., vol. ASSP-30, no. 4, pp. 679-681, Aug. 1982.
    • (1982) IEEE Trans. Acoust. Speech Signal Process , vol.ASSP-30 , Issue.4 , pp. 679-681
    • Wang, D.L.1    Lim, J.S.2
  • 2
    • 0021645331
    • Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator
    • Dec
    • Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator," IEEE Trans. Acoust. Speech Signal Process., vol. ASSP-32, no. 6, pp. 1109-1121, Dec. 1984.
    • (1984) IEEE Trans. Acoust. Speech Signal Process , vol.ASSP-32 , Issue.6 , pp. 1109-1121
    • Ephraim, Y.1    Malah, D.2
  • 3
    • 0019569248
    • The importance of phase in signals
    • May
    • A. V. Oppenheim and J. S. Lim, "The importance of phase in signals," Proc. IEEE, vol. 69, no. 5, pp. 529-541, May 1981.
    • (1981) Proc. IEEE , vol.69 , Issue.5 , pp. 529-541
    • Oppenheim, A.V.1    Lim, J.S.2
  • 4
    • 79952363352
    • The importance of phase in speech enhancement
    • K. Paliwal, K. Wójcicki, and B. Shannon, "The importance of phase in speech enhancement," Speech Commun., vol. 53, pp. 465-494, 2010.
    • (2010) Speech Commun , vol.53 , pp. 465-494
    • Paliwal, K.1    Wójcicki, K.2    Shannon, B.3
  • 5
    • 77949635098
    • Iterative phase estimation for the synthesis of separated sources from single-channel mixtures
    • May
    • D. Gunawan and D. Sen, "Iterative phase estimation for the synthesis of separated sources from single-channel mixtures," IEEE Signal Process. Lett., vol. 17, no. 5, pp. 421-424, May 2010.
    • (2010) IEEE Signal Process. Lett , vol.17 , Issue.5 , pp. 421-424
    • Gunawan, D.1    Sen, D.2
  • 6
    • 84887294721
    • Phase estimation for signal reconstruction in single-channel speech separation
    • P. Mowlaee, R. Saeidi, and R. Martin, "Phase estimation for signal reconstruction in single-channel speech separation," Proc. Interspeech, 2012, pp. 1-4.
    • (2012) Proc. Interspeech , pp. 1-4
    • Mowlaee, P.1    Saeidi, R.2    Martin, R.3
  • 7
    • 84921800494
    • STFT phase reconstruction in voiced speech for an improved single-channel speech enhancement
    • Dec
    • M. Krawczyk and T. Gerkmann, "STFT phase reconstruction in voiced speech for an improved single-channel speech enhancement," IEEE/ACM Trans. Audio Speech Lang. Process., vol. 22, no. 12, pp. 1931-1940, Dec. 2014.
    • (2014) IEEE/ACM Trans. Audio Speech Lang Process , vol.22 , Issue.12 , pp. 1931-1940
    • Krawczyk, M.1    Gerkmann, T.2
  • 8
    • 70349093614
    • An algorithm that improves speech intelligibility in noise for normal-hearing listeners
    • G. Kim, Y. Lu, Y. Hu, and P. Loizou, "An algorithm that improves speech intelligibility in noise for normal-hearing listeners," J. Acoust. Soc. Amer., vol. 126, pp. 1486-1494, 2009.
    • (2009) J. Acoust. Soc. Amer , vol.126 , pp. 1486-1494
    • Kim, G.1    Lu, Y.2    Hu, Y.3    Loizou, P.4
  • 9
    • 84885412715
    • An algorithm to improve speech recognition in noise for hearing-impaired listeners
    • E. W. Healy, S. E. Yoho, Y. Wang, and D. L. Wang, "An algorithm to improve speech recognition in noise for hearing-impaired listeners," J. Acoust. Soc. Amer., vol. 134, pp. 3029-3038, 2013.
    • (2013) J. Acoust. Soc. Amer , vol.134 , pp. 3029-3038
    • Healy, E.W.1    Yoho, S.E.2    Wang, Y.3    Wang, D.L.4
  • 10
    • 84890503044
    • Phase randomization - A new paradigm for single-channel signal enhancement
    • K. Sugiyama and R. Miyahara, "Phase randomization - A new paradigm for single-channel signal enhancement," Proc. ICASSP, 2013, pp. 7487-7491.
    • (2013) Proc. ICASSP , pp. 7487-7491
    • Sugiyama, K.1    Miyahara, R.2
  • 11
  • 12
    • 84946080850
    • Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks
    • H. Erdogan, J. R. Hershey, S. Watanabe, and J. L. Roux, "Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks," Proc. ICASSP, 2015, pp. 708-712.
    • (2015) Proc. ICASSP , pp. 708-712
    • Erdogan, H.1    Hershey, J.R.2    Watanabe, S.3    Roux, J.L.4
  • 13
    • 84946014781
    • A deep neural network for time-domain signal reconstruction
    • Y. Wang and D. L. Wang, "A deep neural network for time-domain signal reconstruction," Proc. ICASSP, 2015, pp. 4390-4394.
    • (2015) Proc. ICASSP , pp. 4390-4394
    • Wang, Y.1    Wang, D.L.2
  • 15
    • 84941336645
    • Estimating nonnegative matrix model activations with deep neural networks to increase perceptual speech quality
    • D. S. Williamson, Y. Wang, and D. L. Wang, "Estimating nonnegative matrix model activations with deep neural networks to increase perceptual speech quality," J. Acoust. Soc. Amer., vol. 138, pp. 1399-1407, 2015.
    • (2015) J. Acoust. Soc. Amer , vol.138 , pp. 1399-1407
    • Williamson, D.S.1    Wang, Y.2    Wang, D.L.3
  • 16
    • 84870477511
    • Exploring monaural features for classification-based speech segregation
    • Feb
    • Y. Wang, K. Han, and D. L. Wang, "Exploring monaural features for classification-based speech segregation," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 2, pp. 270-279, Feb. 2013.
    • (2013) IEEE Trans. Audio, Speech, Lang. Process , vol.21 , Issue.2 , pp. 270-279
    • Wang, Y.1    Han, K.2    Wang, D.L.3
  • 17
    • 84910097441
    • Boosted deep neural networks and multi-resolution cochleagram features for voice activity detection
    • X.-L. Zhang and D. L. Wang, "Boosted deep neural networks and multi-resolution cochleagram features for voice activity detection," Proc. Interspeech, 2014, pp. 1534-1538.
    • (2014) Proc. Interspeech , pp. 1534-1538
    • Zhang, X.-L.1    Wang, D.L.2
  • 18
    • 0031189914
    • Multitask learning
    • R. Caruana, "Multitask learning," Mach. Learn., vol. 28, pp. 41-75, 1997.
    • (1997) Mach. Learn , vol.28 , pp. 41-75
    • Caruana, R.1
  • 19
    • 84862294866
    • Deep sparse rectifier neural networks
    • X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier neural networks," Proc. AISTATS, 2011, vol. 15, pp. 315-323.
    • (2011) Proc. AISTATS , vol.15 , pp. 315-323
    • Glorot, X.1    Bordes, A.2    Bengio, Y.3
  • 20
    • 80052250414
    • Adaptive subgradient methods for online learning and stochastic optimization
    • J. Duchi, E. Hazan, and Y. Singer, "Adaptive subgradient methods for online learning and stochastic optimization," J. Mach. Learn. Res., vol. 12, pp. 2121-2159, 2010.
    • (2010) J. Mach. Learn. Res , vol.12 , pp. 2121-2159
    • Duchi, J.1    Hazan, E.2    Singer, Y.3
  • 21
    • 0014568991
    • IEEE recommended practice for speech quality measurements
    • AE-17
    • "IEEE recommended practice for speech quality measurements," IEEE Trans. Audio Electroacoust., vol. AE-17, pp. 225-246, 1969.
    • (1969) IEEE Trans. Audio Electroacoust , pp. 225-246
  • 24
    • 84921769616
    • A feature study for classification-based speech separation at low signal-to-noise ratios
    • Dec
    • J. Chen, Y. Wang, and D. Wang, "A feature study for classification-based speech separation at low signal-to-noise ratios," IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 22, no. 12, pp. 2112-2121, Dec. 2014.
    • (2014) IEEE/ACM Trans. Audio, Speech, Lang Process , vol.22 , Issue.12 , pp. 2112-2121
    • Chen, J.1    Wang, Y.2    Wang, D.3
  • 25
    • 34547515100
    • Incorporating phase information for source separation via spectrogram factorization
    • R. M. Parry and I. Essa, "Incorporating phase information for source separation via spectrogram factorization," Proc. ICASSP, 2007, pp. 661-664.
    • (2007) Proc. ICASSP , pp. 661-664
    • Parry, R.M.1    Essa, I.2
  • 26
    • 70349380277
    • Complex NMF: A new sparse representation for acoustic signals
    • H. Kameoka, N. Ono, K. Kashino, and S. Sagayama, "Complex NMF: A new sparse representation for acoustic signals," Proc. ICASSP, 2009, pp. 3437-3440.
    • (2009) Proc. ICASSP , pp. 3437-3440
    • Kameoka, H.1    Ono, N.2    Kashino, K.3    Sagayama, S.4
  • 27
    • 80053599734
    • Single-channel source separation using complex matrix factorization
    • Nov
    • B. King and L. Atlas, "Single-channel source separation using complex matrix factorization," IEEE Trans. Audio Speech Lang. Process., vol. 19, no. 8, pp. 2591-2597, Nov. 2011.
    • (2011) IEEE Trans. Audio Speech Lang. Process , vol.19 , Issue.8 , pp. 2591-2597
    • King, B.1    Atlas, L.2
  • 28
    • 0021407831
    • Signal estimation from modified short-time Fourier transform
    • ASSP-32, Apr
    • D. W. Griffin and J. S. Lim, "Signal estimation from modified short-time Fourier transform," IEEE Trans. Acoust. Speech Signal Process., vol. ASSP-32, no. 2, pp. 236-243, Apr. 1984.
    • (1984) IEEE Trans. Acoust. Speech Signal Process , Issue.2 , pp. 236-243
    • Griffin, D.W.1    Lim, J.S.2
  • 30
    • 79960916745
    • An algorithm for intelligibility prediction of time frequency weighted noisy speech
    • Sep
    • C. H. Taal, R. C. Hendriks, R. Heusdens, and J. Jensen, "An algorithm for intelligibility prediction of time frequency weighted noisy speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 7, pp. 2125-2136, Sep. 2011.
    • (2011) IEEE Trans. Audio, Speech, Lang. Process , vol.19 , Issue.7 , pp. 2125-2136
    • Taal, C.H.1    Hendriks, R.C.2    Heusdens, R.3    Jensen, J.4
  • 31
    • 44149106061
    • Evaluation of objective quality measures for speech enhancement
    • Jan
    • Y. Hu and P. C. Loizou, "Evaluation of objective quality measures for speech enhancement," IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 1, pp. 229-238, Jan. 2008.
    • (2008) IEEE Trans. Audio, Speech, Lang. Process , vol.16 , Issue.1 , pp. 229-238
    • Hu, Y.1    Loizou, P.C.2
  • 32
    • 34547645591
    • Effects of noise and distortion on speech quality judgments in normal-hearing and hearing-impaired listeners
    • K. H. Arehart, J. M. Kates, M. C. Anderson, and L. O. Harvey, "Effects of noise and distortion on speech quality judgments in normal-hearing and hearing-impaired listeners," J. Acoust. Soc. Amer., vol. 122, pp. 1150-1164, 2007.
    • (2007) J. Acoust. Soc. Amer , vol.122 , pp. 1150-1164
    • Arehart, K.H.1    Kates, J.M.2    Anderson, M.C.3    Harvey, L.O.4
  • 33
    • 84919905473
    • Ideal time-frequency masking algorithms lead to different speech intelligibility and quality in normal-hearing and cochlear implant listeners
    • Jan
    • R. Koning, N. Madhu, and J. Wouters, "Ideal time-frequency masking algorithms lead to different speech intelligibility and quality in normal-hearing and cochlear implant listeners," IEEE Trans. Biomed. Eng., vol. 62, no. 1, pp. 331-341, Jan. 2015.
    • (2015) IEEE Trans. Biomed. Eng , vol.62 , Issue.1 , pp. 331-341
    • Koning, R.1    Madhu, N.2    Wouters, J.3
  • 34
    • 84905693981
    • Reconstruction techniques for improving the perceptual quality of binary masked speech
    • D. S. Williamson, Y. Wang, and D. L. Wang, "Reconstruction techniques for improving the perceptual quality of binary masked speech," J. Acoust. Soc. Amer., vol. 136, pp. 892-902, 2014.
    • (2014) J. Acoust. Soc. Amer , vol.136 , pp. 892-902
    • Williamson, D.S.1    Wang, Y.2    Wang, D.L.3
  • 35
    • 84933069322
    • On speech quality estimation of phase-aware single-channel speech enhancement
    • A. Gaich and P. Mowlaee, "On speech quality estimation of phase-aware single-channel speech enhancement," Proc. ICASSP, 2015, pp. 216-220.
    • (2015) Proc. ICASSP , pp. 216-220
    • Gaich, A.1    Mowlaee, P.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.