Volume 22, Issue 12, 2014, Pages 1993-2002

A feature study for classification-based speech separation at low signal-to-noise ratios

Author keywords

ARMA filtering; Classification; Multi resolution cochleagram; Speech separation

Indexed keywords

ACOUSTIC NOISE; CLASSIFICATION (OF INFORMATION); SEPARATION; SPEECH; SPEECH ANALYSIS; SPEECH INTELLIGIBILITY; SPEECH RECOGNITION

EID: 84921769616     PISSN: 23299290     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASLP.2014.2359159     Document Type: Article
Times cited : (171)

References (42)
  • 1
    • 84874907305
    • Perceptual learning for speech in noise after application of binary time-frequency masks
    • M. Ahmadi, V. L. Gross, and D. G. Sinex, "Perceptual learning for speech in noise after application of binary time-frequency masks," J. Acoust. Soc. Amer., vol. 133, pp. 1687-1692, 2013.
    • (2013) J. Acoust. Soc. Amer., vol. 133, pp. 1687-1692
    • Ahmadi, M.; Gross, V.L.; Sinex, D.G.
  • 2
    • 84878543263
    • The PASCAL CHiME speech separation and recognition challenge
    • J. Barker, E. Vincent, N. Ma, H. Christensen, and P. Green, "The PASCAL CHiME speech separation and recognition challenge," Comput. Speech Lang., vol. 27, pp. 621-633, 2013.
    • (2013) Comput. Speech Lang., vol. 27, pp. 621-633
    • Barker, J.; Vincent, E.; Ma, N.; Christensen, H.; Green, P.
  • 4
    • 33845354768
    • Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation
    • D. S. Brungart, P. S. Chang, B. D. Simpson, and D. L. Wang, "Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation," J. Acoust. Soc. Amer., vol. 120, pp. 4007-4018, 2006.
    • (2006) J. Acoust. Soc. Amer., vol. 120, pp. 4007-4018
    • Brungart, D.S.; Chang, P.S.; Simpson, B.D.; Wang, D.L.
  • 6
    • 84905233552
    • A feature study for classification-based speech separation at very low signal-to-noise ratio
    • J. Chen, Y. Wang, and D. L. Wang, "A feature study for classification-based speech separation at very low signal-to-noise ratio," in Proc. ICASSP, 2014, pp. 7039-7043.
    • (2014) Proc. ICASSP, pp. 7039-7043
    • Chen, J.; Wang, Y.; Wang, D.L.
  • 7
    • 84863763342
    • A pitch estimation filter robust to high levels of noise (PEFAC)
    • S. Gonzalez and M. Brookes, "A pitch estimation filter robust to high levels of noise (PEFAC)," in Proc. Euro. Sig. Process. Conf., 2011, pp. 451-455.
    • (2011) Proc. Euro. Sig. Process. Conf., pp. 451-455
    • Gonzalez, S.; Brookes, M.
  • 8
    • 84869105129
    • A classification based approach to speech segregation
    • K. Han and D. L. Wang, "A classification based approach to speech segregation," J. Acoust. Soc. Amer., vol. 132, pp. 3475-3483, 2012.
    • (2012) J. Acoust. Soc. Amer., vol. 132, pp. 3475-3483
    • Han, K.; Wang, D.L.
  • 9
    • 84885412715
    • An algorithm to improve speech recognition in noise for hearing-impaired listeners
    • E. W. Healy, S. E. Yoho, Y. Wang, and D. L. Wang, "An algorithm to improve speech recognition in noise for hearing-impaired listeners," J. Acoust. Soc. Amer., vol. 134, pp. 3029-3038, 2013.
    • (2013) J. Acoust. Soc. Amer., vol. 134, pp. 3029-3038
    • Healy, E.W.; Yoho, S.E.; Wang, Y.; Wang, D.L.
  • 10
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," J. Acoust. Soc. Amer., vol. 87, pp. 1738-1752, 1990.
    • (1990) J. Acoust. Soc. Amer., vol. 87, pp. 1738-1752
    • Hermansky, H.
  • 12
    • 38849102154
    • Auditory segmentation based on onset and offset analysis
    • G. Hu and D. L. Wang, "Auditory segmentation based on onset and offset analysis," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 2, pp. 396-405, Feb. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 2, pp. 396-405
    • Hu, G.; Wang, D.L.
  • 13
    • 49249107353
    • Segregation of unvoiced speech from nonspeech interference
    • G. Hu and D. L. Wang, "Segregation of unvoiced speech from nonspeech interference," J. Acoust. Soc. Amer., vol. 124, pp. 1306-1319, 2008.
    • (2008) J. Acoust. Soc. Amer., vol. 124, pp. 1306-1319
    • Hu, G.; Wang, D.L.
  • 14
    • 77955695149
    • A tandem algorithm for pitch estimation and voiced speech segregation
    • G. Hu and D. L. Wang, "A tandem algorithm for pitch estimation and voiced speech segregation," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 8, pp. 2067-2079, Nov. 2010.
    • (2010) IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 8, pp. 2067-2079
    • Hu, G.; Wang, D.L.
  • 15
    • 35248891610
    • A comparative intelligibility study of single-microphone noise reduction algorithms
    • Y. Hu and P. C. Loizou, "A comparative intelligibility study of single-microphone noise reduction algorithms," J. Acoust. Soc. Amer., vol. 122, pp. 1777-1786, 2007.
    • (2007) J. Acoust. Soc. Amer., vol. 122, pp. 1777-1786
    • Hu, Y.; Loizou, P.C.
  • 16
    • 0014568991
    • IEEE recommended practice for speech quality measurements
    • IEEE, "IEEE recommended practice for speech quality measurements," IEEE Trans. Audio Electroacoust., vol. 17, pp. 225-246, 1969.
    • (1969) IEEE Trans. Audio Electroacoust., vol. 17, pp. 225-246
  • 17
    • 0141702077
    • Phase autocorrelation (PAC) derived robust speech features
    • S. Ikbal, H. Misra, and H. Bourlard, "Phase autocorrelation (PAC) derived robust speech features," in Proc. ICASSP, 2003, pp. 133-136.
    • (2003) Proc. ICASSP, pp. 133-136
    • Ikbal, S.; Misra, H.; Bourlard, H.
  • 18
    • 65249103478
    • A supervised learning approach to monaural segregation of reverberant speech
    • Z. Jin and D. L. Wang, "A supervised learning approach to monaural segregation of reverberant speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 4, pp. 625-638, May 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 4, pp. 625-638
    • Jin, Z.; Wang, D.L.
  • 19
    • 84867608537
    • Power-normalized cepstral coefficients (PNCC) for robust speech recognition
    • C. Kim and R. Stern, "Power-normalized cepstral coefficients (PNCC) for robust speech recognition," in Proc. ICASSP, 2012, pp. 4101-4104.
    • (2012) Proc. ICASSP, pp. 4101-4104
    • Kim, C.; Stern, R.
  • 20
    • 79959824321
    • Nonlinear enhancement of onset for robust speech recognition
    • C. Kim and R. M. Stern, "Nonlinear enhancement of onset for robust speech recognition," in Proc. Interspeech, 2010, pp. 2058-2061.
    • (2010) Proc. Interspeech, pp. 2058-2061
    • Kim, C.; Stern, R.M.
  • 21
    • 0032785783
    • Auditory processing of speech signals for robust speech recognition in real-world noisy environments
    • D.-S. Kim, S.-Y. Lee, and R. M. Kil, "Auditory processing of speech signals for robust speech recognition in real-world noisy environments," IEEE Trans. Speech Audio Process., vol. 7, no. 1, pp. 55-69, Jan. 1999.
    • (1999) IEEE Trans. Speech Audio Process., vol. 7, no. 1, pp. 55-69
    • Kim, D.-S.; Lee, S.-Y.; Kil, R.M.
  • 22
    • 70349093614
    • An algorithm that improves speech intelligibility in noise for normal-hearing listeners
    • G. Kim, Y. Lu, Y. Hu, and P. C. Loizou, "An algorithm that improves speech intelligibility in noise for normal-hearing listeners," J. Acoust. Soc. Amer., vol. 126, pp. 1486-1494, 2009.
    • (2009) J. Acoust. Soc. Amer., vol. 126, pp. 1486-1494
    • Kim, G.; Lu, Y.; Hu, Y.; Loizou, P.C.
  • 23
    • 80051618418
    • Delta-spectral cepstral coefficients for robust speech recognition
    • K. Kumar, C. Kim, and R. M. Stern, "Delta-spectral cepstral coefficients for robust speech recognition," in Proc. ICASSP, 2011, pp. 4784-4787.
    • (2011) Proc. ICASSP, pp. 4784-4787
    • Kumar, K.; Kim, C.; Stern, R.M.
  • 24
    • 40749125179
    • Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction
    • N. Li and P. C. Loizou, "Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction," J. Acoust. Soc. Amer., vol. 123, pp. 1673-1682, 2008.
    • (2008) J. Acoust. Soc. Amer., vol. 123, pp. 1673-1682
    • Li, N.; Loizou, P.C.
  • 25
    • 34447100796
    • Speech enhancement: Theory and practice
    • P. C. Loizou, Speech Enhancement: Theory and Practice. Boca Raton, FL, USA: CRC, 2007.
    • (2007) Speech Enhancement: Theory and Practice
    • Loizou, P.C.
  • 26
    • 79959857490
    • An auditory based modulation spectral feature for reverberant speech recognition
    • H. K. Maganti and M. Matassoni, "An auditory based modulation spectral feature for reverberant speech recognition," in Proc. Interspeech, 2010, pp. 570-573.
    • (2010) Proc. Interspeech, pp. 570-573
    • Maganti, H.K.; Matassoni, M.
  • 29
    • 84871829474
    • A multistream feature framework based on bandpass modulation filtering for robust speech recognition
    • S. K. Nemala, K. Patil, and M. Elhilali, "A multistream feature framework based on bandpass modulation filtering for robust speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 2, pp. 416-426, Feb. 2013.
    • (2013) IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 2, pp. 416-426
    • Nemala, S.K.; Patil, K.; Elhilali, M.
  • 31
    • 84863799482
    • Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition
    • M. R. Schädler, B. T. Meyer, and B. Kollmeier, "Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition," J. Acoust. Soc. Amer., vol. 131, pp. 4134-4151, 2012.
    • (2012) J. Acoust. Soc. Amer., vol. 131, pp. 4134-4151
    • Schädler, M.R.; Meyer, B.T.; Kollmeier, B.
  • 32
    • 33750344712
    • Feature extraction from higher lag autocorrelation coefficients for robust speech recognition
    • B. J. Shannon and K. K. Paliwal, "Feature extraction from higher lag autocorrelation coefficients for robust speech recognition," Speech Commun., vol. 48, pp. 1458-1485, 2006.
    • (2006) Speech Commun., vol. 48, pp. 1458-1485
    • Shannon, B.J.; Paliwal, K.K.
  • 33
    • 51449101666
    • Robust speaker identification using auditory features and computational auditory scene analysis
    • Y. Shao and D. L. Wang, "Robust speaker identification using auditory features and computational auditory scene analysis," in Proc. ICASSP, 2008, pp. 1589-1592.
    • (2008) Proc. ICASSP, pp. 1589-1592
    • Shao, Y.; Wang, D.L.
  • 34
    • 0027623210
    • Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems
    • A. Varga and H. J. Steeneken, "Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems," Speech Commun., vol. 12, pp. 247-251, 1993.
    • (1993) Speech Commun., vol. 12, pp. 247-251
    • Varga, A.; Steeneken, H.J.
  • 35
    • 84892233308
    • On ideal binary mask as the computational goal of auditory scene analysis
    • D. L. Wang, "On ideal binary mask as the computational goal of auditory scene analysis," in Speech Separation by Humans and Machines, P. Divenyi, Ed. Boston, MA, USA: Kluwer, 2005, pp. 181-197.
    • (2005) Speech Separation by Humans and Machines, pp. 181-197
    • Wang, D.L.
  • 36
    • 33847030124
    • Computational auditory scene analysis: Principles, algorithms, and applications
    • D. L. Wang and G. J. Brown, Computational Auditory Scene Analysis: Principles, Algorithms, and Applications. Hoboken, NJ, USA: Wiley-IEEE Press, 2006.
    • (2006) Computational Auditory Scene Analysis: Principles, Algorithms, and Applications
    • Wang, D.L.; Brown, G.J.
  • 37
    • 64649103540
    • Speech intelligibility in background noise with ideal binary time-frequency masking
    • D. L. Wang, U. Kjems, M. S. Pedersen, J. B. Boldt, and T. Lunner, "Speech intelligibility in background noise with ideal binary time-frequency masking," J. Acoust. Soc. Amer., vol. 125, pp. 2336-2347, 2009.
    • (2009) J. Acoust. Soc. Amer., vol. 125, pp. 2336-2347
    • Wang, D.L.; Kjems, U.; Pedersen, M.S.; Boldt, J.B.; Lunner, T.
  • 38
    • 84870477511
    • Exploring monaural features for classification-based speech segregation
    • Y. Wang, K. Han, and D. L. Wang, "Exploring monaural features for classification-based speech segregation," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 2, pp. 270-279, Feb. 2013.
    • (2013) IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 2, pp. 270-279
    • Wang, Y.; Han, K.; Wang, D.L.
  • 39
    • 84875678689
    • Towards scaling up classification-based speech separation
    • Y. Wang and D. L. Wang, "Towards scaling up classification-based speech separation," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 7, pp. 1381-1390, Jul. 2013.
    • (2013) IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 7, pp. 1381-1390
    • Wang, Y.; Wang, D.L.
  • 40
    • 0032623471
    • Robust features for noisy speech recognition based on temporal trajectory filtering of short-time autocorrelation sequences
    • K. Yuo and H. Wang, "Robust features for noisy speech recognition based on temporal trajectory filtering of short-time autocorrelation sequences," Speech Commun., vol. 28, pp. 13-24, 1999.
    • (1999) Speech Commun., vol. 28, pp. 13-24
    • Yuo, K.; Wang, H.
  • 41
    • 84910097441
    • Boosted deep neural networks and multiresolution cochleagram features for voice activity detection
    • X.-L. Zhang and D. L. Wang, "Boosted deep neural networks and multiresolution cochleagram features for voice activity detection," in Proc. Interspeech, 2014, pp. 1534-1538.
    • (2014) Proc. Interspeech, pp. 1534-1538
    • Zhang, X.-L.; Wang, D.L.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.