



Volume 21, Issue 1, 2013, Pages 168-177

Towards generalizing classification based speech separation

Author keywords

Generalization; rethresholding; speech separation; support vector machine (SVM)

Indexed keywords

SEPARATION; SIGNAL TO NOISE RATIO; SOURCE SEPARATION; SPEECH ANALYSIS; SPEECH RECOGNITION; SUPERVISED LEARNING;

EID: 84869416544     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2012.2215596     Document Type: Article
Times cited: (32)

References (43)
  • 1
    • 33748523481
    • Determination of the potential benefit of time-frequency gain manipulation
    • DOI 10.1097/01.aud.0000233891.86809.df, PII 0000344620061000000004
    • M. C. Anzalone, L. Calandruccio, K. A. Doherty, and L. H. Carney, "Determination of the potential benefit of time-frequency gain manipulation," Ear Hear., vol. 27, no. 5, pp. 480-492, 2006. (Pubitemid 44371244)
    • (2006) Ear and Hearing , vol.27 , Issue.5 , pp. 480-492
    • Anzalone, M.C.1    Calandruccio, L.2    Doherty, K.A.3    Carney, L.H.4
  • 4
    • 18744396499
    • Training text classifiers with SVM on very few positive examples
    • Tech. Rep. MSR-TR-2003-34
    • J. Brank, M. Grobelnik, N. Milic-Frayling, and D. Mladenic, "Training text classifiers with SVM on very few positive examples," Microsoft Corp., Tech. Rep. MSR-TR-2003-34, 2003.
    • (2003) Microsoft Corp.
    • Brank, J.1    Grobelnik, M.2    Milic-Frayling, N.3    Mladenic, D.4
  • 6
    • 33845354768
    • Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation
    • DOI 10.1121/1.2363929
    • D. S. Brungart, P. S. Chang, B. D. Simpson, and D. L. Wang, "Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation," J. Acoust. Soc. Amer., vol. 120, no. 6, pp. 4007-4018, 2006. (Pubitemid 44888096)
    • (2006) Journal of the Acoustical Society of America , vol.120 , Issue.6 , pp. 4007-4018
    • Brungart, D.S.1    Chang, P.S.2    Simpson, B.D.3    Wang, D.4
  • 8
    • 0021645331
    • Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator
    • Dec.
    • Y. Ephraim and D. Malah, "Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-32, no. 6, pp. 1109-1121, Dec. 1984.
    • (1984) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-32 , Issue.6 , pp. 1109-1121
    • Ephraim, Y.1    Malah, D.2
  • 9
    • 51449104842
    • Minimum mean-square error estimation of discrete Fourier coefficients with generalized gamma priors
    • Aug.
    • J. S. Erkelens, R. C. Hendriks, R. Heusdens, and J. Jensen, "Minimum mean-square error estimation of discrete Fourier coefficients with generalized gamma priors," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 6, pp. 1741-1752, Aug. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.6 , pp. 1741-1752
    • Erkelens, J.S.1    Hendriks, R.C.2    Heusdens, R.3    Jensen, J.4
  • 10
    • 80051643459
    • An SVM based classification approach to speech separation
    • K. Han and D. L. Wang, "An SVM based classification approach to speech separation," in Proc. IEEE ICASSP, 2011, pp. 5212-5215.
    • (2011) Proc. IEEE ICASSP , pp. 5212-5215
    • Han, K.1    Wang, D.L.2
  • 11
    • 78049364397
    • MMSE based noise PSD tracking with low complexity
    • R. C. Hendriks, R. Heusdens, and J. Jensen, "MMSE based noise PSD tracking with low complexity," in Proc. IEEE ICASSP, 2010, pp. 4266-4269.
    • (2010) Proc. IEEE ICASSP , pp. 4266-4269
    • Hendriks, R.C.1    Heusdens, R.2    Jensen, J.3
  • 13
    • 84867209387
    • G. Hu, 100 Nonspeech Sounds, 2006 [Online]. Available: http://www.cse.ohio-state.edu/pnl/corpus/HuCorpus.html
    • (2006) 100 Nonspeech Sounds
    • Hu, G.1
  • 14
    • 4644265990
    • Monaural speech segregation based on pitch tracking and amplitude modulation
    • Sep.
    • G. Hu and D. L. Wang, "Monaural speech segregation based on pitch tracking and amplitude modulation," IEEE Trans. Neural Netw., vol. 15, no. 5, pp. 1135-1150, Sep. 2004.
    • (2004) IEEE Trans. Neural Netw. , vol.15 , Issue.5 , pp. 1135-1150
    • Hu, G.1    Wang, D.L.2
  • 15
    • 38849102154
    • Auditory segmentation based on onset and offset analysis
    • Feb.
    • G. Hu and D. L. Wang, "Auditory segmentation based on onset and offset analysis," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 2, pp. 396-405, Feb. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.2 , pp. 396-405
    • Hu, G.1    Wang, D.L.2
  • 16
    • 85008581724
    • Spectral magnitude minimum mean-square error estimation using binary and continuous gain functions
    • Jan.
    • J. Jensen and R. C. Hendriks, "Spectral magnitude minimum mean-square error estimation using binary and continuous gain functions," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 1, pp. 92-102, Jan. 2012.
    • (2012) IEEE Trans. Audio, Speech, Lang. Process. , vol.20 , Issue.1 , pp. 92-102
    • Jensen, J.1    Hendriks, R.C.2
  • 17
    • 65249103478
    • A supervised learning approach to monaural segregation of reverberant speech
    • May
    • Z. Jin and D. L. Wang, "A supervised learning approach to monaural segregation of reverberant speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 4, pp. 625-638, May 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.4 , pp. 625-638
    • Jin, Z.1    Wang, D.L.2
  • 18
    • 85008056718
    • HMM-based multipitch tracking for noisy and reverberant speech
    • Jul.
    • Z. Jin and D. L. Wang, "HMM-based multipitch tracking for noisy and reverberant speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 5, pp. 1091-1102, Jul. 2011.
    • (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , Issue.5 , pp. 1091-1102
    • Jin, Z.1    Wang, D.L.2
  • 19
    • 77956547397
    • Improving speech intelligibility in noise using environment-optimized algorithms
    • Nov.
    • G. Kim and P. C. Loizou, "Improving speech intelligibility in noise using environment-optimized algorithms," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 8, pp. 2080-2090, Nov. 2010.
    • (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.8 , pp. 2080-2090
    • Kim, G.1    Loizou, P.C.2
  • 20
    • 70349093614
    • An algorithm that improves speech intelligibility in noise for normal-hearing listeners
    • G. Kim, Y. Lu, Y. Hu, and P. C. Loizou, "An algorithm that improves speech intelligibility in noise for normal-hearing listeners," J. Acoust. Soc. Amer., vol. 126, pp. 1486-1494, 2009.
    • (2009) J. Acoust. Soc. Amer. , vol.126 , pp. 1486-1494
    • Kim, G.1    Lu, Y.2    Hu, Y.3    Loizou, P.C.4
  • 21
    • 40749125179
    • Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction
    • DOI 10.1121/1.2832617
    • N. Li and P. C. Loizou, "Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction," J. Acoust. Soc. Amer., vol. 123, pp. 1673-1682, 2008. (Pubitemid 351379593)
    • (2008) Journal of the Acoustical Society of America , vol.123 , Issue.3 , pp. 1673-1682
    • Li, N.1    Loizou, P.C.2
  • 25
    • 51449094735
    • Adaptation of Bayesian models for single-channel source separation and its application to voice/music separation in popular songs
    • Jul.
    • A. Ozerov, P. Philippe, F. Bimbot, and R. Gribonval, "Adaptation of Bayesian models for single-channel source separation and its application to voice/music separation in popular songs," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 5, pp. 1564-1578, Jul. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.5 , pp. 1564-1578
    • Ozerov, A.1    Philippe, P.2    Bimbot, F.3    Gribonval, R.4
  • 26
    • 84897584695
    • A general flexible framework for the handling of prior information in audio source separation
    • May
    • A. Ozerov, E. Vincent, and F. Bimbot, "A general flexible framework for the handling of prior information in audio source separation," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 4, pp. 1118-1133, May 2012.
    • (2012) IEEE Trans. Audio, Speech, Lang. Process. , vol.20 , Issue.4 , pp. 1118-1133
    • Ozerov, A.1    Vincent, E.2    Bimbot, F.3
  • 28
    • 0003243224
    • Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods
    • Cambridge, MA: MIT Press
    • J. C. Platt, "Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods," in Advances in Large Margin Classifiers. Cambridge, MA: MIT Press, 1999, pp. 61-74.
    • (1999) Advances in Large Margin Classifiers , pp. 61-74
    • Platt, J.C.1
  • 29
    • 0142026377
    • Speech segregation based on sound localization
    • DOI 10.1121/1.1610463
    • N. Roman, D. L. Wang, and G. J. Brown, "Speech segregation based on sound localization," J. Acoust. Soc. Amer., vol. 114, no. 4, pp. 2236-2252, 2003. (Pubitemid 37266649)
    • (2003) Journal of the Acoustical Society of America , vol.114 , Issue.4 , pp. 2236-2252
    • Roman, N.1    Wang, D.2    Brown, G.J.3
  • 31
    • 0032166087
    • HMM-based strategies for enhancement of speech signals embedded in nonstationary noise
    • Sep.
    • H. Sameti, H. Sheikhzadeh, L. Deng, and R. L. Brennan, "HMM-based strategies for enhancement of speech signals embedded in nonstationary noise," IEEE Trans. Speech Audio Process., vol. 6, no. 5, pp. 445-455, Sep. 1998.
    • (1998) IEEE Trans. Speech Audio Process. , vol.6 , Issue.5 , pp. 445-455
    • Sameti, H.1    Sheikhzadeh, H.2    Deng, L.3    Brennan, R.L.4
  • 32
    • 4644317224
    • A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition
    • M. L. Seltzer, B. Raj, and R. M. Stern, "A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition," Speech Commun., vol. 43, no. 4, pp. 379-393, 2004.
    • (2004) Speech Commun. , vol.43 , Issue.4 , pp. 379-393
    • Seltzer, M.L.1    Raj, B.2    Stern, R.M.3
  • 33
    • 0032762471
    • A statistical model-based voice activity detection
    • Jan.
    • J. Sohn, N. S. Kim, and W. Sung, "A statistical model-based voice activity detection," IEEE Signal Process. Lett., vol. 6, no. 1, pp. 1-3, Jan. 1999.
    • (1999) IEEE Signal Process. Lett. , vol.6 , Issue.1 , pp. 1-3
    • Sohn, J.1    Kim, N.S.2    Sung, W.3
  • 34
    • 70350565063
    • On strategies for imbalanced text classification using SVM: A comparative study
    • A. Sun, E. P. Lim, and Y. Liu, "On strategies for imbalanced text classification using SVM: A comparative study," Decision Support Syst., vol. 48, no. 1, pp. 191-201, 2009.
    • (2009) Decision Support Syst. , vol.48 , Issue.1 , pp. 191-201
    • Sun, A.1    Lim, E.P.2    Liu, Y.3
  • 35
    • 0038712550
    • SNR estimation based on amplitude modulation analysis with applications to noise suppression
    • Mar.
    • J. Tchorz and B. Kollmeier, "SNR estimation based on amplitude modulation analysis with applications to noise suppression," IEEE Trans. Speech Audio Process., vol. 11, no. 3, pp. 184-192, Mar. 2003.
    • (2003) IEEE Trans. Speech Audio Process. , vol.11 , Issue.3 , pp. 184-192
    • Tchorz, J.1    Kollmeier, B.2
  • 37
    • 84892233308
    • On ideal binary mask as the computational goal of auditory scene analysis
    • P. Divenyi, Ed. Norwell, MA, USA: Kluwer ch. 12
    • D. L. Wang, "On ideal binary mask as the computational goal of auditory scene analysis," in Speech Separation by Humans and Machines, P. Divenyi, Ed. Norwell, MA, USA: Kluwer, 2005, ch. 12, pp. 181-197.
    • (2005) Speech Separation by Humans and Machines , pp. 181-197
    • Wang, D.L.1
  • 39
    • 64649103540
    • Speech intelligibility in background noise with ideal binary time-frequency masking
    • D. L. Wang, U. Kjems, M. S. Pedersen, J. B. Boldt, and T. Lunner, "Speech intelligibility in background noise with ideal binary time-frequency masking," J. Acoust. Soc. Amer., vol. 125, pp. 2336-2347, 2009.
    • (2009) J. Acoust. Soc. Amer. , vol.125 , pp. 2336-2347
    • Wang, D.L.1    Kjems, U.2    Pedersen, M.S.3    Boldt, J.B.4    Lunner, T.5
  • 41
    • 80051659047
    • Combining HMM-based melody extraction and NMF-based soft masking for separating voice and accompaniment from monaural audio
    • Y. Wang and Z. Ou, "Combining HMM-based melody extraction and NMF-based soft masking for separating voice and accompaniment from monaural audio," in Proc. IEEE ICASSP, 2011, pp. 1-4.
    • (2011) Proc. IEEE ICASSP , pp. 1-4
    • Wang, Y.1    Ou, Z.2
  • 42
    • 48149090146
    • Estimating single-channel source separation masks: Relevance vector machine classifiers vs. pitch-based masking
    • R. J. Weiss and D. P. W. Ellis, "Estimating single-channel source separation masks: Relevance vector machine classifiers vs. pitch-based masking," in Proc. Workshop Statist. Percept. Audition, 2006, pp. 31-36.
    • (2006) Proc. Workshop Statist. Percept. Audition , pp. 31-36
    • Weiss, R.J.1    Ellis, D.P.W.2
  • 43
    • 51449116166
    • HMM-based gain modeling for enhancement of speech in noise
    • Mar.
    • D. Y. Zhao and W. B. Kleijn, "HMM-based gain modeling for enhancement of speech in noise," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 3, pp. 882-892, Mar. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.3 , pp. 882-892
    • Zhao, D.Y.1    Kleijn, W.B.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.