



Volume 61, Issue 1, 2010, Pages 21-37

Monaural speech separation based on gain adapted minimum mean square error estimation

Author keywords

Gain adaptation; Minimum mean square error estimation; Mixmax approximation; Model based monaural speech separation; Source separation

Indexed keywords

GAIN ADAPTATION; MINIMUM MEAN SQUARE ERROR ESTIMATIONS; MIXMAX APPROXIMATION; MODEL-BASED MONAURAL SPEECH SEPARATION; SOURCE SEPARATION;

EID: 80052817343     PISSN: 19398018     EISSN: 19398115     Source Type: Journal    
DOI: 10.1007/s11265-008-0274-7     Document Type: Conference Paper
Times cited : (9)

References (63)
  • 5
    • 85009230793
    • Factorial models and refiltering for speech separation and denoising
    • May
    • Roweis, S. T. (2003). Factorial models and refiltering for speech separation and denoising. In EUROSPEECH-03 (Vol. 7, pp. 1009-1012), May.
    • (2003) EUROSPEECH-03 , vol.7 , pp. 1009-1012
    • Roweis, S.T.1
  • 12
    • 56249144712
    • Soft mask methods for single-channel speaker separation
    • Aug.
    • Reddy, A. M., & Raj, B. (2007). Soft mask methods for single-channel speaker separation. IEEE Transactions on Audio, Speech, and Language Processing, 15(6), 1766-1776, Aug.
    • (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , Issue.6 , pp. 1766-1776
    • Reddy, A.M.1    Raj, B.2
  • 13
    • 48149090146
    • Estimating single-channel source separation masks: Relevance vector machine classifiers vs. pitch-based masking
    • Oct.
    • Weiss, R., & Ellis, D. (2006). Estimating single-channel source separation masks: Relevance vector machine classifiers vs. pitch-based masking. In Proc. workshop on statistical and perceptual audition SAPA-06 (pp. 31-36), Oct.
    • (2006) Proc. Workshop on Statistical and Perceptual Audition SAPA-06 , pp. 31-36
    • Weiss, R.1    Ellis, D.2
  • 14
    • 4544386386
    • Low complexity Bayesian single channel source separation
    • May.
    • Beierholm, T., Pedersen, B. D., & Winther, O. (2004). Low complexity Bayesian single channel source separation. In Proc. ICASSP-04 (Vol. 5, pp. 529-532), May.
    • (2004) Proc. ICASSP-04 , vol.5 , pp. 529-532
    • Beierholm, T.1    Pedersen, B.D.2    Winther, O.3
  • 15
    • 4644257621
    • Single microphone source separation using high resolution signal reconstruction
    • May.
    • Kristjansson, T., Attias, T. H., & Hershey, J. (2004). Single microphone source separation using high resolution signal reconstruction. In Proc. ICASSP-04 (pp. 817-820), May.
    • (2004) Proc. ICASSP-04 , pp. 817-820
    • Kristjansson, T.1    Attias, T.H.2    Hershey, J.3
  • 16
    • 33845940172
    • A maximum likelihood estimation of vocal-tract-related filter characteristics for single channel speech separation
    • doi: 10.1155/2007/84186
    • Radfar, M. H., Dansereau, R. M., & Sayadiyan, A. (2007). A maximum likelihood estimation of vocal-tract-related filter characteristics for single channel speech separation. EURASIP Journal on Audio, Speech, and Music Processing, 2007, 15, 84186. doi: 10.1155/2007/84186.
    • (2007) EURASIP Journal on Audio, Speech, and Music Processing, 2007 , vol.15 , pp. 84186
    • Radfar, M.H.1    Dansereau, R.M.2    Sayadiyan, A.3
  • 17
    • 34250023466
    • Monaural speech segregation based on fusion of source-driven with model-driven techniques
    • DOI 10.1016/j.specom.2007.04.007, PII S0167639307000623
    • Radfar, M. H., Dansereau, R. M., & Sayadiyan, A. (2007). Monaural speech segregation based on fusion of source-driven with model-driven techniques. Speech Communication, 49(6), 464-476, June. (Pubitemid 46891624)
    • (2007) Speech Communication , vol.49 , Issue.6 , pp. 464-476
    • Radfar, M.H.1    Dansereau, R.M.2    Sayadiyan, A.3
  • 18
    • 4544247508
    • Multiband audio modeling for single channel acoustic source separation
    • May.
    • Reyes-Gomez, M. J., Ellis, D., & Jojic, N. (2004). Multiband audio modeling for single channel acoustic source separation. In Proc. ICASSP-04 (Vol. 5, pp. 641-644), May.
    • (2004) Proc. ICASSP-04 , vol.5 , pp. 641-644
    • Reyes-Gomez, M.J.1    Ellis, D.2    Jojic, N.3
  • 19
    • 85009074940
    • A minimum mean squared error estimator for single channel speaker separation
    • Oct.
    • Reddy, A. M., & Raj, B. (2004). A minimum mean squared error estimator for single channel speaker separation. In INTERSPEECH-2004 (pp. 2445-2448), Oct.
    • (2004) INTERSPEECH-2004 , pp. 2445-2448
    • Reddy, A.M.1    Raj, B.2
  • 20
    • 33644639591
    • Separation of speech by computational auditory scene analysis
    • New York: Springer
    • Brown, G. J., & Wang, D. L. (2005). Separation of speech by computational auditory scene analysis. In Speech enhancement (pp. 371-402). New York: Springer.
    • (2005) Speech Enhancement , pp. 371-402
    • Brown, G.J.1    Wang, D.L.2
  • 22
    • 0035478859
    • The auditory organization of speech and other sources in listeners and computational models
    • DOI 10.1016/S0167-6393(00)00078-9, PII S0167639300000789
    • Cooke, M., & Ellis, D. P. W. (2001). The auditory organization of speech and other sources in listeners and computational models. Speech Communication, 35(3), 141-177, October. (Pubitemid 32922990)
    • (2001) Speech Communication , vol.35 , Issue.3-4 , pp. 141-177
    • Cooke, M.1    Ellis, D.P.W.2
  • 23
    • 0032626792
    • Using knowledge to organize sound: The prediction-driven approach to computational auditory scene analysis and its application to speech/nonspeech mixtures
    • April.
    • Ellis, D. P. W. (1999). Using knowledge to organize sound: The prediction-driven approach to computational auditory scene analysis and its application to speech/nonspeech mixtures. Speech Communication, 27(3), 281-298, April.
    • (1999) Speech Communication , vol.27 , Issue.3 , pp. 281-298
    • Ellis, D.P.W.1
  • 24
    • 0032630841
    • Harmonic sound stream segregation using localization and its application to speech stream segregation
    • April.
    • Nakatani, T., & Okuno, H. G. (1999). Harmonic sound stream segregation using localization and its application to speech stream segregation. Speech Communication, 27(3), 209-222, April.
    • (1999) Speech Communication , vol.27 , Issue.3 , pp. 209-222
    • Nakatani, T.1    Okuno, H.G.2
  • 25
    • 33644639591
    • Separation of speech by computational auditory scene analysis
    • J. Benesty, S. Makino, & J. Chen (Eds.), New York: Springer
    • Brown, G. J., & Wang, D. L. (2005). Separation of speech by computational auditory scene analysis. In J. Benesty, S. Makino, & J. Chen (Eds.), Speech enhancement (pp. 371-402). New York: Springer.
    • (2005) Speech Enhancement , pp. 371-402
    • Brown, G.J.1    Wang, D.L.2
  • 26
    • 0001698589
    • Auditory grouping
    • B. C. J. Moore (Ed.), chapter Hearing, London: Academic
    • Darwin, C. J., & Carlyon, R. P. (1995). Auditory grouping. In B. C. J. Moore (Ed.), The handbook of perception and cognition (Vol. 6, chapter Hearing, pp. 387-424). London: Academic.
    • (1995) The Handbook of Perception and Cognition , vol.6 , pp. 387-424
    • Darwin, C.J.1    Carlyon, R.P.2
  • 27
    • 0032682770
    • Separation of speech from interfering sounds based on oscillatory correlation
    • May.
    • Wang, D. L., & Brown, G. J. (1999). Separation of speech from interfering sounds based on oscillatory correlation. IEEE Transactions on Neural Networks, 10, 684-697, May.
    • (1999) IEEE Transactions on Neural Networks , vol.10 , pp. 684-697
    • Wang, D.L.1    Brown, G.J.2
  • 28
    • 4644265990
    • Monaural speech segregation based on pitch tracking and amplitude modulation
    • Sept.
    • Hu, G., & Wang, D. L. (2004). Monaural speech segregation based on pitch tracking and amplitude modulation. IEEE Transactions on Neural Networks, 15(5), 1135-1150, Sept.
    • (2004) IEEE Transactions on Neural Networks , vol.15 , Issue.5 , pp. 1135-1150
    • Hu, G.1    Wang, D.L.2
  • 30
    • 31344466301
    • Underdetermined blind source separation based on sparse representation
    • DOI 10.1109/TSP.2005.861743
    • Li, Y., Amari, S., Cichocki, A., Ho, D. W. C., & Xie, S. (2006). Underdetermined blind source separation based on sparse representation. IEEE Transactions on Signal Processing, 54(2), 423-437, Feb. (Pubitemid 43141892)
    • (2006) IEEE Transactions on Signal Processing , vol.54 , Issue.2 , pp. 423-437
    • Li, Y.1    Amari, S.-I.2    Cichocki, A.3    Ho, D.W.C.4    Xie, S.5
  • 31
    • 31344458539
    • Median-based clustering for underdetermined blind signal processing
    • DOI 10.1109/LSP.2005.861590
    • Theis, F. J., Puntonet, C. G., & Lang, E. W. (2006). Median-based clustering for underdetermined blind signal processing. IEEE Signal Processing Letters, 13(2), 96-99, Feb. (Pubitemid 43137981)
    • (2006) IEEE Signal Processing Letters , vol.13 , Issue.2 , pp. 96-99
    • Theis, F.J.1    Puntonet, C.G.2    Lang, E.W.3
  • 32
    • 0035501128
    • Underdetermined blind source separation using sparse representations
    • DOI 10.1016/S0165-1684(01)00120-7, PII S0165168401001207
    • Bofill, P., & Zibulevsky, M. (2001). Underdetermined blind source separation using sparse representations. Signal Processing, 81, 2353-2362. (Pubitemid 32943462)
    • (2001) Signal Processing , vol.81 , Issue.11 , pp. 2353-2362
    • Bofill, P.1    Zibulevsky, M.2
  • 33
    • 0026191274
    • Blind separation of sources, part I. An adaptive algorithm based on neuromimetic architecture
    • DOI 10.1016/0165-1684(91)90079-X
    • Jutten, C., & Herault, J. (1991). Blind separation of sources, Part I: An adaptive algorithm based on neuromimetic architecture. Signal Processing, 24, 1-10. (Pubitemid 21679270)
    • (1991) Signal Processing , vol.24 , Issue.1 , pp. 1-10
    • Jutten, C.1    Herault, J.2
  • 34
    • 0028416938
    • Independent component analysis, a new concept?
    • Comon, P. (1994). Independent component analysis, a new concept? Signal Processing, 36, 287-314.
    • (1994) Signal Processing , vol.36 , pp. 287-314
    • Comon, P.1
  • 35
    • 0029411030
    • An information-maximization approach to blind separation and blind deconvolution
    • Bell, A. J., & Sejnowski, T. J. (1995). An information-maximization approach to blind separation and blind deconvolution. Neural Computation, 7, 1129-1159.
    • (1995) Neural Computation , vol.7 , pp. 1129-1159
    • Bell, A.J.1    Sejnowski, T.J.2
  • 36
    • 0031271273
    • Blind source separation: semiparametric statistical approach
    • Amari, S. I., & Cardoso, J. F. (1997). Blind source separation: semiparametric statistical approach. IEEE Transactions on Signal Processing, 45(11), 2692-2700.
    • (1997) IEEE Transactions on Signal Processing , vol.45 , Issue.11 , pp. 2692-2700
    • Amari, S.I.1    Cardoso, J.F.2
  • 37
    • 33744550968
    • Exploitation of source nonstationarity in underdetermined blind source separation with advanced clustering techniques
    • DOI 10.1109/TSP.2006.873367
    • Luo, Y., Wang, W., Chambers, J. A., Lambotharan, S., & Proudler, I. K. (2006). Exploitation of source non-stationarity in underdetermined blind source separation with advanced clustering techniques. IEEE Transactions on Signal Processing, 54(6), 2198-2212, June. (Pubitemid 43811413)
    • (2006) IEEE Transactions on Signal Processing , vol.54 , Issue.6 , pp. 2198-2212
    • Luo, Y.1    Wang, W.2    Chambers, J.A.3    Lambotharan, S.4    Proudler, I.5
  • 38
    • 84898976425
    • Learning nonlinear overcomplete representations for efficient coding
    • M. I. Jordan, M. J. Kearns, & S. A. Solla (Eds.), Cambridge: MIT
    • Lewicki, M. S., & Sejnowski, T. J. (1998). Learning nonlinear overcomplete representations for efficient coding. In M. I. Jordan, M. J. Kearns, & S. A. Solla (Eds.), Advances in neural information processing systems (Vol. 10). Cambridge: MIT.
    • (1998) Advances in Neural Information Processing Systems , vol.10
    • Lewicki, M.S.1    Sejnowski, T.J.2
  • 40
    • 85159599773
    • Sound source separation using sparse coding with temporal continuity objective
    • Virtanen, T. (2003). Sound source separation using sparse coding with temporal continuity objective. In Proc. Int. Comput. Music Conference (pp. 231-234).
    • (2003) Proc. Int. Comput. Music Conference , pp. 231-234
    • Virtanen, T.1
  • 42
    • 34249994486
    • A non-linear minimum mean square error estimator for the mixture-maximization approximation
    • June.
    • Radfar, M. H., Banihashemi, A. H., Dansereau, R. M., & Sayadiyan, A. (2006). A non-linear minimum mean square error estimator for the mixture-maximization approximation. Electronics Letters, 42(12), 75-76, June.
    • (2006) Electronics Letters , vol.42 , Issue.12 , pp. 75-76
    • Radfar, M.H.1    Banihashemi, A.H.2    Dansereau, R.M.3    Sayadiyan, A.4
  • 43
    • 0026881830
    • Gain-adapted hidden Markov models for recognition of clean and noisy speech
    • Jun.
    • Ephraim, Y. (1992). Gain-adapted hidden Markov models for recognition of clean and noisy speech. IEEE Transactions on Audio, Speech and Language Processing, 40(6), 1303-1316, Jun.
    • (1992) IEEE Transactions on Audio, Speech and Language Processing , vol.40 , Issue.6 , pp. 1303-1316
    • Ephraim, Y.1
  • 46
    • 0017004953
    • Separation of speech from interfering speech by means of harmonic selection
    • Aug.
    • Parsons, T. W. (1976). Separation of speech from interfering speech by means of harmonic selection. Journal of the Acoustical Society of America, 60, 911-918, Aug.
    • (1976) Journal of the Acoustical Society of America , vol.60 , pp. 911-918
    • Parsons, T.W.1
  • 47
    • 84963932580
    • Multipitch trajectory estimation of concurrent speech based on harmonic GMM and nonlinear Kalman filtering
    • Oct.
    • Kameoka, H., Nishimoto, T., & Sagayama, S. (2004). Multipitch trajectory estimation of concurrent speech based on harmonic GMM and nonlinear Kalman filtering. In INTERSPEECH-2004 (Vol. 1, pp. 2433-2436), Oct.
    • (2004) INTERSPEECH-2004 , vol.1 , pp. 2433-2436
    • Kameoka, H.1    Nishimoto, T.2    Sagayama, S.3
  • 48
    • 0032663192
    • Multiple period estimation and pitch perception model
    • April.
    • De Cheveigné, A., & Kawahara, H. (1999). Multiple period estimation and pitch perception model. Speech Communication, 27, 175-185, April.
    • (1999) Speech Communication , vol.27 , pp. 175-185
    • De Cheveigné, A.1    Kawahara, H.2
  • 49
    • 0033725389
    • Simplified pitch detection algorithm of mixed speech signals
    • May.
    • Kwon, Y. H., Park, D. J., & Ihm, B. C. (2000). Simplified pitch detection algorithm of mixed speech signals. In Proc. ISCAS83 (Vol. 3, pp. 722-725), May.
    • (2000) Proc. ISCAS83 , vol.3 , pp. 722-725
    • Kwon, Y.H.1    Park, D.J.2    Ihm, B.C.3
  • 52
    • 0027307718
    • Optimal multi-pitch estimation using the EM algorithm for co-channel speech separation
    • April.
    • Chazan, D., Stettiner, Y., & Malah, D. (1993). Optimal multi-pitch estimation using the EM algorithm for co-channel speech separation. In Proc. ICASSP-93 (pp. 728-731), April.
    • (1993) Proc. ICASSP-93 , pp. 728-731
    • Chazan, D.1    Stettiner, Y.2    Malah, D.3
  • 53
    • 0022907820
    • A computational model for separating two simultaneous talkers
    • April.
    • Weintraub, M. (1986). A computational model for separating two simultaneous talkers. In Proc. ICASSP-86 (Vol. 11, pp. 81-84), April.
    • (1986) Proc. ICASSP-86 , vol.11 , pp. 81-84
    • Weintraub, M.1
  • 54
    • 38949205432
    • The harmonic magnitude suppression (HMS) technique for intelligibility enhancement in the presence of interfering speech
    • Mar.
    • Hanson, B. A., & Wong, D. Y. (1984). The harmonic magnitude suppression (HMS) technique for intelligibility enhancement in the presence of interfering speech. In Proc. ICASSP-84 (Vol. 9, pp. 65-68), Mar.
    • (1984) Proc. ICASSP-84 , vol.9 , pp. 65-68
    • Hanson, B.A.1    Wong, D.Y.2
  • 55
    • 0031237388
    • Cochannel speaker separation by harmonic enhancement and suppression
    • PII S1063667697063852
    • Morgan, D. P., George, E. B., Lee, L. T., & Kay, S. M. (1997). Cochannel speaker separation by harmonic enhancement and suppression. IEEE Transactions on Speech and Audio Processing, 5(5), 407-424, Sept. (Pubitemid 127746014)
    • (1997) IEEE Transactions on Speech and Audio Processing , vol.5 , Issue.5 , pp. 407-424
    • Morgan, D.P.1    George, E.B.2    Lee, L.T.3    Kay, S.M.4
  • 56
    • 0006433977
    • Extraction of multiple periodic waveforms from noisy data
    • April.
    • Kanjilal, P. P., & Palit, S. (1994). Extraction of multiple periodic waveforms from noisy data. In Proc. ICASSP-94 (Vol. 2, pp. 361-364), April.
    • (1994) Proc. ICASSP-94 , vol.2 , pp. 361-364
    • Kanjilal, P.P.1    Palit, S.2
  • 57
    • 0026953470
    • Lower and upper bounds on the minimum mean-square error in composite source signal estimation
    • Nov.
    • Ephraim, Y., & Merhav, N. (1992). Lower and upper bounds on the minimum mean-square error in composite source signal estimation. IEEE Transactions on Information Theory, 38(6), 1709-1724, Nov.
    • (1992) IEEE Transactions on Information Theory , vol.38 , Issue.6 , pp. 1709-1724
    • Ephraim, Y.1    Merhav, N.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.