Volume 21, Issue 9, 2013, Pages 1913-1928

A multichannel MMSE-based framework for speech source separation and noise reduction

Author keywords

Blind source separation; microphone arrays; minimum variance distortionless response; minimum mean square error; noise reduction; Wiener filter

Indexed keywords

EXPECTATION - MAXIMIZATIONS; GAUSSIAN MIXTURE MODEL; MICROPHONE ARRAYS; MINIMUM VARIANCE DISTORTIONLESS RESPONSE; MULTI-CHANNEL RECORDING; POSTERIOR PROBABILITY; SECOND ORDER STATISTICS; WIENER FILTERS;

EID: 84880515131     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2013.2263137     Document Type: Conference Paper
Times cited : (111)

References (50)
  • 1
    • 84867598092
    • A multichannel MMSE-based framework for joint blind source separation and noise reduction
    • M. Souden, S. Araki, K. Kinoshita, T. Nakatani, and H. Sawada, "A multichannel MMSE-based framework for joint blind source separation and noise reduction," in Proc. IEEE ICASSP, 2012, pp. 109-112.
    • (2012) Proc. IEEE ICASSP , pp. 109-112
    • Souden, M.1    Araki, S.2    Kinoshita, K.3    Nakatani, T.4    Sawada, H.5
  • 2
    • 0021645331
    • Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator
    • Dec.
    • Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-32, no. 6, pp. 1109-1121, Dec. 1984.
    • (1984) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-32 , Issue.6 , pp. 1109-1121
    • Ephraim, Y.1    Malah, D.2
  • 3
    • 0029726517
    • Speech enhancement based on a priori signal to noise estimation
    • P. Scalart and V. Filho, "Speech enhancement based on a priori signal to noise estimation," in Proc. IEEE ICASSP, 1996, pp. 629-632.
    • (1996) Proc. IEEE ICASSP , pp. 629-632
    • Scalart, P.1    Filho, V.2
  • 6
    • 0035424281
    • Signal enhancement using beamforming and nonstationarity with applications to speech
    • DOI 10.1109/78.934132, PII S1053587X01058743
    • S. Gannot, D. Burshtein, and E. Weinstein, "Signal enhancement using beamforming and nonstationarity with applications to speech," IEEE Trans. Signal Process., vol. 49, no. 8, pp. 1614-1626, Aug. 2001. (Pubitemid 32732604)
    • (2001) IEEE Transactions on Signal Processing , vol.49 , Issue.8 , pp. 1614-1626
    • Gannot, S.1    Burshtein, D.2    Weinstein, E.3
  • 7
    • 34447286933
    • Frequency-domain criterion for the speech distortion weighted multichannel Wiener filter for robust noise reduction
    • DOI 10.1016/j.specom.2007.02.001, PII S0167639307000313
    • S. Doclo, A. Spriet, J. Wouters, and M. Moonen, "Frequency-domain criterion for the speech distortion weighted multichannel Wiener filter for robust noise reduction," Speech Commun., vol. 49, pp. 636-656, 2007. (Pubitemid 47039261)
    • (2007) Speech Communication , vol.49 , Issue.7-8 , pp. 636-656
    • Doclo, S.1    Spriet, A.2    Wouters, J.3    Moonen, M.4
  • 9
    • 72949120153
    • On optimal frequency-domain multichannel linear filtering for noise reduction
    • Feb.
    • M. Souden, J. Benesty, and S. Affes, "On optimal frequency-domain multichannel linear filtering for noise reduction," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 2, pp. 260-276, Feb. 2010.
    • (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.2 , pp. 260-276
    • Souden, M.1    Benesty, J.2    Affes, S.3
  • 10
    • 79956282391
    • Performance analysis of multichannel Wiener filter based noise reduction in hearing aids under second order statistics estimation errors
    • Jul.
    • B. Cornelis, M. Moonen, and J. Wouters, "Performance analysis of multichannel Wiener filter based noise reduction in hearing aids under second order statistics estimation errors," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 5, pp. 1368-1381, Jul. 2011.
    • (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , Issue.5 , pp. 1368-1381
    • Cornelis, B.1    Moonen, M.2    Wouters, J.3
  • 11
    • 85008564364
    • An integrated solution for online multichannel noise tracking and reduction
    • Sep.
    • M. Souden, J. Chen, J. Benesty, and S. Affes, "An integrated solution for online multichannel noise tracking and reduction," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 7, pp. 2159-2169, Sep. 2011.
    • (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , Issue.7 , pp. 2159-2169
    • Souden, M.1    Chen, J.2    Benesty, J.3    Affes, S.4
  • 12
    • 84880520635
    • Variable speech distortion weighted multichannel Wiener filter based on soft output voice activity detection for noise reduction in hearing aids
    • K. Ngo, A. Spriet, M. Moonen, J. Wouters, and S. H. Jensen, "Variable speech distortion weighted multichannel Wiener filter based on soft output voice activity detection for noise reduction in hearing aids," in Proc. IWAENC, 2008.
    • (2008) Proc. IWAENC
    • Ngo, K.1    Spriet, A.2    Moonen, M.3    Wouters, J.4    Jensen, S.H.5
  • 13
    • 84865754161
    • Reduction of highly nonstationary ambient noise by integrating spectral and locational characteristics of speech and noise for robust ASR
    • T. Nakatani, S. Araki, M. Delcroix, T. Yoshioka, and M. Fujimoto, "Reduction of highly nonstationary ambient noise by integrating spectral and locational characteristics of speech and noise for robust ASR," in Proc. ISCA Interspeech, 2011, pp. 1785-1788.
    • (2011) Proc. ISCA Interspeech , pp. 1785-1788
    • Nakatani, T.1    Araki, S.2    Delcroix, M.3    Yoshioka, T.4    Fujimoto, M.5
  • 14
    • 0035396555
    • Noise power spectral density estimation based on optimal smoothing and minimum statistics
    • DOI 10.1109/89.928915, PII S106366760104980X
    • R. Martin, "Noise power spectral density estimation based on optimal smoothing and minimum statistics," IEEE Trans. Speech, Audio Process., vol. 9, no. 5, pp. 504-512, Jul. 2001. (Pubitemid 32631178)
    • (2001) IEEE Transactions on Speech and Audio Processing , vol.9 , Issue.5 , pp. 504-512
    • Martin, R.1
  • 15
    • 0036226165
    • Noise estimation by minima controlled recursive averaging for robust speech enhancement
    • DOI 10.1109/97.988717, PII S1070990802024100
    • I. Cohen and B. Berdugo, "Noise estimation by minima controlled recursive averaging for robust speech enhancement," IEEE Signal Process. Lett., vol. 9, no. 1, pp. 12-15, Jan. 2002. (Pubitemid 34306628)
    • (2002) IEEE Signal Processing Letters , vol.9 , Issue.1 , pp. 12-15
    • Cohen, I.1    Berdugo, B.2
  • 16
    • 0031234613
    • A signal subspace tracking algorithm for microphone array processing of speech
    • PII S1063667697063876
    • S. Affes and Y. Grenier, "A signal subspace tracking algorithm for microphone array processing of speech," IEEE Trans. Speech Audio Process., vol. 5, no. 5, pp. 425-437, Sep. 1997. (Pubitemid 127746015)
    • (1997) IEEE Transactions on Speech and Audio Processing , vol.5 , Issue.5 , pp. 425-437
    • Affes, S.1    Grenier, Y.2
  • 17
    • 0035424281
    • Speech enhancement using a mixture-maximum model
    • Aug.
    • D. Burshtein and S. Gannot, "Speech enhancement using a mixture-maximum model," IEEE Trans. Signal Process., vol. 49, no. 8, pp. 1614-1626, Aug. 2001.
    • (2001) IEEE Trans. Signal Process. , vol.49 , Issue.8 , pp. 1614-1626
    • Burshtein, D.1    Gannot, S.2
  • 19
    • 0029411030
    • An information maximization approach to blind separation and blind deconvolution
    • A. J. Bell and T. J. Sejnowski, "An information maximization approach to blind separation and blind deconvolution," Neural Computat., vol. 7, pp. 1129-1159, 1995.
    • (1995) Neural Computat. , vol.7 , pp. 1129-1159
    • Bell, A.J.1    Sejnowski, T.J.2
  • 20
    • 0032629347
    • Fast and robust fixed-point algorithms for independent component analysis
    • May
    • A. Hyvarinen, "Fast and robust fixed-point algorithms for independent component analysis," IEEE Trans. Neural Netw., vol. 10, no. 3, pp. 626-634, May 1999.
    • (1999) IEEE Trans. Neural Netw. , vol.10 , Issue.3 , pp. 626-634
    • Hyvarinen, A.1
  • 21
    • 25144522151
    • Frequency-domain blind source separation
    • J. Benesty, S. Makino, and J. Chen, Eds. New York, NY, USA: Springer
    • H. Sawada, R. Mukai, S. Araki, and S. Makino, "Frequency-domain blind source separation," in Speech Enhancement, J. Benesty, S. Makino, and J. Chen, Eds. New York, NY, USA: Springer, 2005.
    • (2005) Speech Enhancement
    • Sawada, H.1    Mukai, R.2    Araki, S.3    Makino, S.4
  • 22
    • 68149150804
    • Multichannel eigenspace beamforming in a reverberant noisy environment with multiple interfering speech signals
    • Aug.
    • S. Markovich, S. Gannot, and I. Cohen, "Multichannel eigenspace beamforming in a reverberant noisy environment with multiple interfering speech signals," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 6, pp. 1071-1086, Aug. 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.6 , pp. 1071-1086
    • Markovich, S.1    Gannot, S.2    Cohen, I.3
  • 23
    • 0034980656
    • Sound source segregation based on estimating incident angle of each frequency component of input signals acquired by multiple microphones
    • DOI 10.1250/ast.22.149
    • M. Aoki, M. Okamoto, S. Aoki, H. Matsui, T. Sakurai, and Y. Kaneda, "Sound source segregation based on estimating the incident angle of each frequency component of input signals acquired by multiple microphones," Acoust. Sci. Technol., vol. 22, pp. 149-157, Feb. 2001. (Pubitemid 32514398)
    • (2001) Acoustical Science and Technology , vol.22 , Issue.2 , pp. 149-157
    • Aoki, M.1    Okamoto, M.2    Aoki, S.3    Matsui, H.4    Sakurai, T.5    Kaneda, Y.6
  • 24
    • 3142694930
    • Blind separation of speech mixtures via time-frequency masking
    • Jul.
    • O. Yilmaz and S. Rickard, "Blind separation of speech mixtures via time-frequency masking," IEEE Trans. Signal Process., vol. 52, no. 7, pp. 1830-1847, Jul. 2004.
    • (2004) IEEE Trans. Signal Process. , vol.52 , Issue.7 , pp. 1830-1847
    • Yilmaz, O.1    Rickard, S.2
  • 25
    • 79960528686
    • A versatile framework for speaker separation using a model-based speaker localization approach
    • Sep.
    • N. Madhu and R. Martin, "A versatile framework for speaker separation using a model-based speaker localization approach," IEEE Trans. Audio, Speech Lang. Process., vol. 19, no. 7, pp. 1900-1912, Sep. 2011.
    • (2011) IEEE Trans. Audio, Speech Lang. Process. , vol.19 , Issue.7 , pp. 1900-1912
    • Madhu, N.1    Martin, R.2
  • 26
    • 67149125347
    • Stereo source separation and source counting with MAP estimation with Dirichlet prior considering spatial aliasing problem
    • S. Araki, T. Nakatani, H. Sawada, and S. Makino, "Stereo source separation and source counting with MAP estimation with Dirichlet prior considering spatial aliasing problem," in Proc. ICA, 2009, pp. 742-750.
    • (2009) Proc. ICA , pp. 742-750
    • Araki, S.1    Nakatani, T.2    Sawada, H.3    Makino, S.4
  • 27
    • 50249118229
    • A two-stage frequency domain blind source separation method for underdetermined convolutive mixtures
    • H. Sawada, S. Araki, and S. Makino, "A two-stage frequency domain blind source separation method for underdetermined convolutive mixtures," in Proc. IEEE WASPAA, 2007, pp. 139-142.
    • (2007) Proc. IEEE WASPAA , pp. 139-142
    • Sawada, H.1    Araki, S.2    Makino, S.3
  • 28
    • 78650016939
    • Underdetermined convolutive blind source separation via frequency bin-wise clustering and permutation alignment
    • Mar.
    • H. Sawada, S. Araki, and S. Makino, "Underdetermined convolutive blind source separation via frequency bin-wise clustering and permutation alignment," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 3, pp. 516-527, Mar. 2011.
    • (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , Issue.3 , pp. 516-527
    • Sawada, H.1    Araki, S.2    Makino, S.3
  • 30
    • 54849417377
    • The LOST algorithm: Finding lines and separating speech mixtures
    • P. D. O'Grady and B. A. Pearlmutter, "The LOST algorithm: Finding lines and separating speech mixtures," EURASIP J. Adv. Signal Process., pp. 1-17, 2008.
    • (2008) EURASIP J. Adv. Signal Process. , pp. 1-17
    • O'Grady, P.D.1    Pearlmutter, B.A.2
  • 31
    • 78049358902
    • Blind separation employing directional statistics in an expectation maximization framework
    • D. H. Tran and R. Haeb-Umbach, "Blind separation employing directional statistics in an expectation maximization framework," in Proc. IEEE ICASSP, 2010, pp. 241-244.
    • (2010) Proc. IEEE ICASSP , pp. 241-244
    • Tran, D.H.1    Haeb-Umbach, R.2
  • 35
    • 0000905617
    • Adjustment of an inverse matrix corresponding to a change in one element of a given matrix
    • J. Sherman and W. J. Morrison, "Adjustment of an inverse matrix corresponding to a change in one element of a given matrix," Ann. Math. Statist., vol. 21, pp. 124-127, 1950.
    • (1950) Ann. Math. Statist. , vol.21 , pp. 124-127
    • Sherman, J.1    Morrison, W.J.2
  • 36
    • 0033478710
    • The complex Watson distribution and shape analysis
    • K. V. Mardia and I. L. Dryden, "The complex Watson distribution and shape analysis," J. R. Statist. Soc.: Ser. B, vol. 61, pp. 913-926, 1999.
    • (1999) J. R. Statist. Soc.: Ser. B , vol.61 , pp. 913-926
    • Mardia, K.V.1    Dryden, I.L.2
  • 37
    • 0001034467
    • The complex Bingham distribution and shape analysis
    • J. T. Kent, "The complex Bingham distribution and shape analysis," J. R. Statist. Soc., vol. B, pp. 285-299, 1994.
    • (1994) J. R. Statist. Soc. , vol.B , pp. 285-299
    • Kent, J.T.1
  • 38
    • 85032751986
    • Single channel multi-talker speech recognition: Graphical modeling approaches
    • Nov.
    • S. Rennie, J. Hershey, and P. Olsen, "Single channel multi-talker speech recognition: Graphical modeling approaches," IEEE Signal Process. Mag., vol. 27, no. 5, pp. 66-80, Nov. 2010.
    • (2010) IEEE Signal Process. Mag. , vol.27 , Issue.5 , pp. 66-80
    • Rennie, S.1    Hershey, J.2    Olsen, P.3
  • 39
    • 0032048385
    • Speech recognition in noisy environments using first-order vector Taylor series
    • PII S0167639397000617
    • D. Y. Kim, C. K. Un, and N. S. Kim, "Speech recognition in noisy environments using first-order vector Taylor series," Speech Commun., pp. 39-49, 1998. (Pubitemid 128435865)
    • (1998) Speech Communication , vol.24 , Issue.1 , pp. 39-49
    • Kim, D.Y.1    Un, C.K.2    Kim, N.S.3
  • 41
    • 84880559588
    • [Online]. Available: http://tosa.mri.co.jp/sounddb/indexe.htm
  • 44
    • 85008053938
    • Convolutive transfer function generalized sidelobe canceler
    • Sep.
    • R. Talmon, I. Cohen, and S. Gannot, "Convolutive transfer function generalized sidelobe canceler," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 7, pp. 1420-1434, Sep. 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.7 , pp. 1420-1434
    • Talmon, R.1    Cohen, I.2    Gannot, S.3
  • 45
    • 33745207361
    • A Japanese national project on spontaneous speech corpus and processing technology
    • S. Furui, K. Maekawa, and H. Isahara, "A Japanese national project on spontaneous speech corpus and processing technology," in Proc. ISCA ASR, 2000, pp. 244-248.
    • (2000) Proc. ISCA ASR , pp. 244-248
    • Furui, S.1    Maekawa, K.2    Isahara, H.3
  • 46
    • 78049409757
    • Discriminative training based on an integrated view of MPE and MMI in margin and error space
    • E. McDermott, S. Watanabe, and A. Nakamura, "Discriminative training based on an integrated view of MPE and MMI in margin and error space," in Proc. IEEE ICASSP, 2010, pp. 4894-4897.
    • (2010) Proc. IEEE ICASSP , pp. 4894-4897
    • McDermott, E.1    Watanabe, S.2    Nakamura, A.3
  • 47
    • 0028996876
    • Improved backing-off for m-gram language modeling
    • R. Kneser and H. Ney, "Improved backing-off for m-gram language modeling," in Proc. IEEE ICASSP, 1995, pp. 181-184.
    • (1995) Proc. IEEE ICASSP , pp. 181-184
    • Kneser, R.1    Ney, H.2
  • 48
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol. 9, pp. 171-185, 1995.
    • (1995) Comput. Speech Lang. , vol.9 , pp. 171-185
    • Leggetter, C.J.1    Woodland, P.C.2
  • 49
    • 33645758265
    • NTT speech recognizer with outlook on the next generation: SOLON
    • T. Hori, "NTT speech recognizer with outlook on the next generation: SOLON," in Proc. NTT Workshop Commun. Scene Anal., 2004, p. S-6.
    • (2004) Proc. NTT Workshop Commun. Scene Anal.
    • Hori, T.1
  • 50
    • 84880563853
    • [Online]. Available: http://www.kecl.ntt.co.jp/icl/signal/souden/


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.