



Volume 49, Issue 6, 2007, Pages 464-476

Monaural speech segregation based on fusion of source-driven with model-driven techniques

Author keywords

CASA; Envelope extraction; Harmonic modelling; MIXMAX estimator; Monaural speech segregation; Multi pitch tracking; Speech coding; Speech processing; Vector quantization

Indexed keywords

HARMONIC GENERATION; MEAN SQUARE ERROR; SPEECH ANALYSIS; SPEECH SYNTHESIS; VECTOR QUANTIZATION;

EID: 34250023466     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2007.04.007     Document Type: Article
Times cited : (23)

References (49)
  • 1
    • 11144316019
    • Decoding speech in the presence of other sources
    • Barker J.P., Cooke M.P., and Ellis D.P.W. Decoding speech in the presence of other sources. Speech Comm. 45 1 (2005) 5-25
    • (2005) Speech Comm. , vol.45 , Issue.1 , pp. 5-25
    • Barker, J.P.1    Cooke, M.P.2    Ellis, D.P.W.3
  • 2
    • 4544386386
    • Beierholm, T., Pedersen, B.D., Winther, O., 2004. Low complexity Bayesian single channel source separation. In: Proc. ICASSP-04, May 2004, Vol. 5, pp. 529-532.
  • 3
    • 0029411030
    • An information-maximization approach to blind separation and blind deconvolution
    • Bell A.J., and Sejnowski T.J. An information-maximization approach to blind separation and blind deconvolution. Neural Comput. 7 (1995) 1129-1159
    • (1995) Neural Comput. , vol.7 , pp. 1129-1159
    • Bell, A.J.1    Sejnowski, T.J.2
  • 5
    • 0036754453
    • Speech enhancement using a mixture-maximum model
    • Burshtein D., and Gannot S. Speech enhancement using a mixture-maximum model. IEEE Trans. Speech Audio Process. 10 6 (2002) 341-351
    • (2002) IEEE Trans. Speech Audio Process. , vol.10 , Issue.6 , pp. 341-351
    • Burshtein, D.1    Gannot, S.2
  • 6
    • 0027307718
    • Chazan, D., Stettiner, Y., Malah, D., 1993. Optimal multi-pitch estimation using the EM algorithm for co-channel speech separation. In: Proc. ICASSP-93, April 1993, pp. 728-731.
  • 7
    • 14944373601
    • Vector quantization of harmonic magnitudes in speech coding applications: a survey and new technique
    • Chu W.C. Vector quantization of harmonic magnitudes in speech coding applications: a survey and new technique. EURASIP J. Appl. Signal Process. 17 (2004) 2601-2613
    • (2004) EURASIP J. Appl. Signal Process. , vol.17 , pp. 2601-2613
    • Chu, W.C.1
  • 8
    • 33750368310
    • An audio-visual corpus for speech perception and automatic speech recognition
    • Cooke M.P., Barker J., Cunningham S.P., and Shao X. An audio-visual corpus for speech perception and automatic speech recognition. JASA 120 (2006) 2421-2424
    • (2006) JASA , vol.120 , pp. 2421-2424
    • Cooke, M.P.1    Barker, J.2    Cunningham, S.P.3    Shao, X.4
  • 11
    • 0031619075
    • Eriksson, T., Hong-Goo, K., Stylianou, Y., 1998. Quantization of the spectral envelope for sinusoidal coders. In: Proc. ICASSP-98, May 1998, Vol. 1, pp. 37-40.
  • 13
    • 33845954761
    • A Bayesian approach for blind separation of sparse sources
    • Fevotte C., and Godsill S.J. A Bayesian approach for blind separation of sparse sources. IEEE Trans. Speech Audio Process. 4 99 (2005) 1-15
    • (2005) IEEE Trans. Speech Audio Process. , vol.4 , Issue.99 , pp. 1-15
    • Fevotte, C.1    Godsill, S.J.2
  • 15
    • 0038132749
    • A variational method for learning sparse and overcomplete representations
    • Girolami M. A variational method for learning sparse and overcomplete representations. Neural Comput. 13 11 (2001) 2517-2532
    • (2001) Neural Comput. , vol.13 , Issue.11 , pp. 2517-2532
    • Girolami, M.1
  • 16
    • 34250014858
    • Hanson, B.A., Wong, D.Y., 1984. The harmonic magnitude suppression (HMS) technique for intelligibility enhancement in the presence of interfering speech. In: Proc. ICASSP-84, March 1984, Vol. 9, pp. 65-68.
  • 17
    • 4644265990
    • Monaural speech segregation based on pitch tracking and amplitude modulation
    • Hu G., and Wang D.L. Monaural speech segregation based on pitch tracking and amplitude modulation. IEEE Trans. Neural Networks 15 5 (2004) 1135-1150
    • (2004) IEEE Trans. Neural Networks , vol.15 , Issue.5 , pp. 1135-1150
    • Hu, G.1    Wang, D.L.2
  • 18
    • 84899014722
    • Jang, G.J., Lee, T.W., 2003. A probabilistic approach to single channel source separation. In: Proc. Advances in Neural Information Processing Systems, pp. 1173-1180.
  • 19
    • 84963932580
    • Kameoka, H., Nishimoto, T., Sagayama, S., 2004. Multi-pitch trajectory estimation of concurrent speech based on harmonic GMM and nonlinear Kalman filtering. In: INTERSPEECH-2004, October 2004, Vol. 1, pp. 2433-2436.
  • 20
    • 4644257621
    • Kristjansson, T., Attias, H., Hershey, J., 2004. Single microphone source separation using high resolution signal reconstruction. In: Proc. ICASSP-04, May 2004, pp. 817-820.
  • 21
    • 85117866826
    • Kwon, Y.H., Park, D.J., Ihm, B.C., 2000. Simplified pitch detection algorithm of mixed speech signals. In: Proc. ISCAS-2000, May 2000, Vol. 3, pp. 722-725.
  • 22
    • 0032624821
    • Blind source separation of more sources than mixtures using overcomplete representations
    • Lee T.-W., Lewicki M.S., Girolami M., and Sejnowski T.J. Blind source separation of more sources than mixtures using overcomplete representations. IEEE Signal Process. Lett. 6 4 (1999) 87-90
    • (1999) IEEE Signal Process. Lett. , vol.6 , Issue.4 , pp. 87-90
    • Lee, T.-W.1    Lewicki, M.S.2    Girolami, M.3    Sejnowski, T.J.4
  • 23
    • 84940406551
    • Martin, P., 1982. Comparison of pitch detection by cepstrum and spectral comb analysis. In: Proc. ICASSP-82, May 1982, Vol. 7, pp. 180-183.
  • 28
    • 0023168574
    • Naylor, J.A., Boll, S.F., 1987. Techniques for suppression of an interfering talker in co-channel speech. In: Proc. ICASSP-87, April 1987, Vol. 1, pp. 205-208.
  • 29
    • 13544259544
    • On the usefulness of STFT phase spectrum in human listening tests
    • Paliwal K.K., and Alsteris L.D. On the usefulness of STFT phase spectrum in human listening tests. Speech Comm. 45 2 (2005) 153-170
    • (2005) Speech Comm. , vol.45 , Issue.2 , pp. 153-170
    • Paliwal, K.K.1    Alsteris, L.D.2
  • 30
    • 0017004953
    • Separation of speech from interfering speech by means of harmonic selection
    • Parsons T.W. Separation of speech from interfering speech by means of harmonic selection. J. Acoust. Soc. Amer. 60 Aug. (1976) 911-918
    • (1976) J. Acoust. Soc. Amer. , vol.60 , Issue.Aug , pp. 911-918
    • Parsons, T.W.1
  • 32
    • 0043069843
    • Squared error as a measure of perceived phase distortion
    • Pobloth H., and Kleijn W.B. Squared error as a measure of perceived phase distortion. J. Acoust. Soc. Amer. 114 2 (2003) 1081-1094
    • (2003) J. Acoust. Soc. Amer. , vol.114 , Issue.2 , pp. 1081-1094
    • Pobloth, H.1    Kleijn, W.B.2
  • 33
    • 0025256257
    • An approach to co-channel talker interference suppression using a sinusoidal model for speech
    • Quatieri T.F., and Danisewicz R.G. An approach to co-channel talker interference suppression using a sinusoidal model for speech. IEEE Trans. Acoust. Speech Signal Process. 38 (1990) 56-69
    • (1990) IEEE Trans. Acoust. Speech Signal Process. , vol.38 , pp. 56-69
    • Quatieri, T.F.1    Danisewicz, R.G.2
  • 35
    • 34249994486
    • A non-linear minimum mean square error estimator for the mixture-maximization approximation
    • Radfar M.H., Banihashemi A.H., Dansereau R.M., and Sayadiyan A. A non-linear minimum mean square error estimator for the mixture-maximization approximation. Electron. Lett. 42 12 (2006) 75-76
    • (2006) Electron. Lett. , vol.42 , Issue.12 , pp. 75-76
    • Radfar, M.H.1    Banihashemi, A.H.2    Dansereau, R.M.3    Sayadiyan, A.4
  • 36
    • 85009074940
    • Reddy, A.M., Raj, B., 2004. A minimum mean squared error estimator for single channel speaker separation. In: INTERSPEECH-2004, October 2004, pp. 2445-2448.
  • 37
    • 4544247508
    • Reyes-Gomez, M.J., Ellis, D., Jojic, N., 2004. Multiband audio modeling for single channel acoustic source separation. In: Proc. ICASSP-04, May 2004, Vol. 5, pp. 641-644.
  • 38
    • 34249997972
    • Roweis, S., 2000. One microphone source separation. In: Proc. Neural Information Processing Systems, pp. 793-799.
  • 39
    • 85009230793
    • Roweis, S.T., 2003. Factorial models and refiltering for speech separation and denoising. In: EUROSPEECH-03, May 2003, Vol. 7, pp. 1009-1012.
  • 41
    • 0020497760
    • Secrest, B., Doddington, G., 1983. An integrated pitch tracking algorithm for speech systems. In: Proc. ICASSP-83, April 1983, Vol. 8, pp. 1352-1355.
  • 43
    • 0001455934
    • Robust pitch tracking
    • Kleijn W., and Paliwal K. (Eds), Elsevier
    • Talkin D. Robust pitch tracking. In: Kleijn W., and Paliwal K. (Eds). Speech Coding and Synthesis (1995), Elsevier
    • (1995) Speech Coding and Synthesis
    • Talkin, D.1
  • 45
    • 0035280043
    • A comparison of auditory and blind separation techniques for speech segregation
    • van der Kouwe A.J.W., Wang D.L., and Brown G.J. A comparison of auditory and blind separation techniques for speech segregation. IEEE Trans. Speech Audio Process. 9 3 (2001) 189-195
    • (2001) IEEE Trans. Speech Audio Process. , vol.9 , Issue.3 , pp. 189-195
    • van der Kouwe, A.J.W.1    Wang, D.L.2    Brown, G.J.3
  • 46
    • 0033707902
    • Virtanen, T., Klapuri, A., 2000. Separation of harmonic sound sources using sinusoidal modeling. In: Proc. ICASSP-2000, June 2000, pp. 765-768.
  • 47
    • 84892233308
    • On ideal binary mask as the computational goal of auditory scene analysis
    • Divenyi P. (Ed), Kluwer Academic, Norwell MA
    • Wang D.L. On ideal binary mask as the computational goal of auditory scene analysis. In: Divenyi P. (Ed). Speech Separation by Humans and Machines (2005), Kluwer Academic, Norwell MA 181-197
    • (2005) Speech Separation by Humans and Machines , pp. 181-197
    • Wang, D.L.1
  • 48
    • 0032682770
    • Separation of speech from interfering sounds based on oscillatory correlation
    • Wang D.L., and Brown G.J. Separation of speech from interfering sounds based on oscillatory correlation. IEEE Trans. Neural Networks 10 May (1999) 684-697
    • (1999) IEEE Trans. Neural Networks , vol.10 , Issue.May , pp. 684-697
    • Wang, D.L.1    Brown, G.J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.