SCOPUS Information Retrieval Platform

Volume 49, Issue 6, 2007, Pages 464-476

Monaural speech segregation based on fusion of source-driven with model-driven techniques

Authors (3): Radfar, Mohammad H. a,b; Dansereau, Richard M. a; Sayadiyan, Abolghasem b

b AMIRKABIR UNIVERSITY OF TECHNOLOGY (Iran)

Author keywords

CASA; Envelope extraction; Harmonic modelling; MIXMAX estimator; Monaural speech segregation; Multi pitch tracking; Speech coding; Speech processing; Vector quantization

Indexed keywords

HARMONIC GENERATION; MEAN SQUARE ERROR; SPEECH ANALYSIS; SPEECH SYNTHESIS; VECTOR QUANTIZATION; CODEBOOKS; COMPUTATIONAL AUDITORY SCENE ANALYSIS (CASA); HARMONIC SYNTHESIZERS; SPEECH SEGREGATION; SPEECH CODING

EID: 34250023466 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2007.04.007 Document Type: Article

Times cited (23)

References (49)

1
- 11144316019
- Decoding speech in the presence of other sources
- Barker J.P., Cooke M.P., and Ellis D.P.W. Decoding speech in the presence of other sources. Speech Comm. 45 1 (2005) 5-25
- (2005) Speech Comm. , vol.45 , Issue.1 , pp. 5-25
- Barker, J.P.¹ Cooke, M.P.² Ellis, D.P.W.³

2
- 4544386386
- Beierholm, T., Pedersen, B.D., Winther, O., 2004. Low complexity Bayesian single channel source separation. In: Proc. ICASSP-04, May 2004, Vol. 5, pp. 529-532.

3
- 0029411030
- An information-maximization approach to blind separation and blind deconvolution
- Bell A.J., and Sejnowski T.J. An information-maximization approach to blind separation and blind deconvolution. Neural Comput. 7 (1995) 1129-1159
- (1995) Neural Comput. , vol.7 , pp. 1129-1159
- Bell, A.J.¹ Sejnowski, T.J.²

4
- 0028531926
- Auditory scene analysis
- Brown G.J., and Cooke M. Auditory scene analysis. Computer Speech Lang. 8 4 (1994) 297-336
- (1994) Computer Speech Lang. , vol.8 , Issue.4 , pp. 297-336
- Brown, G.J.¹ Cooke, M.²

5
- 0036754453
- Speech enhancement using a mixture-maximum model
- Burshtein D., and Gannot S. Speech enhancement using a mixture-maximum model. IEEE Trans. Speech Audio Process. 10 6 (2002) 341-351
- (2002) IEEE Trans. Speech Audio Process. , vol.10 , Issue.6 , pp. 341-351
- Burshtein, D.¹ Gannot, S.²

6
- 0027307718
- Chazan, D., Stettiner, Y., Malah, D., 1993. Optimal multi-pitch estimation using the EM algorithm for co-channel speech separation. In: Proc. ICASSP-93, April 1993, pp. 728-731.

7
- 14944373601
- Vector quantization of harmonic magnitudes in speech coding applications: a survey and new technique
- Chu W.C. Vector quantization of harmonic magnitudes in speech coding applications: a survey and new technique. EURASIP J. Appl. Signal Process. 17 (2004) 2601-2613
- (2004) EURASIP J. Appl. Signal Process. , vol.17 , pp. 2601-2613
- Chu, W.C.¹

8
- 33750368310
- An audio-visual corpus for speech perception and automatic speech recognition
- Cooke M.P., Barker J., Cunningham S.P., and Shao X. An audio-visual corpus for speech perception and automatic speech recognition. JASA 120 (2006) 2421-2424
- (2006) JASA , vol.120 , pp. 2421-2424
- Cooke, M.P.¹ Barker, J.² Cunningham, S.P.³ Shao, X.⁴

9
- 0003641574
- Springer-Verlag
- de Boor C. A Practical Guide to Splines (1978), Springer-Verlag
- (1978) A Practical Guide to Splines
- de Boor, C.¹

10
- 0024909863
- On the application of hidden Markov models for enhancing noisy speech
- Ephraim Y., Malah D., and Juang B.-H. On the application of hidden Markov models for enhancing noisy speech. IEEE Trans. Acoust. Speech Signal Process. 37 (1989) 1846-1865
- (1989) IEEE Trans. Acoust. Speech Signal Process. , vol.37 , pp. 1846-1865
- Ephraim, Y.¹ Malah, D.² Juang, B.-H.³

11
- 0031619075
- Eriksson, T., Hong-Goo, K., Stylianou, Y., 1998. Quantization of the spectral envelope for sinusoidal coders. In: Proc. ICASSP-98, May 1998, Vol. 1, pp. 37-40.

12
- 0004110342
- MIT Press, Cambridge, Mass
- Fant G. Speech Sounds and Features (1973), MIT Press, Cambridge, Mass
- (1973) Speech Sounds and Features
- Fant, G.¹

13
- 33845954761
- A Bayesian approach for blind separation of sparse sources
- Fevotte C., and Godsill S.J. A Bayesian approach for blind separation of sparse sources. IEEE Trans. Speech Audio Process. 4 99 (2005) 1-15
- (2005) IEEE Trans. Speech Audio Process. , vol.4 , Issue.99 , pp. 1-15
- Fevotte, C.¹ Godsill, S.J.²

14
- 0003959189
- Kluwer Academic, Norwell MA
- Gersho A., and Gray R.M. Vector Quantization and Signal Compression (1992), Kluwer Academic, Norwell MA
- (1992) Vector Quantization and Signal Compression
- Gersho, A.¹ Gray, R.M.²

15
- 0038132749
- A variational method for learning sparse and overcomplete representations
- Girolami M. A variational method for learning sparse and overcomplete representations. Neural Comput. 13 11 (2001) 2517-2532
- (2001) Neural Comput. , vol.13 , Issue.11 , pp. 2517-2532
- Girolami, M.¹

16
- 34250014858
- Hanson, B.A., Wong, D.Y., 1984. The harmonic magnitude suppression (HMS) technique for intelligibility enhancement in the presence of interfering speech. In: Proc. ICASSP-84, March 1984, Vol. 9, pp. 65-68.

17
- 4644265990
- Monaural speech segregation based on pitch tracking and amplitude modulation
- Hu G., and Wang D.L. Monaural speech segregation based on pitch tracking and amplitude modulation. IEEE Trans. Neural Networks 15 5 (2004) 1135-1150
- (2004) IEEE Trans. Neural Networks , vol.15 , Issue.5 , pp. 1135-1150
- Hu, G.¹ Wang, D.L.²

18
- 84899014722
- Jang, G.J., Lee, T.W., 2003. A probabilistic approach to single channel source separation. In: Proc. Advances in Neural Information Processing Systems, pp. 1173-1180.

19
- 84963932580
- Kameoka, H., Nishimoto, T., Sagayama, S., 2004. Multi-pitch trajectory estimation of concurrent speech based on harmonic GMM and nonlinear Kalman filtering. In: INTERSPEECH-2004, October 2004, Vol. 1, pp. 2433-2436.

20
- 4644257621
- Kristjansson, T., Attias, H., Hershey, J., 2004. Single microphone source separation using high resolution signal reconstruction. In: Proc. ICASSP-04, May 2004, pp. 817-820.

21
- 85117866826
- Kwon, Y.H., Park, D.J., Ihm, B.C., 2000. Simplified pitch detection algorithm of mixed speech signals. In: Proc. ISCAS-2000, May 2000, Vol. 3, pp. 722-725.

22
- 0032624821
- Blind source separation of more sources than mixtures using overcomplete representations
- Lee T.-W., Lewicki M.S., Girolami M., and Sejnowski T.J. Blind source separation of more sources than mixtures using overcomplete representations. IEEE Signal Process. Lett. 6 4 (1999) 87-90
- (1999) IEEE Signal Process. Lett. , vol.6 , Issue.4 , pp. 87-90
- Lee, T.-W.¹ Lewicki, M.S.² Girolami, M.³ Sejnowski, T.J.⁴

23
- 84940406551
- Martin, P., 1982. Comparison of pitch detection by cepstrum and spectral comb analysis. In: Proc. ICASSP-82, May 1982, Vol. 7, pp. 180-183.

24
- 0001935942
- Sinusoidal coding
- Kleijn W., and Paliwal K. (Eds), Elsevier
- McAulay R.J., and Quatieri T.F. Sinusoidal coding. In: Kleijn W., and Paliwal K. (Eds). Speech Coding and Synthesis (1995), Elsevier
- (1995) Speech Coding and Synthesis
- McAulay, R.J.¹ Quatieri, T.F.²

25
- 0003789815
- Academic, San Diego
- Moore B.C.J. An Introduction to the Psychology of Hearing. fourth ed. (1997), Academic, San Diego
- (1997) An Introduction to the Psychology of Hearing. fourth ed.
- Moore, B.C.J.¹

26
- 0031237388
- Cochannel speaker separation by harmonic enhancement and suppression
- Morgan D.P., George E.B., Lee L.T., and Key S.M. Cochannel speaker separation by harmonic enhancement and suppression. IEEE Trans. Acoust. Speech Signal Process. 5 5 (1997) 407-424
- (1997) IEEE Trans. Acoust. Speech Signal Process. , vol.5 , Issue.5 , pp. 407-424
- Morgan, D.P.¹ George, E.B.² Lee, L.T.³ Key, S.M.⁴

27
- 0024753593
- Speech recognition using noise-adaptive prototypes
- Nadas A., Nahamoo D., and Picheny M.A. Speech recognition using noise-adaptive prototypes. IEEE Trans. Acoust. Speech Signal Process. 37 10 (1989) 1495-1503
- (1989) IEEE Trans. Acoust. Speech Signal Process. , vol.37 , Issue.10 , pp. 1495-1503
- Nadas, A.¹ Nahamoo, D.² Picheny, M.A.³

28
- 0023168574
- Naylor, J.A., Boll, S.F., 1987. Techniques for suppression of an interfering talker in co-channel speech. In: Proc. ICASSP-87, April 1987, Vol. 1, pp. 205-208.

29
- 13544259544
- On the usefulness of STFT phase spectrum in human listening tests
- Paliwal K.K., and Alsteris L.D. On the usefulness of STFT phase spectrum in human listening tests. Speech Comm. 45 2 (2005) 153-170
- (2005) Speech Comm. , vol.45 , Issue.2 , pp. 153-170
- Paliwal, K.K.¹ Alsteris, L.D.²

30
- 0017004953
- Separation of speech from interfering speech by means of harmonic selection
- Parsons T.W. Separation of speech from interfering speech by means of harmonic selection. J. Acoust. Soc. Amer. 60 Aug. (1976) 911-918
- (1976) J. Acoust. Soc. Amer. , vol.60 , Issue.Aug , pp. 911-918
- Parsons, T.W.¹

31
- 0019606564
- The spectral envelope vocoder
- Paul D.B. The spectral envelope vocoder. IEEE Trans. Acoust. Speech Signal Process. ASSP-29 4 (1981) 786-794
- (1981) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-29 , Issue.4 , pp. 786-794
- Paul, D.B.¹

32
- 0043069843
- Squared error as a measure of perceived phase distortion
- Pobloth H., and Kleijn W.B. Squared error as a measure of perceived phase distortion. J. Acoust. Soc. Amer. 114 2 (2003) 1081-1094
- (2003) J. Acoust. Soc. Amer. , vol.114 , Issue.2 , pp. 1081-1094
- Pobloth, H.¹ Kleijn, W.B.²

33
- 0025256257
- An approach to co-channel talker interference suppression using a sinusoidal model for speech
- Quatieri T.F., and Danisewicz R.G. An approach to co-channel talker interference suppression using a sinusoidal model for speech. IEEE Trans. Acoust. Speech Signal Process. 38 (1990) 56-69
- (1990) IEEE Trans. Acoust. Speech Signal Process. , vol.38 , pp. 56-69
- Quatieri, T.F.¹ Danisewicz, R.G.²

34
- 0003425258
- Prentice-Hall
- Rabiner L.R., and Schafer R.W. Digital Processing of Speech Signals (1978), Prentice-Hall
- (1978) Digital Processing of Speech Signals
- Rabiner, L.R.¹ Schafer, R.W.²

35
- 34249994486
- A non-linear minimum mean square error estimator for the mixture-maximization approximation
- Radfar M.H., Banihashemi A.H., Dansereau R.M., and Sayadiyan A. A non-linear minimum mean square error estimator for the mixture-maximization approximation. Electron. Lett. 42 12 (2006) 75-76
- (2006) Electron. Lett. , vol.42 , Issue.12 , pp. 75-76
- Radfar, M.H.¹ Banihashemi, A.H.² Dansereau, R.M.³ Sayadiyan, A.⁴

36
- 85009074940
- Reddy, A.M., Raj, B., 2004. A minimum mean squared error estimator for single channel speaker separation. In: INTERSPEECH-2004, October 2004, pp. 2445-2448.

37
- 4544247508
- Reyes-Gomez, M.J., Ellis, D., Jojic, N., 2004. Multiband audio modeling for single channel acoustic source separation. In: Proc. ICASSP-04, May 2004, Vol. 5, pp. 641-644.

38
- 34249997972
- Roweis, S., 2000. One microphone source separation. In: Proc. Neural Information Processing Systems, pp. 793-799.

39
- 85009230793
- Roweis, S.T., 2003. Factorial models and refiltering for speech separation and denoising. In: EUROSPEECH-03, May 2003, Vol. 7, pp. 1009-1012.

40
- 0032166087
- HMM-based strategies for enhancement of speech signals embedded in nonstationary noise
- Sameti H., Sheikzadeh H., Li D., and Brennan R.L. HMM-based strategies for enhancement of speech signals embedded in nonstationary noise. IEEE Trans. Acoust. Speech Signal Process. 6 (1998) 445-455
- (1998) IEEE Trans. Acoust. Speech Signal Process. , vol.6 , pp. 445-455
- Sameti, H.¹ Sheikzadeh, H.² Li, D.³ Brennan, R.L.⁴

41
- 0020497760
- Secrest, B., Doddington, G., 1983. An integrated pitch tracking algorithm for speech systems. In: Proc. ICASSP-83, April 1983, Vol. 8, pp. 1352-1355.

42
- 0003411868
- McGraw-Hill
- Spiegel M.R. Schaum's Mathematical Handbook of Formulas and Tables. second ed. (1998), McGraw-Hill
- (1998) Schaum's Mathematical Handbook of Formulas and Tables. second ed.
- Spiegel, M.R.¹

43
- 0001455934
- Robust pitch tracking
- Kleijn W., and Paliwal K. (Eds), Elsevier
- Talkin D. Robust pitch tracking. In: Kleijn W., and Paliwal K. (Eds). Speech Coding and Synthesis (1995), Elsevier
- (1995) Speech Coding and Synthesis
- Talkin, D.¹

44
- 0034319894
- A computationally efficient multipitch analysis model
- Tolonen D., and Karjalainen M. A computationally efficient multipitch analysis model. IEEE Trans. Acoust. Speech Signal Process. 8 (2000) 708-716
- (2000) IEEE Trans. Acoust. Speech Signal Process. , vol.8 , pp. 708-716
- Tolonen, D.¹ Karjalainen, M.²

45
- 0035280043
- A comparison of auditory and blind separation techniques for speech segregation
- van der Kouwe A.J.W., Wang D.L., and Brown G.J. A comparison of auditory and blind separation techniques for speech segregation. IEEE Trans. Speech Audio Process. 9 3 (2001) 189-195
- (2001) IEEE Trans. Speech Audio Process. , vol.9 , Issue.3 , pp. 189-195
- van der Kouwe, A.J.W.¹ Wang, D.L.² Brown, G.J.³

46
- 0033707902
- Virtanen, T., Klapuri, A., 2000. Separation of harmonic sound sources using sinusoidal modeling. In: Proc. ICASSP-2000, June 2000, pp. 765-768.

47
- 84892233308
- On ideal binary mask as the computational goal of auditory scene analysis
- Divenyi P. (Ed), Kluwer Academic, Norwell MA
- Wang D.L. On ideal binary mask as the computational goal of auditory scene analysis. In: Divenyi P. (Ed). Speech Separation by Humans and Machines (2005), Kluwer Academic, Norwell MA 181-197
- (2005) Speech Separation by Humans and Machines , pp. 181-197
- Wang, D.L.¹

48
- 0032682770
- Separation of speech from interfering sounds based on oscillatory correlation
- Wang D.L., and Brown G.J. Separation of speech from interfering sounds based on oscillatory correlation. IEEE Trans. Neural Networks 10 May (1999) 684-697
- (1999) IEEE Trans. Neural Networks , vol.10 , Issue.May , pp. 684-697
- Wang, D.L.¹ Brown, G.J.²

49
- 0037767686
- A multipitch tracking algorithm for noisy speech
- Wu M., Wang D.L., and Brown G.J. A multipitch tracking algorithm for noisy speech. IEEE Trans. Acoust. Speech Signal Process. 11 3 (2003) 229-241
- (2003) IEEE Trans. Acoust. Speech Signal Process. , vol.11 , Issue.3 , pp. 229-241
- Wu, M.¹ Wang, D.L.² Brown, G.J.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.