SCOPUS Information Retrieval Platform

IEEE Transactions on Neural Networks

Volume 15, Issue 5, 2004, Pages 1135-1150

Monaural speech segregation based on pitch tracking and amplitude modulation

(2) Hu, Guoning a; Wang, DeLiang a

a OHIO STATE UNIVERSITY (United States)

Author keywords

[No Author keywords available]

Indexed keywords

AMPLITUDE MODULATION; AUDITION; CORRELATION METHODS; HARMONIC ANALYSIS; MATHEMATICAL MODELS; SOUND RECORDING; SPEECH SYNTHESIS;

COMPUTATIONAL AUDITORY SCENE ANALYSIS; MONAURAL SPEECH SEGREGATION; PITCH TRACKING;

SPEECH ANALYSIS;

EID: 4644265990 PISSN: 10459227 EISSN: None Source Type: Journal
DOI: 10.1109/TNN.2004.832812 Document Type: Article

Times cited : (359)

References (35)

1
- 0036649241
- Estimation of speech embedded in a reverberant and noisy environment by independent component analysis and wavelets
- July
- A. K. Barros, T. Rutkowski, F. Itakura, and N. Ohnishi, "Estimation of speech embedded in a reverberant and noisy environment by independent component analysis and wavelets," IEEE Trans. Neural Networks, vol. 13, pp. 888-893, July 2002.
- (2002) IEEE Trans. Neural Networks , vol.13 , pp. 888-893
- Barros, A.K.¹ Rutkowski, T.² Itakura, F.³ Ohnishi, N.⁴

2
- 0002888637
- Effects of a difference in fundamental frequency in separating two sentences
- A. R. Palmer, A. Rees, A. Q. Summerfield, and R. Meddis, Eds. London, U.K.: Whurr
- J. Bird and C. J. Darwin, "Effects of a difference in fundamental frequency in separating two sentences," in Psychophysical and Physiological Advances in Hearing, A. R. Palmer, A. Rees, A. Q. Summerfield, and R. Meddis, Eds. London, U.K.: Whurr, 1997.
- (1997) Psychophysical and Physiological Advances in Hearing
- Bird, J.¹ Darwin, C.J.²

3
- 0017804799
- On cochlear encoding: Potentialities and limitations of the reverse-correlation techniques
- E. de Boer and H. R. de Jongh, "On cochlear encoding: potentialities and limitations of the reverse-correlation techniques," J. Acoust. Soc. Amer., vol. 63, pp. 115-135, 1978.
- (1978) J. Acoust. Soc. Amer. , vol.63 , pp. 115-135
- de Boer, E.¹ de Jongh, H.R.²

4
- 0003684441
- Cambridge, MA: MIT Press
- A. S. Bregman, Auditory Scene Analysis. Cambridge, MA: MIT Press, 1990.
- (1990) Auditory Scene Analysis
- Bregman, A.S.¹

5
- 0028531926
- Computational auditory scene analysis
- G. J. Brown and M. P. Cooke, "Computational auditory scene analysis," Comput. Speech Language, vol. 8, pp. 297-336, 1994.
- (1994) Comput. Speech Language , vol.8 , pp. 297-336
- Brown, G.J.¹ Cooke, M.P.²

6
- 0032702589
- Temporal coding of periodicity pitch in the auditory system: An overview
- P. Cariani, "Temporal coding of periodicity pitch in the auditory system: an overview," Neural Plasticity, vol. 6, pp. 147-172, 1999.
- (1999) Neural Plasticity , vol.6 , pp. 147-172
- Cariani, P.¹

7
- 0028264314
- Comparing the fundamental frequencies of resolved and unresolved harmonics: Evidence for two pitch mechanisms?
- R. P. Carlyon and T. M. Shackleton, "Comparing the fundamental frequencies of resolved and unresolved harmonics: evidence for two pitch mechanisms?," J. Acoust. Soc. Amer., vol. 95, pp. 3541-3554, 1994.
- (1994) J. Acoust. Soc. Amer. , vol.95 , pp. 3541-3554
- Carlyon, R.P.¹ Shackleton, T.M.²

8
- 0003479143
- Cambridge, U.K.: Cambridge Univ. Press
- M. P. Cooke, Modeling Auditory Processing and Organization. Cambridge, U.K.: Cambridge Univ. Press, 1993.
- (1993) Modeling Auditory Processing and Organization
- Cooke, M.P.¹

9
- 0035342414
- Robust automatic speech recognition with missing and unreliable acoustic data
- M. P. Cooke, P. D. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data," Speech Commun., vol. 34, pp. 267-285, 2001.
- (2001) Speech Commun. , vol.34 , pp. 267-285
- Cooke, M.P.¹ Green, P.D.² Josifovski, L.³ Vizinho, A.⁴

10
- 0037750051
- Sound Source Separation Via Computational Auditory Scene Analysis (CASA)-enhanced Beamforming
- Ph.D. dissertation, Dept. Elect. Comput. Eng., Northwestern Univ., Evanston, IL
- L. A. Drake, "Sound source separation via computational auditory scene analysis (CASA)-enhanced beamforming," Ph.D. dissertation, Dept. Elect. Comput. Eng., Northwestern Univ., Evanston, IL, 2001.
- (2001)
- Drake, L.A.¹

11
- 0003424145
- New York: Macmillan
- J. R. Deller, J. G. Proakis, and J. H. L. Hansen, Discrete-Time Processing of Speech Signals. New York: Macmillan, 1993.
- (1993) Discrete-Time Processing of Speech Signals
- Deller, J.R.¹ Proakis, J.G.² Hansen, J.H.L.³

12
- 0003794341
- Ph.D. dissertation, Dept. Elect. Eng. Comput. Sci., MIT, Cambridge, MA
- D. P. W. Ellis, "Prediction-Driven Computational Auditory Scene Analysis," Ph.D. dissertation, Dept. Elect. Eng. Comput. Sci., MIT, Cambridge, MA, 1996.
- (1996) Prediction-Driven Computational Auditory Scene Analysis
- Ellis, D.P.W.¹

13
- 0029345417
- A signal subspace approach for speech enhancement
- July
- Y. Ephraim and H. L. Van Trees, "A signal subspace approach for speech enhancement," IEEE Trans. Speech Audio Processing, vol. 3, pp. 251-266, July 1995.
- (1995) IEEE Trans. Speech Audio Processing , vol.3 , pp. 251-266
- Ephraim, Y.¹ Van Trees, H.L.²

14
- 0028312802
- Auditory models and human performance in tasks related to speech coding and speech recognition
- Jan
- O. Ghitza, "Auditory models and human performance in tasks related to speech coding and speech recognition," IEEE Trans. Speech Audio Processing, vol. 2, pp. 115-132, Jan. 1994.
- (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 115-132
- Ghitza, O.¹

15
- 0004220068
- Braunschweig, Germany: Vieweg and Son
- H. Helmholtz, On the Sensations of Tone. Braunschweig, Germany: Vieweg & Son, 1863.
- (1863) On the Sensations of Tone
- Helmholtz, H.¹

16
- 17544384941
- Monaural speech segregation based on pitch tracking and amplitude modulation
- G. Hu and D. L. Wang, "Monaural speech segregation based on pitch tracking and amplitude modulation," in Int. Conf. Acoustics, Speech and Signal Processing, 2002, pp. 553-556.
- (2002) Int. Conf. Acoustics, Speech and Signal Processing , pp. 553-556
- Hu, G.¹ Wang, D.L.²

17
- 0004056285
- Englewood Cliffs, NJ: Prentice-Hall
- X. Huang, A. Acero, and H. W. Hon, Spoken Language Processing. Englewood Cliffs, NJ: Prentice-Hall, 2001.
- (2001) Spoken Language Processing
- Huang, X.¹ Acero, A.² Hon, H.W.³

18
- 0001463644
- A duplex theory of pitch perception
- J. C. R. Licklider, "A duplex theory of pitch perception," Experientia, vol. 7, pp. 128-134, 1951.
- (1951) Experientia , vol.7 , pp. 128-134
- Licklider, J.C.R.¹

19
- 0029345416
- A comparison of signal processing front ends for automatic word recognition
- July
- C. R. Jankowski, H. H. Vo, and R. P. Lippmann, "A comparison of signal processing front ends for automatic word recognition," IEEE Trans. Speech Audio Processing, vol. 3, pp. 286-293, July 1995.
- (1995) IEEE Trans. Speech Audio Processing , vol.3 , pp. 286-293
- Jankowski, C.R.¹ Vo, H.H.² Lippmann, R.P.³

20
- 0035472866
- Speech enhancement using a constrained iterative sinusoidal model
- Oct
- J. Jensen and J. H. L. Hansen, "Speech enhancement using a constrained iterative sinusoidal model," IEEE Trans. Speech Audio Processing, vol. 9, pp. 731-740, Oct. 2001.
- (2001) IEEE Trans. Speech Audio Processing , vol.9 , pp. 731-740
- Jensen, J.¹ Hansen, J.H.L.²

21
- 0030193445
- Two decades of array signal processing research: The parametric approach
- July
- H. Krim and M. Viberg, "Two decades of array signal processing research: the parametric approach," IEEE Signal Processing Mag., vol. 13, pp. 67-94, July 1996.
- (1996) IEEE Signal Processing Mag. , vol.13 , pp. 67-94
- Krim, H.¹ Viberg, M.²

22
- 84958900313
- Cambridge, MA: MIT Press
- W. J. M. Levelt, Speaking: From Intention to Articulation. Cambridge, MA: MIT Press, 1989.
- (1989) Speaking: From Intention to Articulation
- Levelt, W.J.M.¹

23
- 0035396555
- Noise power spectral density estimation based on optimal smoothing and minimum statistics
- July
- R. Martin, "Noise power spectral density estimation based on optimal smoothing and minimum statistics," IEEE Trans. Speech Audio Processing, vol. 9, pp. 504-512, July 2001.
- (2001) IEEE Trans. Speech Audio Processing , vol.9 , pp. 504-512
- Martin, R.¹

24
- 0023944462
- Simulation of auditory-neural transduction: Further studies
- R. Meddis, "Simulation of auditory-neural transduction: further studies," J. Acoust. Soc. Amer., vol. 83, pp. 1056-1063, 1988.
- (1988) J. Acoust. Soc. Amer. , vol.83 , pp. 1056-1063
- Meddis, R.¹

25
- 0030846123
- A unitary model of pitch perception
- R. Meddis and L. O'Mard, "A unitary model of pitch perception," J. Acoust. Soc. Amer., vol. 102, pp. 1811-1820, 1997.
- (1997) J. Acoust. Soc. Amer. , vol.102 , pp. 1811-1820
- Meddis, R.¹ O'Mard, L.²

26
- 0003789815
- 4th ed. San Diego, CA: Academic
- B. C. J. Moore, An Introduction to the Psychology of Hearing, 4th ed. San Diego, CA: Academic, 1997.
- (1997) An Introduction to the Psychology of Hearing
- Moore, B.C.J.¹

27
- 0142056390
- Appl. Psychol. Unit, Cambridge Univ., Cambridge, U.K., APU Rep. 2341
- R. D. Patterson, I. Nimmo-Smith, J. Holdsworth, and P. Rice, "An efficient auditory filterbank based on the gammatone function," Appl. Psychol. Unit, Cambridge Univ., Cambridge, U.K., APU Rep. 2341, 1988.
- (1988) An Efficient Auditory Filterbank Based on the Gammatone Function
- Patterson, R.D.¹ Nimmo-Smith, I.² Holdsworth, J.³ Rice, P.⁴

28
- 0014271904
- The ear as a frequency analyzer II
- R. Plomp and A. M. Mimpen, "The ear as a frequency analyzer II," J. Acoust. Soc. Amer., vol. 43, pp. 764-767, 1968.
- (1968) J. Acoust. Soc. Amer. , vol.43 , pp. 764-767
- Plomp, R.¹ Mimpen, A.M.²

29
- 0003444613
- Mahwah, NJ: Lawrence Erlbaum
- D. F. Rosenthal and H. G. Okuno, Computational Auditory Scene Analysis. Mahwah, NJ: Lawrence Erlbaum, 1998.
- (1998) Computational Auditory Scene Analysis
- Rosenthal, D.F.¹ Okuno, H.G.²

30
- 0032166087
- HMM-based strategies for enhancement of speech signals embedded in nonstationary noise
- Sept
- H. Sameti, H. Sheikhzadeh, L. Deng, and R. L. Brennan, "HMM-based strategies for enhancement of speech signals embedded in nonstationary noise," IEEE Trans. Speech Audio Processing, vol. 6, pp. 445-455, Sept. 1998.
- (1998) IEEE Trans. Speech Audio Processing , vol.6 , pp. 445-455
- Sameti, H.¹ Sheikhzadeh, H.² Deng, L.³ Brennan, R.L.⁴

31
- 0002296637
- On the importance of time - A temporal representation of sound
- M. P. Cooke, S. Beet, and M. Crawford, Eds. New York: Wiley
- M. Slaney and R. F. Lyon, "On the importance of time - a temporal representation of sound," in Visual Representations of Speech Signals, M. P. Cooke, S. Beet, and M. Crawford, Eds. New York: Wiley, 1993, pp. 95-116.
- (1993) Visual Representations of Speech Signals , pp. 95-116
- Slaney, M.¹ Lyon, R.F.²

32
- 0030188146
- Primitive auditory segregation based on oscillatory correlation
- D. L. Wang, "Primitive auditory segregation based on oscillatory correlation," Cogn. Sci., vol. 20, pp. 409-456, 1996.
- (1996) Cogn. Sci. , vol.20 , pp. 409-456
- Wang, D.L.¹

33
- 0032682770
- Separation of speech from interfering sounds based on oscillatory correlation
- May
- D. L. Wang and G. J. Brown, "Separation of speech from interfering sounds based on oscillatory correlation," IEEE Trans. Neural Networks, vol. 10, pp. 684-697, May 1999.
- (1999) IEEE Trans. Neural Networks , vol.10 , pp. 684-697
- Wang, D.L.¹ Brown, G.J.²

34
- 0003982501
- Ph.D. dissertation, Dept. Elect. Eng., Stanford Univ., Stanford, CA
- M. Weintraub, "A Theory and Computational Model of Auditory Monaural Sound Separation," Ph.D. dissertation, Dept. Elect. Eng., Stanford Univ., Stanford, CA, 1985.
- (1985) A Theory and Computational Model of Auditory Monaural Sound Separation
- Weintraub, M.¹

35
- 0037767686
- A multipitch tracking algorithm for noisy speech
- May
- M. Wu, D. L. Wang, and G. J. Brown, "A multipitch tracking algorithm for noisy speech," IEEE Trans. Speech Audio Processing, vol. 11, pp. 229-241, May 2003.
- (2003) IEEE Trans. Speech Audio Processing , vol.11 , pp. 229-241
- Wu, M.¹ Wang, D.L.² Brown, G.J.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.