SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 18, Issue 8, 2010, Pages 2067-2079

A tandem algorithm for pitch estimation and voiced speech segregation

(2) Hu, Guoning a,b Wang, DeLiang c

a Ohio State University (United States)

b AOL Truveo Video Search (United States)

c The Ohio State University (United States)

Author keywords

Computational auditory scene analysis (CASA); Iterative procedure; Pitch estimation; Speech segregation; Tandem algorithm

Indexed keywords

COMPUTATIONAL AUDITORY SCENE ANALYSIS; ITERATIVE PROCEDURES; PITCH ESTIMATION; SPEECH SEGREGATION; TANDEM ALGORITHM;

ACOUSTIC VARIABLES MEASUREMENT; ALGORITHMS; ESTIMATION; ITERATIVE METHODS; PATIENT REHABILITATION; SEGREGATION (METALLOGRAPHY); SPEECH ANALYSIS; SPEECH COMMUNICATION;

CONTINUOUS SPEECH RECOGNITION;

EID: 77955695149 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2010.2041110 Document Type: Article

Times cited : (314)

References (40)

1
- 85093707396
- Enhanced pitch tracking and the processing of F0 contours for computer aided intonation teaching
- P.C. Bagshaw, S. Hiller, and M.A. Jack, "Enhanced pitch tracking and the processing of F0 contours for computer aided intonation teaching," in Proc. Eurospeech, 1993, pp. 1003-1006.
- (1993) Proc. Eurospeech , pp. 1003-1006
- Bagshaw, P.C.¹ Hiller, S.² Jack, M.A.³

2
- 11144316019
- Decoding speech in the presence of other sources
- J. Barker, M. Cooke, and D. Ellis, "Decoding speech in the presence of other sources," Speech Commun., vol. 45, pp. 5-25, 2005.
- (2005) Speech Commun. , vol.45 , pp. 5-25
- Barker, J.¹ Cooke, M.² Ellis, D.³

3
- 84937035392
- Estimating and interpreting the instantaneous frequency of a signal. I. Fundamentals
- B. Boashash, "Estimating and interpreting the instantaneous frequency of a signal. I. Fundamentals," Proc. IEEE, vol. 80, pp. 520-538, 1992.
- (1992) Proc. IEEE , vol.80 , pp. 520-538
- Boashash, B.¹

4
- 84941871268
- Estimating and interpreting the instantaneous frequency of a signal. II. Algorithms and applications
- Apr.
- B. Boashash, "Estimating and interpreting the instantaneous frequency of a signal. II. Algorithms and applications," Proc. IEEE, vol. 80, no. 4, pp. 540-568, Apr. 1992.
- (1992) Proc. IEEE , vol.80 , Issue.4 , pp. 540-568
- Boashash, B.¹

5
- 0038120523
- P. Boersma and D. Weenink, "Praat: Doing Phonetics by Computer," 2004.
- (2004) Praat: Doing Phonetics by Computer
- Boersma, P.¹ Weenink, D.²

6
- 0003684441
- Cambridge, MA: MIT Press
- A.S. Bregman, Auditory Scene Analysis. Cambridge, MA: MIT Press, 1990.
- (1990) Auditory Scene Analysis
- Bregman, A.S.¹

7
- 0000583248
- Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition
- F. Fogelman-Soulie and J. Herault, Eds. New York: Springer
- J. Bridle, "Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition," in Neurocomputing: Algorithms, Architectures, and Applications, F. Fogelman-Soulie and J. Herault, Eds. New York: Springer, 1989, pp. 227-236.
- (1989) Neurocomputing: Algorithms, Architectures, and Applications , pp. 227-236
- Bridle, J.¹

8
- 0028531926
- Computational auditory scene analysis
- G.J. Brown and M. Cooke, "Computational auditory scene analysis," Comput. Speech Lang., vol. 8, pp. 297-336, 1994.
- (1994) Comput. Speech Lang. , vol.8 , pp. 297-336
- Brown, G.J.¹ Cooke, M.²

9
- 33845354768
- Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation
- D.S. Brungart, P.S. Chang, B.D. Simpson, and D.L. Wang, "Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation," J. Acoust. Soc. Amer., vol. 120, pp. 4007-4018, 2006.
- (2006) J. Acoust. Soc. Amer. , vol.120 , pp. 4007-4018
- Brungart, D.S.¹ Chang, P.S.² Simpson, B.D.³ Wang, D.L.⁴

10
- 0003479143
- Cambridge, U.K.: Cambridge Univ. Press
- M. Cooke, Modelling Auditory Processing and Organization. Cambridge, U.K.: Cambridge Univ. Press, 1993.
- (1993) Modelling Auditory Processing and Organization
- Cooke, M.¹

11
- 85008537526
- Multiple F0 estimation
- D.L. Wang and G.J. Brown, Eds. Hoboken, NJ: Wiley and IEEE Press
- A. de Cheveigne, "Multiple F0 estimation," in Computational Auditory Scene Analysis: Principles, Algorithms, and Applications, D.L. Wang and G.J. Brown, Eds. Hoboken, NJ: Wiley and IEEE Press, 2006, pp. 45-79.
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms, and Applications , pp. 45-79
- De Cheveigne, A.¹

12
- 0004191790
- New York: Thieme
- H. Dillon, Hearing Aids. New York: Thieme, 2001.
- (2001) Hearing Aids
- Dillon, H.¹

13
- 34248183857
- DARPA TIMIT acoustic-phonetic continuous speech corpus
- NISTIR 4930
- J. Garofolo et al., "DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus," National Inst. of Standards and Technol., 1993, NISTIR 4930.
- (1993) National Inst. of Standards and Technol.
- Garofolo, J.¹

14
- 0004220068
- 2nd ed. New York: Dover
- H. Helmholtz, On the Sensation of Tone, 2nd ed. New York: Dover, 1863.
- (1863) On the Sensation of Tone
- Helmholtz, H.¹

15
- 85045165251
- Ph.D. dissertation, Biophysics Program, Ohio State Univ., Columbus
- G. Hu, "Monaural speech organization and segregation," Ph.D. dissertation, Biophysics Program, Ohio State Univ., Columbus, 2006.
- (2006) Monaural Speech Organization and Segregation
- Hu, G.¹

16
- 4644265990
- Monaural speech segregation based on pitch tracking and amplitude modulation
- G. Hu and D.L. Wang, "Monaural speech segregation based on pitch tracking and amplitude modulation," IEEE Trans. Neural Netw., vol. 15, no. 5, pp. 1135-1150, 2004.
- (2004) IEEE Trans. Neural Netw. , vol.15 , Issue.5 , pp. 1135-1150
- Hu, G.¹ Wang, D.L.²

17
- 46049084696
- An auditory scene analysis approach to monaural speech segregation
- E. Hansler and G. Schmidt, Eds. Heidelberg, Germany: Springer
- G. Hu and D.L. Wang, "An auditory scene analysis approach to monaural speech segregation," in Topics in Acoustic Echo and Noise Control, E. Hansler and G. Schmidt, Eds. Heidelberg, Germany: Springer, 2006, pp. 485-515.
- (2006) Topics in Acoustic Echo and Noise Control , pp. 485-515
- Hu, G.¹ Wang, D.L.²

18
- 38849102154
- Auditory segmentation based on onset and offset analysis
- Feb.
- G. Hu and D.L. Wang, "Auditory segmentation based on onset and offset analysis," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 2, pp. 396-405, Feb. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.2 , pp. 396-405
- Hu, G.¹ Wang, D.L.²

19
- 49249107353
- Segregation of unvoiced speech from nonspeech interference
- G. Hu and D.L. Wang, "Segregation of unvoiced speech from nonspeech interference," J. Acoust. Soc. Amer., vol. 124, pp. 1306-1319, 2008.
- (2008) J. Acoust. Soc. Amer. , vol.124 , pp. 1306-1319
- Hu, G.¹ Wang, D.L.²

20
- 70349209415
- Incorporating spectral subtraction and noise type for unvoiced speech segregation
- K. Hu and D.L. Wang, "Incorporating spectral subtraction and noise type for unvoiced speech segregation," in Proc. IEEE ICASSP, 2009, pp. 4425-4428.
- (2009) Proc. IEEE ICASSP , pp. 4425-4428
- Hu, K.¹ Wang, D.L.²

21
- 0004056285
- Upper Saddle River, NJ: Prentice-Hall
- X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithms, and System Development. Upper Saddle River, NJ: Prentice-Hall, 2001.
- (2001) Spoken Language Processing: A Guide to Theory, Algorithms, and System Development
- Huang, X.¹ Acero, A.² Hon, H.-W.³

22
- 0014568991
- IEEE recommended practice for speech quality measurements
- Sep.
- "IEEE recommended practice for speech quality measurements," IEEE Trans. Audio Electroacoust., vol. AE-17, no. 3, pp. 225-246, Sep. 1969.
- (1969) IEEE Trans. Audio Electroacoust. , vol.17 AE , Issue.3 , pp. 225-246

23
- 4644223415
- A temporal-analysis-based pitch estimation system for noisy speech with a comparative study of performance of recent systems
- Sep.
- A. Khurshid and S.L. Denham, "A temporal-analysis-based pitch estimation system for noisy speech with a comparative study of performance of recent systems," IEEE Trans. Neural Netw., vol. 15, no. 5, pp. 1112-1124, Sep. 2004.
- (2004) IEEE Trans. Neural Netw. , vol.15 , Issue.5 , pp. 1112-1124
- Khurshid, A.¹ Denham, S.L.²

24
- 50249167077
- Single and multiple F0 contour estimation through parametric spectrogram modeling of speech in noisy environments
- May
- J. Le Roux, H. Kameoka, N. Ono, A. de Cheveigne, and S. Sagayama, "Single and multiple F0 contour estimation through parametric spectrogram modeling of speech in noisy environments," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 4, pp. 1135-1145, May 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.4 , pp. 1135-1145
- Le Roux, J.¹ Kameoka, H.² Ono, N.³ De Cheveigne, A.⁴ Sagayama, S.⁵

25
- 40749125179
- Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction
- N. Li and P.C. Loizou, "Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction," J. Acoust. Soc. Amer., vol. 123, pp. 1673-1682, 2008.
- (2008) J. Acoust. Soc. Amer. , vol.123 , pp. 1673-1682
- Li, N.¹ Loizou, P.C.²

26
- 40949108726
- Monaural speech separation based on computational auditory scene analysis and objective quality assessment of speech
- Nov.
- P. Li, Y. Guan, B. Xu, and W. Liu, "Monaural speech separation based on computational auditory scene analysis and objective quality assessment of speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 6, pp. 2014-2023, Nov. 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.6 , pp. 2014-2023
- Li, P.¹ Guan, Y.² Xu, B.³ Liu, W.⁴

27
- 0031187171
- Speech recognition by machines and humans
- R.P. Lippmann, "Speech recognition by machines and humans," Speech Commun., vol. 22, pp. 1-16, 1997.
- (1997) Speech Commun. , vol.22 , pp. 1-16
- Lippmann, R.P.¹

28
- 34447100796
- Boca Raton, FL: CRC
- P.C. Loizou, Speech Enhancement: Theory and Practice. Boca Raton, FL: CRC, 2007.
- (2007) Speech Enhancement: Theory and Practice
- Loizou, P.C.¹

29
- 0003257037
- The prosody of speech: Melody and rhythm
- W.J. Hardcastle and J. Laver, Eds. Oxford, U.K.: Blackwell
- S.G. Nooteboom, "The prosody of speech: Melody and rhythm," in The Handbook of Phonetic Sciences, W.J. Hardcastle and J. Laver, Eds. Oxford, U.K.: Blackwell, 1997, pp. 640-673.
- (1997) The Handbook of Phonetic Sciences , pp. 640-673
- Nooteboom, S.G.¹

30
- 0141624530
- An efficient auditory filterbank based on the gammatone function
- R.D. Patterson, J. Holdsworth, I. Nimmo-Smith, and P. Rice, "An efficient auditory filterbank based on the gammatone function," MRC Applied Psychology Unit, 1988, 2341.
- (1988) MRC Applied Psychology Unit , vol.2341
- Patterson, R.D.¹ Holdsworth, J.² Nimmo-Smith, I.³ Rice, P.⁴

31
- 33845940172
- A maximum likelihood estimation of vocal-tract-related filter characteristics for single channel speech separation
- Article 84186
- M.H. Radfar, R.M. Dansereau, and A. Sayadiyan, "A maximum likelihood estimation of vocal-tract-related filter characteristics for single channel speech separation," EURASIP J. Audio Speech Music Process., vol. 2007, p. 15, 2007, Article 84186.
- (2007) EURASIP J. Audio Speech Music Process. , vol.2007 , pp. 15
- Radfar, M.H.¹ Dansereau, R.M.² Sayadiyan, A.³

32
- 56249144712
- Soft mask methods for single-channel speaker separation
- Aug.
- A.M. Reddy and B. Raj, "Soft mask methods for single-channel speaker separation," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 6, pp. 1766-1776, Aug. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.6 , pp. 1766-1776
- Reddy, A.M.¹ Raj, B.²

33
- 0031124228
- A pitch determination and voiced/unvoiced decision algorithm for noisy speech
- J. Rouat, Y.C. Liu, and D. Morissette, "A pitch determination and voiced/unvoiced decision algorithm for noisy speech," Speech Commun., vol. 21, pp. 191-207, 1997.
- (1997) Speech Commun. , vol.21 , pp. 191-207
- Rouat, J.¹ Liu, Y.C.² Morissette, D.³

34
- 0000646059
- Learning internal representations by error propagation
- D.E. Rumelhart and J.L. McClelland, Eds. Cambridge, MA: MIT Press
- D.E. Rumelhart, G.E. Hinton, and R.J. Williams, "Learning internal representations by error propagation," in Parallel Distributed Processing, D.E. Rumelhart and J.L. McClelland, Eds. Cambridge, MA: MIT Press, 1986, pp. 318-362.
- (1986) Parallel Distributed Processing , pp. 318-362
- Rumelhart, D.E.¹ Hinton, G.E.² Williams, R.J.³

35
- 46049084086
- Ph.D. dissertation, Dept. of Comput. Sci. Eng., Ohio State Univ., Columbus, OH
- Y. Shao, "Sequential Organization in Computational Auditory Scene Analysis," Ph.D. dissertation, Dept. of Comput. Sci. Eng., Ohio State Univ., Columbus, OH, 2007.
- (2007) Sequential Organization in Computational Auditory Scene Analysis
- Shao, Y.¹

36
- 15844428932
- Human and machine consonant recognition
- J.J. Sroka and L.D. Braida, "Human and machine consonant recognition," Speech Commun., vol. 45, pp. 410-423, 2005.
- (2005) Speech Commun. , vol.45 , pp. 410-423
- Sroka, J.J.¹ Braida, L.D.²

37
- 84892233308
- On ideal binary mask as the computational goal of auditory scene analysis
- P. Divenyi, Ed. Norwell, MA: Kluwer
- D.L. Wang, "On ideal binary mask as the computational goal of auditory scene analysis," in Speech Separation by Humans and Machines, P. Divenyi, Ed. Norwell, MA: Kluwer, 2005, pp. 181-197.
- (2005) Speech Separation by Humans and Machines , pp. 181-197
- Wang, D.L.¹

38
- 0032682770
- Separation of speech from interfering sounds based on oscillatory correlation
- May
- D.L. Wang and G.J. Brown, "Separation of speech from interfering sounds based on oscillatory correlation," IEEE Trans. Neural Netw., vol. 10, pp. 684-697, May 1999.
- (1999) IEEE Trans. Neural Netw. , vol.10 , pp. 684-697
- Wang, D.L.¹ Brown, G.J.²

39
- 82255178542
- D.L. Wang and G.J. Brown, Eds. Hoboken, NJ: Wiley and IEEE Press
- Computational Auditory Scene Analysis: Principles, Algorithms, and Applications, D.L. Wang and G.J. Brown, Eds. Hoboken, NJ: Wiley and IEEE Press, 2006.
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

40
- 0037767686
- A multipitch tracking algorithm for noisy speech
- May
- M. Wu, D.L. Wang, and G.J. Brown, "A multipitch tracking algorithm for noisy speech," IEEE Trans. Speech Audio Process., vol. 11, no. 3, pp. 229-241, May 2003.
- (2003) IEEE Trans. Speech Audio Process. , vol.11 , Issue.3 , pp. 229-241
- Wu, M.¹ Wang, D.L.² Brown, G.J.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.