Volume 18, Issue 8, 2010, Pages 2067-2079

A tandem algorithm for pitch estimation and voiced speech segregation

Author keywords

Computational auditory scene analysis (CASA); Iterative procedure; Pitch estimation; Speech segregation; Tandem algorithm

Indexed keywords

COMPUTATIONAL AUDITORY SCENE ANALYSIS; ITERATIVE PROCEDURES; PITCH ESTIMATION; SPEECH SEGREGATION; TANDEM ALGORITHM;

EID: 77955695149     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2010.2041110     Document Type: Article
Times cited : (314)

References (40)
  • 1
    • 85093707396
    • Enhanced pitch tracking and the processing of F0 contours for computer aided intonation teaching
    • P.C. Bagshaw, S. Hiller, and M.A. Jack, "Enhanced pitch tracking and the processing of F0 contours for computer aided intonation teaching," in Proc. Eurospeech, 1993, pp. 1003-1006.
    • (1993) Proc. Eurospeech , pp. 1003-1006
    • Bagshaw, P.C.1    Hiller, S.2    Jack, M.A.3
  • 2
    • 11144316019
    • Decoding speech in the presence of other sources
    • J. Barker, M. Cooke, and D. Ellis, "Decoding speech in the presence of other sources," Speech Commun., vol. 45, pp. 5-25, 2005.
    • (2005) Speech Commun. , vol.45 , pp. 5-25
    • Barker, J.1    Cooke, M.2    Ellis, D.3
  • 3
    • 84937035392
    • Estimating and interpreting the instantaneous frequency of a signal. I. Fundamentals
    • B. Boashash, "Estimating and interpreting the instantaneous frequency of a signal. I. Fundamentals," Proc. IEEE, vol. 80, pp. 520-538, 1992.
    • (1992) Proc. IEEE , vol.80 , pp. 520-538
    • Boashash, B.1
  • 4
    • 84941871268
    • Estimating and interpreting the instantaneous frequency of a signal. II. Algorithms and applications
    • Apr.
    • B. Boashash, "Estimating and interpreting the instantaneous frequency of a signal. II. Algorithms and applications," Proc. IEEE, vol. 80, no. 4, pp. 540-568, Apr. 1992.
    • (1992) Proc. IEEE , vol.80 , Issue.4 , pp. 540-568
    • Boashash, B.1
  • 7
    • 0000583248
    • Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition
    • F. Fogelman-Soulie and J. Herault, Eds. New York: Springer
    • J. Bridle, "Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition," in Neurocomputing: Algorithms, Architectures, and Applications, F. Fogelman-Soulie and J. Herault, Eds. New York: Springer, 1989, pp. 227-236.
    • (1989) Neurocomputing: Algorithms, Architectures, and Applications , pp. 227-236
    • Bridle, J.1
  • 8
    • 0028531926
    • Computational auditory scene analysis
    • G.J. Brown and M. Cooke, "Computational auditory scene analysis," Comput. Speech Lang., vol. 8, pp. 297-336, 1994.
    • (1994) Comput. Speech Lang. , vol.8 , pp. 297-336
    • Brown, G.J.1    Cooke, M.2
  • 9
    • 33845354768
    • Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation
    • D.S. Brungart, P.S. Chang, B.D. Simpson, and D.L. Wang, "Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation," J. Acoust. Soc. Amer., vol. 120, pp. 4007-4018, 2006.
    • (2006) J. Acoust. Soc. Amer. , vol.120 , pp. 4007-4018
    • Brungart, D.S.1    Chang, P.S.2    Simpson, B.D.3    Wang, D.L.4
  • 13
    • 34248183857
    • DARPA TIMIT acoustic-phonetic continuous speech corpus
    • NISTIR 4930
    • J. Garofolo et al., "DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus," National Inst. of Standards and Technol., 1993, NISTIR 4930.
    • (1993) National Inst. of Standards and Technol.
    • Garofolo, J.1
  • 15
    • 85045165251
    • Ph.D. dissertation, Biophysics Program, Ohio State Univ., Columbus
    • G. Hu, "Monaural speech organization and segregation," Ph.D. dissertation, Biophysics Program, Ohio State Univ., Columbus, 2006.
    • (2006) Monaural Speech Organization and Segregation
    • Hu, G.1
  • 16
    • 4644265990
    • Monaural speech segregation based on pitch tracking and amplitude modulation
    • G. Hu and D.L. Wang, "Monaural speech segregation based on pitch tracking and amplitude modulation," IEEE Trans. Neural Netw., vol. 15, no. 5, pp. 1135-1150, 2004.
    • (2004) IEEE Trans. Neural Netw. , vol.15 , Issue.5 , pp. 1135-1150
    • Hu, G.1    Wang, D.L.2
  • 17
    • 46049084696
    • An auditory scene analysis approach to monaural speech segregation
    • E. Hansler and G. Schmidt, Eds. Heidelberg, Germany: Springer
    • G. Hu and D.L. Wang, "An auditory scene analysis approach to monaural speech segregation," in Topics in Acoustic Echo and Noise Control, E. Hansler and G. Schmidt, Eds. Heidelberg, Germany: Springer, 2006, pp. 485-515.
    • (2006) Topics in Acoustic Echo and Noise Control , pp. 485-515
    • Hu, G.1    Wang, D.L.2
  • 18
    • 38849102154
    • Auditory segmentation based on onset and offset analysis
    • Feb.
    • G. Hu and D.L. Wang, "Auditory segmentation based on onset and offset analysis," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 2, pp. 396-405, Feb. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.2 , pp. 396-405
    • Hu, G.1    Wang, D.L.2
  • 19
    • 49249107353
    • Segregation of unvoiced speech from nonspeech interference
    • G. Hu and D.L. Wang, "Segregation of unvoiced speech from nonspeech interference," J. Acoust. Soc. Amer., vol. 124, pp. 1306-1319, 2008.
    • (2008) J. Acoust. Soc. Amer. , vol.124 , pp. 1306-1319
    • Hu, G.1    Wang, D.L.2
  • 20
    • 70349209415
    • Incorporating spectral subtraction and noise type for unvoiced speech segregation
    • K. Hu and D.L. Wang, "Incorporating spectral subtraction and noise type for unvoiced speech segregation," in Proc. IEEE ICASSP, 2009, pp. 4425-4428.
    • (2009) Proc. IEEE ICASSP , pp. 4425-4428
    • Hu, K.1    Wang, D.L.2
  • 22
    • 0014568991
    • IEEE recommended practice for speech quality measurements
    • Sep.
    • "IEEE recommended practice for speech quality measurements," IEEE Trans. Audio Electroacoust., vol. AE-17, no. 3, pp. 225-246, Sep. 1969.
    • (1969) IEEE Trans. Audio Electroacoust. , vol.AE-17 , Issue.3 , pp. 225-246
  • 23
    • 4644223415
    • A temporal-analysis-based pitch estimation system for noisy speech with a comparative study of performance of recent systems
    • Sep.
    • A. Khurshid and S.L. Denham, "A temporal-analysis-based pitch estimation system for noisy speech with a comparative study of performance of recent systems," IEEE Trans. Neural Netw., vol. 15, no. 5, pp. 1112-1124, Sep. 2004.
    • (2004) IEEE Trans. Neural Netw. , vol.15 , Issue.5 , pp. 1112-1124
    • Khurshid, A.1    Denham, S.L.2
  • 24
    • 50249167077
    • Single and multiple F0 contour estimation through parametric spectrogram modeling of speech in noisy environments
    • May
    • J. Le Roux, H. Kameoka, N. Ono, A. de Cheveigne, and S. Sagayama, "Single and multiple F0 contour estimation through parametric spectrogram modeling of speech in noisy environments," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 4, pp. 1135-1145, May 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.4 , pp. 1135-1145
    • Le Roux, J.1    Kameoka, H.2    Ono, N.3    De Cheveigne, A.4    Sagayama, S.5
  • 25
    • 40749125179
    • Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction
    • N. Li and P.C. Loizou, "Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction," J. Acoust. Soc. Amer., vol. 123, pp. 1673-1682, 2008.
    • (2008) J. Acoust. Soc. Amer. , vol.123 , pp. 1673-1682
    • Li, N.1    Loizou, P.C.2
  • 26
    • 40949108726
    • Monaural speech separation based on computational auditory scene analysis and objective quality assessment of speech
    • Nov.
    • P. Li, Y. Guan, B. Xu, and W. Liu, "Monaural speech separation based on computational auditory scene analysis and objective quality assessment of speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 6, pp. 2014-2023, Nov. 2006.
    • (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.6 , pp. 2014-2023
    • Li, P.1    Guan, Y.2    Xu, B.3    Liu, W.4
  • 27
    • 0031187171
    • Speech recognition by machines and humans
    • R.P. Lippmann, "Speech recognition by machines and humans," Speech Commun., vol. 22, pp. 1-16, 1997.
    • (1997) Speech Commun. , vol.22 , pp. 1-16
    • Lippmann, R.P.1
  • 29
    • 0003257037
    • The prosody of speech: Melody and rhythm
    • W.J. Hardcastle and J. Laver, Eds. Oxford, U.K.: Blackwell
    • S.G. Nooteboom, "The prosody of speech: Melody and rhythm," in The Handbook of Phonetic Sciences, W.J. Hardcastle and J. Laver, Eds. Oxford, U.K.: Blackwell, 1997, pp. 640-673.
    • (1997) The Handbook of Phonetic Sciences , pp. 640-673
    • Nooteboom, S.G.1
  • 31
    • 33845940172
    • A maximum likelihood estimation of vocal-tract-related filter characteristics for single channel speech separation
    • Article 84186
    • M.H. Radfar, R.M. Dansereau, and A. Sayadiyan, "A maximum likelihood estimation of vocal-tract-related filter characteristics for single channel speech separation," EURASIP J. Audio Speech Music Process., vol. 2007, p. 15, 2007, Article 84186.
    • (2007) EURASIP J. Audio Speech Music Process. , vol.2007 , pp. 15
    • Radfar, M.H.1    Dansereau, R.M.2    Sayadiyan, A.3
  • 32
    • 56249144712
    • Soft mask methods for single-channel speaker separation
    • Aug.
    • A.M. Reddy and B. Raj, "Soft mask methods for single-channel speaker separation," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 6, pp. 1766-1776, Aug. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.6 , pp. 1766-1776
    • Reddy, A.M.1    Raj, B.2
  • 33
    • 0031124228
    • A pitch determination and voiced/unvoiced decision algorithm for noisy speech
    • J. Rouat, Y.C. Liu, and D. Morissette, "A pitch determination and voiced/unvoiced decision algorithm for noisy speech," Speech Commun., vol. 21, pp. 191-207, 1997.
    • (1997) Speech Commun. , vol.21 , pp. 191-207
    • Rouat, J.1    Liu, Y.C.2    Morissette, D.3
  • 34
    • 0000646059
    • Learning internal representations by error propagation
    • D.E. Rumelhart and J.L. McClelland, Eds. Cambridge, MA: MIT Press
    • D.E. Rumelhart, G.E. Hinton, and R.J. Williams, "Learning internal representations by error propagation," in Parallel Distributed Processing, D.E. Rumelhart and J.L. McClelland, Eds. Cambridge, MA: MIT Press, 1986, pp. 318-362.
    • (1986) Parallel Distributed Processing , pp. 318-362
    • Rumelhart, D.E.1    Hinton, G.E.2    Williams, R.J.3
  • 36
    • 15844428932
    • Human and machine consonant recognition
    • J.J. Sroka and L.D. Braida, "Human and machine consonant recognition," Speech Commun., vol. 45, pp. 410-423, 2005.
    • (2005) Speech Commun. , vol.45 , pp. 410-423
    • Sroka, J.J.1    Braida, L.D.2
  • 37
    • 84892233308
    • On ideal binary mask as the computational goal of auditory scene analysis
    • P. Divenyi, Ed. Norwell, MA: Kluwer
    • D.L. Wang, "On ideal binary mask as the computational goal of auditory scene analysis," in Speech Separation by Humans and Machines, P. Divenyi, Ed. Norwell, MA: Kluwer, 2005, pp. 181-197.
    • (2005) Speech Separation by Humans and Machines , pp. 181-197
    • Wang, D.L.1
  • 38
    • 0032682770
    • Separation of speech from interfering sounds based on oscillatory correlation
    • May
    • D.L. Wang and G.J. Brown, "Separation of speech from interfering sounds based on oscillatory correlation," IEEE Trans. Neural Netw., vol. 10, pp. 684-697, May 1999.
    • (1999) IEEE Trans. Neural Netw. , vol.10 , pp. 684-697
    • Wang, D.L.1    Brown, G.J.2
  • 40
    • 0037767686
    • A multipitch tracking algorithm for noisy speech
    • May
    • M. Wu, D.L. Wang, and G.J. Brown, "A multipitch tracking algorithm for noisy speech," IEEE Trans. Speech Audio Process., vol. 11, no. 3, pp. 229-241, May 2003.
    • (2003) IEEE Trans. Speech Audio Process. , vol.11 , Issue.3 , pp. 229-241
    • Wu, M.1    Wang, D.L.2    Brown, G.J.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.