



Volume 19, Issue 6, 2011, Pages 1600-1609

Unvoiced Speech Segregation From Nonspeech Interference via CASA and Spectral Subtraction

Author keywords

Bayesian classification; computational auditory scene analysis (CASA); nonspeech interference; spectral subtraction; unvoiced speech segregation


EID: 85008054377     PISSN: 15587916     EISSN: 15587924     Source Type: Journal    
DOI: 10.1109/TASL.2010.2093893     Document Type: Article
Times cited: 53

References (37)
  • 2
    • 0018320733
    • Enhancement of speech corrupted by acoustic noise
    • M. Berouti, R. Schwartz, and J. Makhoul, “Enhancement of speech corrupted by acoustic noise,” in Proc. IEEE ICASSP, 1979, pp. 208–211.
    • (1979) Proc. IEEE ICASSP , pp. 208-211
    • Berouti, M.1    Schwartz, R.2    Makhoul, J.3
  • 3
    • 85008019627
    • December 27, Praat: Doing Phonetics by Computer, ver. 5.0.02 [Online]. Available: http://www.fon.hum.uva.nl/praat
    • P. Boersma and D. Weenink, December 27, 2007, Praat: Doing Phonetics by Computer, ver. 5.0.02 [Online]. Available: http://www.fon.hum.uva.nl/praat
    • (2007)
    • Boersma, P.1    Weenink, D.2
  • 5
    • 0028531926
    • Computational auditory scene analysis
    • G. J. Brown and M. Cooke, “Computational auditory scene analysis,” Comput. Speech Lang., vol. 8, pp. 297–336, 1994.
    • (1994) Comput. Speech Lang. , vol.8 , pp. 297-336
    • Brown, G.J.1    Cooke, M.2
  • 6
    • 33845354768
    • Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation
    • D. S. Brungart, P. S. Chang, B. D. Simpson, and D. L. Wang, “Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation,” J. Acoust. Soc. Amer., vol. 120, pp. 4007–4018, 2006.
    • (2006) J. Acoust. Soc. Amer. , vol.120 , pp. 4007-4018
    • Brungart, D.S.1    Chang, P.S.2    Simpson, B.D.3    Wang, D.L.4
  • 8
    • 0004191790
    • New York: Thieme Medical Publishers
    • H. Dillon, Hearing Aids. New York: Thieme Medical Publishers, 2001.
    • (2001) Hearing Aids
    • Dillon, H.1
  • 10
    • 85045165251
    • Ph.D. dissertation, Biophys. Program, Ohio State Univ., Columbus
    • G. Hu, “Monaural Speech Organization and Segregation,” Ph.D. dissertation, Biophys. Program, Ohio State Univ., Columbus, 2006.
    • (2006) Monaural Speech Organization and Segregation
    • Hu, G.1
  • 11
    • 85008039975
    • 100 Nonspeech Sounds Online. [Online]. Available: http://www.cse.ohio-state.edu/pnl/corpus/HuCorpus.html
    • G. Hu, 2006, 100 Nonspeech Sounds Online. [Online]. Available: http://www.cse.ohio-state.edu/pnl/corpus/HuCorpus.html
    • (2006)
    • Hu, G.1
  • 12
    • 4644265990
    • Monaural speech segregation based on pitch tracking and amplitude modulation
    • Sep.
    • G. Hu and D. L. Wang, “Monaural speech segregation based on pitch tracking and amplitude modulation,” IEEE Trans. Neural Netw., vol. 15, no. 5, pp. 1135–1150, Sep. 2004.
    • (2004) IEEE Trans. Neural Netw. , vol.15 , Issue.5 , pp. 1135-1150
    • Hu, G.1    Wang, D.L.2
  • 13
    • 49249107353
    • Segregation of unvoiced speech from non-speech interference
    • G. Hu and D. L. Wang, “Segregation of unvoiced speech from non-speech interference,” J. Acoust. Soc. Amer., vol. 124, pp. 1306–1319, 2008.
    • (2008) J. Acoust. Soc. Amer. , vol.124 , pp. 1306-1319
    • Hu, G.1    Wang, D.L.2
  • 14
    • 77955695149
    • A tandem algorithm for pitch estimation and voiced speech segregation
    • Nov.
    • G. Hu and D. L. Wang, “A tandem algorithm for pitch estimation and voiced speech segregation,” IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 8, pp. 2067–2079, Nov. 2010.
    • (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.8 , pp. 2067-2079
    • Hu, G.1    Wang, D.L.2
  • 15
    • 70349209415
    • Incorporating spectral subtraction and noise type for unvoiced speech segregation
    • K. Hu and D. L. Wang, “Incorporating spectral subtraction and noise type for unvoiced speech segregation,” in Proc. IEEE ICASSP, 2009, pp. 4425–4428.
    • (2009) Proc. IEEE ICASSP , pp. 4425-4428
    • Hu, K.1    Wang, D.L.2
  • 16
    • 35248891610
    • A comparative intelligibility study of single-microphone noise reduction algorithms
    • Y. Hu and P. C. Loizou, “A comparative intelligibility study of single-microphone noise reduction algorithms,” J. Acoust. Soc. Amer., vol. 122, no. 3, pp. 1777–1786, 2007.
    • (2007) J. Acoust. Soc. Amer. , vol.122 , Issue.3 , pp. 1777-1786
    • Hu, Y.1    Loizou, P.C.2
  • 17
    • 0014568991
    • IEEE recommended practice for speech quality measurements
    • IEEE, “IEEE recommended practice for speech quality measurements,” IEEE Trans. Audio Electroacoust., vol. AE-17, pp. 225–246, 1969.
    • (1969) IEEE Trans. Audio Electroacoust. , vol.AE-17 , pp. 225-246
  • 18
    • 65249103478
    • A supervised learning approach to monaural segregation of reverberant speech
    • May
    • Z. Jin and D. L. Wang, “A supervised learning approach to monaural segregation of reverberant speech,” IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 4, pp. 625–638, May 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.4 , pp. 625-638
    • Jin, Z.1    Wang, D.L.2
  • 20
    • 40749125179
    • Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction
    • N. Li and P. C. Loizou, “Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction,” J. Acoust. Soc. Amer., vol. 123, pp. 1673–1682, 2008.
    • (2008) J. Acoust. Soc. Amer. , vol.123 , pp. 1673-1682
    • Li, N.1    Loizou, P.C.2
  • 21
    • 40949108726
    • Monaural speech separation based on computational auditory scene analysis and objective quality assessment of speech
    • Nov.
    • P. Li, Y. Guan, B. Xu, and W. Liu, “Monaural speech separation based on computational auditory scene analysis and objective quality assessment of speech,” IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 6, pp. 2014–2023, Nov. 2006.
    • (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.6 , pp. 2014-2023
    • Li, P.1    Guan, Y.2    Xu, B.3    Liu, W.4
  • 22
    • 58149196390
    • On the optimality of ideal binary time-frequency masks
    • Y. Li and D. L. Wang, “On the optimality of ideal binary time-frequency masks,” Speech Commun., vol. 51, pp. 230–239, 2009.
    • (2009) Speech Commun. , vol.51 , pp. 230-239
    • Li, Y.1    Wang, D.L.2
  • 23
    • 34447101009
    • Noise estimation using speech/non-speech frame decision and subband spectral tracking
    • Z. Lin, R. A. Goubran, and R. M. Dansereau, “Noise estimation using speech/non-speech frame decision and subband spectral tracking,” Speech Commun., vol. 49, pp. 542–557, 2007.
    • (2007) Speech Commun. , vol.49 , pp. 542-557
    • Lin, Z.1    Goubran, R.A.2    Dansereau, R.M.3
  • 25
    • 0023944462
    • Simulation of auditory-neural transduction: Further studies
    • R. Meddis, “Simulation of auditory-neural transduction: Further studies,” J. Acoust. Soc. Amer., vol. 83, pp. 1056–1063, 1988.
    • (1988) J. Acoust. Soc. Amer. , vol.83 , pp. 1056-1063
    • Meddis, R.1
  • 26
    • 0029252699
    • On the probabilistic interpretation of neural network classifiers and discriminative training criteria
    • Feb.
    • H. Ney, “On the probabilistic interpretation of neural network classifiers and discriminative training criteria,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 17, no. 2, pp. 107–119, Feb. 1995.
    • (1995) IEEE Trans. Pattern Anal. Mach. Intell. , vol.17 , Issue.2 , pp. 107-119
    • Ney, H.1
  • 27
    • 0141624530
    • An efficient auditory filterbank based on the gammatone function
    • Cambridge, U.K., APU Rep. 2341.
    • R. D. Patterson, I. Nimmo-Smith, J. Holdsworth, and P. Rice, “An efficient auditory filterbank based on the gammatone function,” Appl. Psychol. Unit, 1988, Cambridge, U.K., APU Rep. 2341.
    • (1988) Appl. Psychol. Unit
    • Patterson, R.D.1    Nimmo-Smith, I.2    Holdsworth, J.3    Rice, P.4
  • 28
    • 48849091396
    • Single-channel speech separation using soft masking filtering
    • Nov.
    • M. H. Radfar and R. M. Dansereau, “Single-channel speech separation using soft masking filtering,” IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2299–2310, Nov. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.8 , pp. 2299-2310
    • Radfar, M.H.1    Dansereau, R.M.2
  • 29
    • 33845940172
    • A maximum likelihood estimation of vocal-tract-related filter characteristics for single channel speech separation
    • 10.1155/2007/84186, Article ID 84186, 15 pages
    • M. H. Radfar, R. M. Dansereau, and A. Sayadiyan, “A maximum likelihood estimation of vocal-tract-related filter characteristics for single channel speech separation,” EURASIP J. Audio, Speech, Music Process., vol. 2007, 2007, 10.1155/2007/84186, Article ID 84186, 15 pages.
    • (2007) EURASIP J. Audio, Speech, Music Process. , vol.2007
    • Radfar, M.H.1    Dansereau, R.M.2    Sayadiyan, A.3
  • 31
    • 51449109652
    • Codebook-based Bayesian speech enhancement for nonstationary environments
    • Feb.
    • S. Srinivasan, J. Samuelsson, and W. B. Kleijn, “Codebook-based Bayesian speech enhancement for nonstationary environments,” IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 2, pp. 441–452, Feb. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.2 , pp. 441-452
    • Srinivasan, S.1    Samuelsson, J.2    Kleijn, W.B.3
  • 33
    • 70349448618
    • An algorithm for speech segregation of co-channel speech
    • S. Vishnubhotla and C. Y. Espy-Wilson, “An algorithm for speech segregation of co-channel speech,” in Proc. IEEE ICASSP, 2009, pp. 109–112.
    • (2009) Proc. IEEE ICASSP , pp. 109-112
    • Vishnubhotla, S.1    Espy-Wilson, C.Y.2
  • 34
    • 84892233308
    • On ideal binary mask as the computational goal of auditory scene analysis
    • P. Divenyi, Ed. Norwell, MA: Kluwer
    • D. L. Wang, “On ideal binary mask as the computational goal of auditory scene analysis,” in Speech Separation by Humans and Machines, P. Divenyi, Ed. Norwell, MA: Kluwer, 2005, pp. 181–197.
    • (2005) Speech Separation by Humans and Machines , pp. 181-197
    • Wang, D.L.1
  • 35
    • 0032682770
    • Separation of speech from interfering sounds based on oscillatory correlation
    • May
    • D. L. Wang and G. J. Brown, “Separation of speech from interfering sounds based on oscillatory correlation,” IEEE Trans. Neural Netw., vol. 10, no. 3, pp. 684–697, May 1999.
    • (1999) IEEE Trans. Neural Netw. , vol.10 , Issue.3 , pp. 684-697
    • Wang, D.L.1    Brown, G.J.2
  • 37
    • 64649103540
    • Speech intelligibility in background noise with ideal binary time-frequency masking
    • D. L. Wang, U. Kjems, M. S. Pedersen, J. B. Boldt, and T. Lunner, “Speech intelligibility in background noise with ideal binary time-frequency masking,” J. Acoust. Soc. Amer., vol. 125, pp. 2336–2347, 2009.
    • (2009) J. Acoust. Soc. Amer. , vol.125 , pp. 2336-2347
    • Wang, D.L.1    Kjems, U.2    Pedersen, M.S.3    Boldt, J.B.4    Lunner, T.5


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.