SCOPUS Information Search Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 19, Issue 6, 2011, Pages 1600-1609

Unvoiced Speech Segregation From Nonspeech Interference via CASA and Spectral Subtraction

The Ohio State University (United States)

Author keywords

Bayesian classification; computational auditory scene analysis (CASA); nonspeech interference; spectral subtraction; unvoiced speech segregation

Indexed keywords

EID: 85008054377 PISSN: 15587916 EISSN: 15587924 Source Type: Journal
DOI: 10.1109/TASL.2010.2093893 Document Type: Article

Times cited: (53)

References (37)

1
- 38849083727
- San Rafael, CA: Morgan & Claypool
- J. B. Allen, Articulation and Intelligibility. San Rafael, CA: Morgan & Claypool, 2005.
- (2005) Articulation and Intelligibility
- Allen, J.B.¹

2
- 0018320733
- Enhancement of speech corrupted by acoustic noise
- M. Berouti, R. Schwartz, and J. Makhoul, “Enhancement of speech corrupted by acoustic noise,” in Proc. IEEE ICASSP, 1979, pp. 208–211.
- (1979) Proc. IEEE ICASSP , pp. 208-211
- Berouti, M.¹ Schwartz, R.² Makhoul, J.³

3
- 85008019627
- December 27, Praat: Doing Phonetics by Computer, ver. 5.0.02 [Online]. Available: http://www.fon.hum.uva.nl/praat
- P. Boersma and D. Weenink, December 27, 2007, Praat: Doing Phonetics by Computer, ver. 5.0.02 [Online]. Available: http://www.fon.hum.uva.nl/praat
- (2007)
- Boersma, P.¹ Weenink, D.²

4
- 0003684441
- Cambridge, MA: MIT Press
- A. Bregman, Auditory Scene Analysis. Cambridge, MA: MIT Press, 1990.
- (1990) Auditory Scene Analysis
- Bregman, A.¹

5
- 0028531926
- Computational auditory scene analysis
- G. J. Brown and M. Cooke, “Computational auditory scene analysis,” Comput. Speech Lang., vol. 8, pp. 297–336, 1994.
- (1994) Comput. Speech Lang. , vol.8 , pp. 297-336
- Brown, G.J.¹ Cooke, M.²

6
- 33845354768
- Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation
- D. S. Brungart, P. S. Chang, B. D. Simpson, and D. L. Wang, “Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation,” J. Acoust. Soc. Amer., vol. 120, pp. 4007–4018, 2006.
- (2006) J. Acoust. Soc. Amer. , vol.120 , pp. 4007-4018
- Brungart, D.S.¹ Chang, P.S.² Simpson, B.D.³ Wang, D.L.⁴

7
- 0003641574
- New York: Springer-Verlag
- C. de Boor, A Practical Guide to Splines. New York: Springer-Verlag, 1978.
- (1978) A Practical Guide to Splines
- de Boor, C.¹

8
- 0004191790
- New York: Thieme Medical Publishers
- H. Dillon, Hearing Aids. New York: Thieme Medical Publishers, 2001.
- (2001) Hearing Aids
- Dillon, H.¹

9
- 0003548585
- [Online]. Available: http://www.ldc.upenn.edu/Catalog/FDC93S1.html
- J. S. Garofolo, L. F. Lamel, W. M. Fisher, J. G. Fiscus, D. S. Pallett, and N. L. Dahlgren, 1993, DARPA TIMIT Acoustic Phonetic Continuous Speech Corpus. [Online]. Available: http://www.ldc.upenn.edu/Catalog/FDC93S1.html
- (1993) DARPA TIMIT Acoustic Phonetic Continuous Speech Corpus
- Garofolo, J.S.¹ Lamel, L.F.² Fisher, W.M.³ Fiscus, J.G.⁴ Pallett, D.S.⁵ Dahlgren, N.L.⁶

10
- 85045165251
- Ph.D. dissertation, Biophys. Program, Ohio State Univ., Columbus
- G. Hu, “Monaural Speech Organization and Segregation,” Ph.D. dissertation, Biophys. Program, Ohio State Univ., Columbus, 2006.
- (2006) Monaural Speech Organization and Segregation
- Hu, G.¹

11
- 85008039975
- 100 Nonspeech Sounds Online. [Online]. Available: http://www.cse.ohio-state.edu/pnl/corpus/HuCorpus.html
- G. Hu, 2006, 100 Nonspeech Sounds Online. [Online]. Available: http://www.cse.ohio-state.edu/pnl/corpus/HuCorpus.html
- (2006)
- Hu, G.¹

12
- 4644265990
- Monaural speech segregation based on pitch tracking and amplitude modulation
- Sep.
- G. Hu and D. L. Wang, “Monaural speech segregation based on pitch tracking and amplitude modulation,” IEEE Trans. Neural Netw., vol. 15, no. 5, pp. 1135–1150, Sep. 2004.
- (2004) IEEE Trans. Neural Netw. , vol.15 , Issue.5 , pp. 1135-1150
- Hu, G.¹ Wang, D.L.²

13
- 49249107353
- Segregation of unvoiced speech from non-speech interference
- G. Hu and D. L. Wang, “Segregation of unvoiced speech from non-speech interference,” J. Acoust. Soc. Amer., vol. 124, pp. 1306–1319, 2008.
- (2008) J. Acoust. Soc. Amer. , vol.124 , pp. 1306-1319
- Hu, G.¹ Wang, D.L.²

14
- 77955695149
- A tandem algorithm for pitch estimation and voiced speech segregation
- Nov.
- G. Hu and D. L. Wang, “A tandem algorithm for pitch estimation and voiced speech segregation,” IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 8, pp. 2067–2079, Nov. 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.8 , pp. 2067-2079
- Hu, G.¹ Wang, D.L.²

15
- 70349209415
- Incorporating spectral subtraction and noise type for unvoiced speech segregation
- K. Hu and D. L. Wang, “Incorporating spectral subtraction and noise type for unvoiced speech segregation,” in Proc. IEEE ICASSP, 2009, pp. 4425–4428.
- (2009) Proc. IEEE ICASSP , pp. 4425-4428
- Hu, K.¹ Wang, D.L.²

16
- 35248891610
- A comparative intelligibility study of single-microphone noise reduction algorithms
- Y. Hu and P. C. Loizou, “A comparative intelligibility study of single-microphone noise reduction algorithms,” J. Acoust. Soc. Amer., vol. 122, no. 3, pp. 1777–1786, 2007.
- (2007) J. Acoust. Soc. Amer. , vol.122 , Issue.3 , pp. 1777-1786
- Hu, Y.¹ Loizou, P.C.²

17
- 0014568991
- IEEE recommended practice for speech quality measurements
- IEEE, “IEEE recommended practice for speech quality measurements,” IEEE Trans. Audio Electroacoust., vol. AE-17, pp. 225–246, 1969.
- (1969) IEEE Trans. Audio Electroacoust. , vol.AE-17 , pp. 225-246

18
- 65249103478
- A supervised learning approach to monaural segregation of reverberant speech
- May
- Z. Jin and D. L. Wang, “A supervised learning approach to monaural segregation of reverberant speech,” IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 4, pp. 625–638, May 2009.
- (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.4 , pp. 625-638
- Jin, Z.¹ Wang, D.L.²

19
- 0005906625
- Oxford, U.K.: Blackwell
- P. Ladefoged, Vowels and Consonants: An Introduction to the Sounds of Languages. Oxford, U.K.: Blackwell, 2001.
- (2001) Vowels and Consonants: An Introduction to the Sounds of Languages
- Ladefoged, P.¹

20
- 40749125179
- Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction
- N. Li and P. C. Loizou, “Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction,” J. Acoust. Soc. Amer., vol. 123, pp. 1673–1682, 2008.
- (2008) J. Acoust. Soc. Amer. , vol.123 , pp. 1673-1682
- Li, N.¹ Loizou, P.C.²

21
- 40949108726
- Monaural speech separation based on computational auditory scene analysis and objective quality assessment of speech
- Nov.
- P. Li, Y. Guan, B. Xu, and W. Liu, “Monaural speech separation based on computational auditory scene analysis and objective quality assessment of speech,” IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 6, pp. 2014–2023, Nov. 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.6 , pp. 2014-2023
- Li, P.¹ Guan, Y.² Xu, B.³ Liu, W.⁴

22
- 58149196390
- On the optimality of ideal binary time-frequency masks
- Y. Li and D. L. Wang, “On the optimality of ideal binary time-frequency masks,” Speech Commun., vol. 51, pp. 230–239, 2009.
- (2009) Speech Commun. , vol.51 , pp. 230-239
- Li, Y.¹ Wang, D.L.²

23
- 34447101009
- Noise estimation using speech/non-speech frame decision and subband spectral tracking
- Z. Lin, R. A. Goubran, and R. M. Dansereau, “Noise estimation using speech/non-speech frame decision and subband spectral tracking,” Speech Commun., vol. 49, pp. 542–557, 2007.
- (2007) Speech Commun. , vol.49 , pp. 542-557
- Lin, Z.¹ Goubran, R.A.² Dansereau, R.M.³

24
- 34447100796
- Boca Raton, FL: CRC
- P. C. Loizou, Speech Enhancement: Theory and Practice. Boca Raton, FL: CRC, 2007.
- (2007) Speech Enhancement: Theory and Practice
- Loizou, P.C.¹

25
- 0023944462
- Simulation of auditory-neural transduction: Further studies
- R. Meddis, “Simulation of auditory-neural transduction: Further studies,” J. Acoust. Soc. Amer., vol. 83, pp. 1056–1063, 1988.
- (1988) J. Acoust. Soc. Amer. , vol.83 , pp. 1056-1063
- Meddis, R.¹

26
- 0029252699
- On the probabilistic interpretation of neural network classifiers and discriminative training criteria
- Feb.
- H. Ney, “On the probabilistic interpretation of neural network classifiers and discriminative training criteria,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 17, no. 2, pp. 107–119, Feb. 1995.
- (1995) IEEE Trans. Pattern Anal. Mach. Intell. , vol.17 , Issue.2 , pp. 107-119
- Ney, H.¹

27
- 0141624530
- An efficient auditory filterbank based on the gammatone function
- Cambridge, U.K., APU Rep. 2341.
- R. D. Patterson, I. Nimmo-Smith, J. Holdsworth, and P. Rice, “An efficient auditory filterbank based on the gammatone function,” Appl. Psychol. Unit, 1988, Cambridge, U.K., APU Rep. 2341.
- (1988) Appl. Psychol. Unit
- Patterson, R.D.¹ Nimmo-Smith, I.² Holdsworth, J.³ Rice, P.⁴

28
- 48849091396
- Single-channel speech separation using soft masking filtering
- Nov.
- M. H. Radfar and R. M. Dansereau, “Single-channel speech separation using soft masking filtering,” IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2299–2310, Nov. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.8 , pp. 2299-2310
- Radfar, M.H.¹ Dansereau, R.M.²

29
- 33845940172
- A maximum likelihood estimation of vocal-tract-related filter characteristics for single channel speech separation
- 10.1155/2007/84186, Article ID 84186, 15 pages
- M. H. Radfar, R. M. Dansereau, and A. Sayadiyan, “A maximum likelihood estimation of vocal-tract-related filter characteristics for single channel speech separation,” EURASIP J. Audio, Speech, Music Process., vol. 2007, 2007, 10.1155/2007/84186, Article ID 84186, 15 pages.
- (2007) EURASIP J. Audio, Speech, Music Process. , vol.2007
- Radfar, M.H.¹ Dansereau, R.M.² Sayadiyan, A.³

30
- 46049084086
- Ph.D. dissertation, Dept. of Comput. Sci. and Eng., Ohio State Univ., Columbus
- Y. Shao, “Sequential Organization in Computational Auditory Scene Analysis,” Ph.D. dissertation, Dept. of Comput. Sci. and Eng., Ohio State Univ., Columbus, 2007.
- (2007) Sequential Organization in Computational Auditory Scene Analysis
- Shao, Y.¹

31
- 51449109652
- Codebook-based Bayesian speech enhancement for nonstationary environments
- Feb.
- S. Srinivasan, J. Samuelsson, and W. B. Kleijn, “Codebook-based Bayesian speech enhancement for nonstationary environments,” IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 2, pp. 441–452, Feb. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.2 , pp. 441-452
- Srinivasan, S.¹ Samuelsson, J.² Kleijn, W.B.³

32
- 0004129646
- Cambridge, MA: MIT Press
- K. N. Stevens, Acoustic Phonetics. Cambridge, MA: MIT Press, 1998.
- (1998) Acoustic Phonetics
- Stevens, K.N.¹

33
- 70349448618
- An algorithm for speech segregation of co-channel speech
- S. Vishnubhotla and C. Y. Espy-Wilson, “An algorithm for speech segregation of co-channel speech,” in Proc. IEEE ICASSP, 2009, pp. 109–112.
- (2009) Proc. IEEE ICASSP , pp. 109-112
- Vishnubhotla, S.¹ Espy-Wilson, C.Y.²

34
- 84892233308
- On ideal binary mask as the computational goal of auditory scene analysis
- P. Divenyi, Ed. Norwell, MA: Kluwer
- D. L. Wang, “On ideal binary mask as the computational goal of auditory scene analysis,” in Speech Separation by Humans and Machines, P. Divenyi, Ed. Norwell, MA: Kluwer, 2005, pp. 181–197.
- (2005) Speech Separation by Humans and Machines , pp. 181-197
- Wang, D.L.¹

35
- 0032682770
- Separation of speech from interfering sounds based on oscillatory correlation
- May
- D. L. Wang and G. J. Brown, “Separation of speech from interfering sounds based on oscillatory correlation,” IEEE Trans. Neural Netw., vol. 10, no. 3, pp. 684–697, May 1999.
- (1999) IEEE Trans. Neural Netw. , vol.10 , Issue.3 , pp. 684-697
- Wang, D.L.¹ Brown, G.J.²

36
- 82255178542
- D. L. Wang and G. J. Brown, Eds. Hoboken, NJ: Wiley-IEEE Press
- Computational Auditory Scene Analysis: Principles, Algorithms and Applications, D. L. Wang and G. J. Brown, Eds. Hoboken, NJ: Wiley-IEEE Press, 2006.
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms and Applications

37
- 64649103540
- Speech intelligibility in background noise with ideal binary time-frequency masking
- D. L. Wang, U. Kjems, M. S. Pedersen, J. B. Boldt, and T. Lunner, “Speech intelligibility in background noise with ideal binary time-frequency masking,” J. Acoust. Soc. Amer., vol. 125, pp. 2336–2347, 2009.
- (2009) J. Acoust. Soc. Amer. , vol.125 , pp. 2336-2347
- Wang, D.L.¹ Kjems, U.² Pedersen, M.S.³ Boldt, J.B.⁴ Lunner, T.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.