SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 21, Issue 4, 2013, Pages 806-815

Binaural detection, localization, and segregation in reverberant environments based on joint pitch and azimuth cues

a The Ohio State University (United States)

Author keywords

Binaural speech segregation; computational auditory scene analysis; multipitch tracking; sound localization; source detection

Indexed keywords

ACROSS TIME; COMPUTATIONAL AUDITORY SCENE ANALYSIS; ESTIMATED STATE; MARKOV MODEL; MULTISOURCES; PERFORMANCE GAIN; PITCH ESTIMATION; REVERBERANT CONDITION; REVERBERANT ENVIRONMENT; SEGREGATION OF SPEECH; SOUND LOCALIZATION; SOURCE DETECTION; SPEECH SEGREGATION; STATE SPACE; TIME FREQUENCY;

ENCODING (SYMBOLS); HIDDEN MARKOV MODELS; REVERBERATION;

CONTINUOUS SPEECH RECOGNITION;

EID: 84872925389 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2012.2236316 Document Type: Article

Times cited : (35)

References (43)

1
- 0001835850
- Accurate short-time analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound
- P. Boersma, "Accurate short-time analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound," Inst. Phonetic Sci., vol. 17, pp. 97-110, 1993.
- (1993) Inst. Phonetic Sci. , vol.17 , pp. 97-110
- Boersma, P.¹

2
- 0022479342
- Predictors of speech intelligibility in rooms
- J. S. Bradley, "Predictors of speech intelligibility in rooms," J. Acoust. Soc. Amer., vol. 80, pp. 837-845, 1986. (Pubitemid 16054531)
- (1986) Journal of the Acoustical Society of America , vol.80 , Issue.3 , pp. 837-845
- Bradley, J.S.¹

3
- 0003980102
- New York: Springer
- Microphone Arrays: Signal Processing Techniques and Applications, M. Brandstein and D. Ward, Eds. New York: Springer, 2001.
- (2001) Microphone Arrays: Signal Processing Techniques and Applications
- Brandstein, M.¹ Ward, D.²

4
- 0003684441
- Cambridge MA: MIT Press
- A. S. Bregman, Auditory Scene Analysis. Cambridge, MA: MIT Press, 1990.
- (1990) Auditory Scene Analysis
- Bregman, A.S.¹

5
- 4544262407
- Blind source separation for convolutive mixtures: A unified treatment
- Y. Huang and J. Benesty, Eds. Dordrecht, The Netherlands: Kluwer
- H. Buchner, R. Aichner, and W. Kellermann, "Blind source separation for convolutive mixtures: A unified treatment," in Audio Signal Processing for Next-Generation Multimedia Communications Systems, Y. Huang and J. Benesty, Eds. Dordrecht, The Netherlands: Kluwer, 2004, pp. 255-294.
- (2004) Audio Signal Processing for Next-Generation Multimedia Communications Systems , pp. 255-294
- Buchner, H.¹ Aichner, R.² Kellermann, W.³

6
- 33947676870
- D. R. Campbell, "The ROOMSIM User Guide (v3.3)," 2004 [Online]. Available: http://media.paisley.ac.uk/~campbell/Roomsim/
- (2004) The ROOMSIM User Guide (v3.3)
- Campbell, D.R.¹

7
- 56249137775
- Spatial hearing and perceiving sources
- W. A. Yost, A. N. Popper, and R. R. Fay, Eds. New York: Springer
- C. J. Darwin, "Spatial hearing and perceiving sources," in Auditory Perception of Sound Sources, W. A. Yost, A. N. Popper, and R. R. Fay, Eds. New York: Springer, 2007, pp. 215-232.
- (2007) Auditory Perception of Sound Sources , pp. 215-232
- Darwin, C.J.¹

8
- 77955675017
- Under-determined reverberant audio source separation using a full-rank spatial covariance model
- Sep.
- N. Q. K. Duong, E. Vincent, and R. Gribonval, "Under-determined reverberant audio source separation using a full-rank spatial covariance model," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 7, pp. 1830-1840, Sep. 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.7 , pp. 1830-1840
- Duong, N.Q.K.¹ Vincent, E.² Gribonval, R.³

9
- 84872895918
- Location-based grouping
- D. L. Wang and G. J. Brown, Eds. New York: Wiley/IEEE Press
- A. S. Feng and D. L. Jones, "Location-based grouping," in Computational Auditory Scene Analysis: Principles, Algorithms and Applications, D. L. Wang and G. J. Brown, Eds. New York: Wiley/IEEE Press, 2006, pp. 187-208.
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms and Applications , pp. 187-208
- Feng, A.S.¹ Jones, D.L.²

10
- 84948594425
- An algorithm for linearly constrained adaptive array processing
- Aug.
- O. L. Frost, "An algorithm for linearly constrained adaptive array processing," Proc. IEEE, vol. 60, no. 8, pp. 926-935, Aug. 1972.
- (1972) Proc. IEEE , vol.60 , Issue.8 , pp. 926-935
- Frost, O.L.¹

11
- 0029041417
- HRTF measurements of a KEMAR
- W. G. Gardner and K. D. Martin, "HRTF measurements of a KEMAR," J. Acoust. Soc. Amer., vol. 97, pp. 3907-3908, 1995.
- (1995) J. Acoust. Soc. Amer. , vol.97 , pp. 3907-3908
- Gardner, W.G.¹ Martin, K.D.²

12
- 0003548585
- J. S. Garofolo, L. F. Lamel, W. M. Fisher, J. G. Fiscus, D. S. Pallett, and N. L. Dahlgren, "DARPA TIMIT acoustic phonetic continuous speech corpus," 1993 [Online]. Available: http://www.ldc.upenn.edu/Catalog/LDC93S1.html
- (1993) DARPA TIMIT Acoustic Phonetic Continuous Speech Corpus
- Garofolo, J.S.¹ Lamel, L.F.² Fisher, W.M.³ Fiscus, J.G.⁴ Pallett, D.S.⁵ Dahlgren, N.L.⁶

13
- 0035528674
- Idiot's Bayes-Not so stupid after all?
- D. J. Hand, "Idiot's Bayes-Not so stupid after all?," Int. Statist. Rev., vol. 69, no. 3, pp. 385-398, 2001.
- (2001) Int. Statist. Rev. , vol.69 , Issue.3 , pp. 385-398
- Hand, D.J.¹

14
- 77950103009
- On optimal multichannel mean-squared error estimators for speech enhancement
- Oct.
- R. C. Hendriks, R. Heusdens, U. Kjems, and J. Jensen, "On optimal multichannel mean-squared error estimators for speech enhancement," IEEE Signal Process. Lett., vol. 16, no. 10, pp. 885-888, Oct. 2009.
- (2009) IEEE Signal Process. Lett. , vol.16 , Issue.10 , pp. 885-888
- Hendriks, R.C.¹ Heusdens, R.² Kjems, U.³ Jensen, J.⁴

15
- 77955700868
- Dynamic precedence effect modeling for source separation in reverberant environments
- Sep.
- C. Hummersone, R. Mason, and T. Brookes, "Dynamic precedence effect modeling for source separation in reverberant environments," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 7, pp. 1867-1871, Sep. 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.7 , pp. 1867-1871
- Hummersone, C.¹ Mason, R.² Brookes, T.³

16
- 84872958394
- Joint DOA and fundamental frequency estimation methods based on 2-d filtering
- J. R. Jensen, M. G. Christensen, and S. H. Jensen, "Joint DOA and fundamental frequency estimation methods based on 2-d filtering," in Proc. EUSIPCO, 2010.
- (2010) Proc. EUSIPCO
- Jensen, J.R.¹ Christensen, M.G.² Jensen, S.H.³

17
- 65249103478
- A supervised learning approach to monaural segregation of reverberant speech
- May
- Z. Jin and D. L. Wang, "A supervised learning approach to monaural segregation of reverberant speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 4, pp. 625-638, May 2009.
- (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.4 , pp. 625-638
- Jin, Z.¹ Wang, D.L.²

18
- 85008056718
- HMM-based multipitch tracking for noisy and reverberant speech
- Jul.
- Z. Jin and D. L. Wang, "HMM-based multipitch tracking for noisy and reverberant speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 5, pp. 1091-1102, Jul. 2011.
- (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , Issue.5 , pp. 1091-1102
- Jin, Z.¹ Wang, D.L.²

19
- 46749155862
- Joint position pitch tracking for 2-channel audio
- M. Képesi, F. Pernkopf, and M. Wohlmayr, "Joint position pitch tracking for 2-channel audio," in Proc. Int. Workshop Content Based Multimedia Indexing, 2007.
- (2007) Proc. Int. Workshop Content Based Multimedia Indexing
- Képesi, M.¹ Pernkopf, F.² Wohlmayr, M.³

20
- 70349093614
- An algorithm that improves speech intelligibility in noise for normal-hearing listeners
- G. Kim, Y. Lu, Y. Hu, and P. Loizou, "An algorithm that improves speech intelligibility in noise for normal-hearing listeners," J. Acoust. Soc. Amer., vol. 126, no. 3, pp. 1486-1494, 2009.
- (2009) J. Acoust. Soc. Amer. , vol.126 , Issue.3 , pp. 1486-1494
- Kim, G.¹ Lu, Y.² Hu, Y.³ Loizou, P.⁴

21
- 0020497765
- A computational model of binaural localization and separation
- R. F. Lyon, "A computational model of binaural localization and separation," in Proc. ICASSP, 1983, pp. 1148-1151.
- (1983) Proc. ICASSP , pp. 1148-1151
- Lyon, R.F.¹

22
- 84865736704
- Binaural cues for fragment-based speech recognition in reverberant multisource environments
- N. Ma, J. Barker, H. Christensen, and P. Green, "Binaural cues for fragment-based speech recognition in reverberant multisource environments," in Proc. INTERSPEECH, 2011.
- (2011) Proc. INTERSPEECH
- Ma, N.¹ Barker, J.² Christensen, H.³ Green, P.⁴

23
- 85008544097
- Model-based expectation-maximization source separation and localization
- Feb.
- M. I. Mandel, R. J. Weiss, and D. P. W. Ellis, "Model-based expectation-maximization source separation and localization," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 2, pp. 382-394, Feb. 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.2 , pp. 382-394
- Mandel, M.I.¹ Weiss, R.J.² Ellis, D.P.W.³

24
- 0029748345
- Localization by harmonic structure and its application to harmonic sound stream segregation
- T. Nakatani, M. Goto, and H. G. Okuno, "Localization by harmonic structure and its application to harmonic sound stream segregation," in Proc. ICASSP, 1996, pp. 653-656.
- (1996) Proc. ICASSP , pp. 653-656
- Nakatani, T.¹ Goto, M.² Okuno, H.G.³

25
- 52149108294
- Combined estimation of spectral envelopes and sound source direction of concurrent voices by multidimensional statistical filtering
- Mar.
- J. Nix and V. Hohmann, "Combined estimation of spectral envelopes and sound source direction of concurrent voices by multidimensional statistical filtering," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 3, pp. 995-1008, Mar. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.3 , pp. 995-1008
- Nix, J.¹ Hohmann, V.²

26
- 3142694930
- Blind separation of speech mixtures via time-frequency masking
- Jul.
- Ö. Yilmaz and S. Rickard, "Blind separation of speech mixtures via time-frequency masking," IEEE Trans. Signal Process., vol. 52, no. 7, pp. 1830-1847, Jul. 2004.
- (2004) IEEE Trans. Signal Process. , vol.52 , Issue.7 , pp. 1830-1847
- Yilmaz, O.¹ Rickard, S.²

27
- 0142056390
- Tech. Rep. Cambridge
- R. D. Patterson, I. Nimmo-Smith, J. Holdsworth, and P. Rice, "An Efficient Auditory Filterbank based on the Gammatone Function, MRC App. Psych. Unit," Tech. Rep. Cambridge, 1988.
- (1988) An Efficient Auditory Filterbank Based on the Gammatone Function, MRC App. Psych. Unit
- Patterson, R.D.¹ Nimmo-Smith, I.² Holdsworth, J.³ Rice, P.⁴

28
- 0142026377
- Speech segregation based on sound localization
- DOI 10.1121/1.1610463
- N. Roman, D. L. Wang, and G. Brown, "Speech segregation based on sound localization," J. Acoust. Soc. Amer., vol. 114, pp. 2236-2252, 2003. (Pubitemid 37266649)
- (2003) Journal of the Acoustical Society of America , vol.114 , Issue.4 , pp. 2236-2252
- Roman, N.¹ Wang, D.² Brown, G.J.³

29
- 82255167374
- Intelligibility of reverberant noisy speech with ideal binary masking
- N. Roman and J. Woodruff, "Intelligibility of reverberant noisy speech with ideal binary masking," J. Acoust. Soc. Amer., vol. 130, pp. 2153-2161, 2011.
- (2011) J. Acoust. Soc. Amer. , vol.130 , pp. 2153-2161
- Roman, N.¹ Woodruff, J.²

30
- 0035254668
- Sound segregation algorithm for reverberant conditions
- DOI 10.1016/S0167-6393(00)00015-7
- A. Shamsoddini and P. N. Denbigh, "A sound segregation algorithm for reverberant conditions," Speech Commun., vol. 33, pp. 179-196, 2001. (Pubitemid 32034413)
- (2001) Speech Communication , vol.33 , Issue.3 , pp. 179-196
- Shamsoddini, A.¹ Denbigh, P.N.²

31
- 84864578010
- Influences of spatial cues on grouping and understanding sound
- B. G. Shinn-Cunningham, "Influences of spatial cues on grouping and understanding sound," in Proc. Forum Acusticum, 2005.
- (2005) Proc. Forum Acusticum
- Shinn-Cunningham, B.G.¹

32
- 0009653561
- Post-filtering techniques
- New York: Springer
- K. U. Simmer, J. Bitzer, and C. Marro, "Post-filtering techniques," in Microphone Arrays: Signal Processing Techniques and Applications. New York: Springer, 2001, pp. 39-60.
- (2001) Microphone Arrays: Signal Processing Techniques and Applications , pp. 39-60
- Simmer, K.U.¹ Bitzer, J.² Marro, C.³

33
- 72949120153
- On optimal frequency-domain multichannel linear filtering for noise reduction
- Feb.
- M. Souden, J. Benesty, and S. Affes, "On optimal frequency-domain multichannel linear filtering for noise reduction," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 2, pp. 260-275, Feb. 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.2 , pp. 260-275
- Souden, M.¹ Benesty, J.² Affes, S.³

34
- 84892233308
- On ideal binary masks as the computational goal of auditory scene analysis
- P. Divenyi, Ed. Boston, MA: Kluwer
- D. L. Wang, "On ideal binary masks as the computational goal of auditory scene analysis," in Speech Separation by Humans and Machines, P. Divenyi, Ed. Boston, MA: Kluwer, 2005, pp. 181-197.
- (2005) Speech Separation by Humans and Machines , pp. 181-197
- Wang, D.L.¹

35
- 82255178542
- Hoboken, NJ, Wiley/IEEE Press
- D. L. Wang and G. J. Brown, Eds., Computational Auditory Scene Analysis: Principles, Algorithms, and Applications Hoboken, NJ, Wiley/IEEE Press, 2006.
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms, and Applications
- Wang, D.L.¹ Brown, G.J.²

36
- 64649103540
- Speech intelligibility in background noise with ideal binary time-frequency masking
- D. L. Wang, U. Kjems, M. S. Pedersen, J. B. Boldt, and T. Lunner, "Speech intelligibility in background noise with ideal binary time-frequency masking," J. Acoust. Soc. Amer., vol. 125, pp. 2336-2347, 2009.
- (2009) J. Acoust. Soc. Amer. , vol.125 , pp. 2336-2347
- Wang, D.L.¹ Kjems, U.² Pedersen, M.S.³ Boldt, J.B.⁴ Lunner, T.⁵

37
- 79953661463
- Combining localization cues and source model constraints for binaural source separation
- R. Weiss, M. Mandel, and D. Ellis, "Combining localization cues and source model constraints for binaural source separation," Speech Commun., vol. 53, pp. 606-621, 2011.
- (2011) Speech Commun. , vol.53 , pp. 606-621
- Weiss, R.¹ Mandel, M.² Ellis, D.³

38
- 77955697785
- Sequential organization of speech in reverberant environments by integrating monaural grouping and binaural localization
- Sep.
- J. Woodruff and D. L. Wang, "Sequential organization of speech in reverberant environments by integrating monaural grouping and binaural localization," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 7, pp. 1856-1866, Sep. 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.7 , pp. 1856-1866
- Woodruff, J.¹ Wang, D.L.²

39
- 84872299752
- Binaural localization of multiple sources in reverberant and noisy environments
- Jul.
- J. Woodruff and D. L. Wang, "Binaural localization of multiple sources in reverberant and noisy environments," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 5, pp. 1503-1512, Jul. 2012.
- (2012) IEEE Trans. Audio, Speech, Lang. Process. , vol.20 , Issue.5 , pp. 1503-1512
- Woodruff, J.¹ Wang, D.L.²

40
- 0030355093
- A simple architecture for using multiple cues in sound separation
- W. S. Woods, M. Hansen, T. Wittkop, and B. Kollmeier, "A simple architecture for using multiple cues in sound separation," in Proc. ICSLP, 1996.
- (1996) Proc. ICSLP
- Woods, W.S.¹ Hansen, M.² Wittkop, T.³ Kollmeier, B.⁴

41
- 40249111625
- Binaural speech separation using recurrent timing neural networks for joint F0-localisation estimation
- DOI 10.1007/978-3-540-78155-4-24, Machine Learning for Multimodal Interaction - 4th International Workshop, MLMI 2007, Revised Selected Papers
- S. N. Wrigley and G. J. Brown, "Binaural speech separation using recurrent timing neural networks for joint F0-localisation estimation," in Machine Learning for Multimodal Interaction. Berlin/Heidelberg, Germany: Springer, 2008, pp. 271-282. (Pubitemid 351333606)
- (2008) Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) , vol.LNCS4892 , pp. 271-282
- Wrigley, S.N.¹ Brown, G.J.²

42
- 0037767686
- A multipitch tracking algorithm for noisy speech
- May
- M. Wu, D. L. Wang, and G. J. Brown, "A multipitch tracking algorithm for noisy speech," IEEE Trans. Speech Audio Process., vol. 11, no. 3, pp. 229-241, May 2003.
- (2003) IEEE Trans. Speech Audio Process. , vol.11 , Issue.3 , pp. 229-241
- Wu, M.¹ Wang, D.L.² Brown, G.J.³

43
- 84872953206
- Joint DOA and multi-pitch estimation based on subspace techniques
- J. X. Zhang, M. G. Christensen, S. H. Jensen, and M. Moonen, "Joint DOA and multi-pitch estimation based on subspace techniques," EURASIP J. Adv. Signal Process., vol. 2012, pp. 1-11, 2012.
- (2012) EURASIP J. Adv. Signal Process. , vol.2012 , pp. 1-11
- Zhang, J.X.¹ Christensen, M.G.² Jensen, S.H.³ Moonen, M.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.