Volume 21, Issue 4, 2013, Pages 806-815

Binaural detection, localization, and segregation in reverberant environments based on joint pitch and azimuth cues

Author keywords

Binaural speech segregation; computational auditory scene analysis; multipitch tracking; sound localization; source detection

Indexed keywords

ACROSS TIME; COMPUTATIONAL AUDITORY SCENE ANALYSIS; ESTIMATED STATE; MARKOV MODEL; MULTISOURCES; PERFORMANCE GAIN; PITCH ESTIMATION; REVERBERANT CONDITION; REVERBERANT ENVIRONMENT; SEGREGATION OF SPEECH; SOUND LOCALIZATION; SOURCE DETECTION; SPEECH SEGREGATION; STATE SPACE; TIME FREQUENCY

EID: 84872925389     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2012.2236316     Document Type: Article
Times cited: 35

References (43)
  • 1
    • 0001835850
    • Accurate short-time analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound
    • P. Boersma, "Accurate short-time analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound," Inst. Phonetic Sci., vol. 17, pp. 97-110, 1993.
    • (1993) Inst. Phonetic Sci. , vol.17 , pp. 97-110
    • Boersma, P.1
  • 2
  • 7
    • 56249137775
    • Spatial hearing and perceiving sources
    • C. J. Darwin, "Spatial hearing and perceiving sources," in Auditory Perception of Sound Sources, W. A. Yost, A. N. Popper, and R. R. Fay, Eds. New York: Springer, 2007, pp. 215-232.
    • (2007) Auditory Perception of Sound Sources , pp. 215-232
    • Darwin, C.J.1
  • 8
    • 77955675017
    • Under-determined reverberant audio source separation using a full-rank spatial covariance model
    • N. Q. K. Duong, E. Vincent, and R. Gribonval, "Under-determined reverberant audio source separation using a full-rank spatial covariance model," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 7, pp. 1830-1840, Sep. 2010.
    • (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.7 , pp. 1830-1840
    • Duong, N.Q.K.1    Vincent, E.2    Gribonval, R.3
  • 10
    • 84948594425
    • An algorithm for linearly constrained adaptive array processing
    • O. L. Frost, "An algorithm for linearly constrained adaptive array processing," Proc. IEEE, vol. 60, no. 8, pp. 926-935, Aug. 1972.
    • (1972) Proc. IEEE , vol.60 , Issue.8 , pp. 926-935
    • Frost, O.L.1
  • 11
    • 0029041417
    • HRTF measurements of a KEMAR
    • W. G. Gardner and K. D. Martin, "HRTF measurements of a KEMAR," J. Acoust. Soc. Amer., vol. 97, pp. 3907-3908, 1995.
    • (1995) J. Acoust. Soc. Amer. , vol.97 , pp. 3907-3908
    • Gardner, W.G.1    Martin, K.D.2
  • 13
    • 0035528674
    • Idiot's Bayes-Not so stupid after all?
    • D. J. Hand, "Idiot's Bayes-Not so stupid after all?," Int. Statist. Rev., vol. 69, no. 3, pp. 385-398, 2001.
    • (2001) Int. Statist. Rev. , vol.69 , Issue.3 , pp. 385-398
    • Hand, D.J.1
  • 14
    • 77950103009
    • On optimal multichannel mean-squared error estimators for speech enhancement
    • R. C. Hendriks, R. Heusdens, U. Kjems, and J. Jensen, "On optimal multichannel mean-squared error estimators for speech enhancement," IEEE Signal Process. Lett., vol. 16, no. 10, pp. 885-888, Oct. 2009.
    • (2009) IEEE Signal Process. Lett. , vol.16 , Issue.10 , pp. 885-888
    • Hendriks, R.C.1    Heusdens, R.2    Kjems, U.3    Jensen, J.4
  • 15
    • 77955700868
    • Dynamic precedence effect modeling for source separation in reverberant environments
    • C. Hummersone, R. Mason, and T. Brookes, "Dynamic precedence effect modeling for source separation in reverberant environments," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 7, pp. 1867-1871, Sep. 2010.
    • (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.7 , pp. 1867-1871
    • Hummersone, C.1    Mason, R.2    Brookes, T.3
  • 16
    • 84872958394
    • Joint DOA and fundamental frequency estimation methods based on 2-d filtering
    • J. R. Jensen, M. G. Christensen, and S. H. Jensen, "Joint DOA and fundamental frequency estimation methods based on 2-d filtering," in Proc. EUSIPCO, 2010.
    • (2010) Proc. EUSIPCO
    • Jensen, J.R.1    Christensen, M.G.2    Jensen, S.H.3
  • 17
    • 65249103478
    • A supervised learning approach to monaural segregation of reverberant speech
    • Z. Jin and D. L. Wang, "A supervised learning approach to monaural segregation of reverberant speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 4, pp. 625-638, May 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.4 , pp. 625-638
    • Jin, Z.1    Wang, D.L.2
  • 18
    • 85008056718
    • HMM-based multipitch tracking for noisy and reverberant speech
    • Z. Jin and D. L. Wang, "HMM-based multipitch tracking for noisy and reverberant speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 5, pp. 1091-1102, Jul. 2011.
    • (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , Issue.5 , pp. 1091-1102
    • Jin, Z.1    Wang, D.L.2
  • 20
    • 70349093614
    • An algorithm that improves speech intelligibility in noise for normal-hearing listeners
    • G. Kim, Y. Lu, Y. Hu, and P. Loizou, "An algorithm that improves speech intelligibility in noise for normal-hearing listeners," J. Acoust. Soc. Amer., vol. 126, no. 3, pp. 1486-1494, 2009.
    • (2009) J. Acoust. Soc. Amer. , vol.126 , Issue.3 , pp. 1486-1494
    • Kim, G.1    Lu, Y.2    Hu, Y.3    Loizou, P.4
  • 21
    • 0020497765
    • A computational model of binaural localization and separation
    • R. F. Lyon, "A computational model of binaural localization and separation," in Proc. ICASSP, 1983, pp. 1148-1151.
    • (1983) Proc. ICASSP , pp. 1148-1151
    • Lyon, R.F.1
  • 22
    • 84865736704
    • Binaural cues for fragment-based speech recognition in reverberant multisource environments
    • N. Ma, J. Barker, H. Christensen, and P. Green, "Binaural cues for fragment-based speech recognition in reverberant multisource environments," in Proc. INTERSPEECH, 2011.
    • (2011) Proc. INTERSPEECH
    • Ma, N.1    Barker, J.2    Christensen, H.3    Green, P.4
  • 23
    • 85008544097
    • Model-based expectation-maximization source separation and localization
    • M. I. Mandel, R. J. Weiss, and D. P. W. Ellis, "Model-based expectation-maximization source separation and localization," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 2, pp. 382-394, Feb. 2010.
    • (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.2 , pp. 382-394
    • Mandel, M.I.1    Weiss, R.J.2    Ellis, D.P.W.3
  • 24
    • 0029748345
    • Localization by harmonic structure and its application to harmonic sound stream segregation
    • T. Nakatani, M. Goto, and H. G. Okuno, "Localization by harmonic structure and its application to harmonic sound stream segregation," in Proc. ICASSP, 1996, pp. 653-656.
    • (1996) Proc. ICASSP , pp. 653-656
    • Nakatani, T.1    Goto, M.2    Okuno, H.G.3
  • 25
    • 52149108294
    • Combined estimation of spectral envelopes and sound source direction of concurrent voices by multidimensional statistical filtering
    • J. Nix and V. Hohmann, "Combined estimation of spectral envelopes and sound source direction of concurrent voices by multidimensional statistical filtering," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 3, pp. 995-1008, Mar. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.3 , pp. 995-1008
    • Nix, J.1    Hohmann, V.2
  • 26
    • 3142694930
    • Blind separation of speech mixtures via time-frequency masking
    • Ö. Yilmaz and S. Rickard, "Blind separation of speech mixtures via time-frequency masking," IEEE Trans. Signal Process., vol. 52, no. 7, pp. 1830-1847, Jul. 2004.
    • (2004) IEEE Trans. Signal Process. , vol.52 , Issue.7 , pp. 1830-1847
    • Yilmaz, O.1    Rickard, S.2
  • 29
    • 82255167374
    • Intelligibility of reverberant noisy speech with ideal binary masking
    • N. Roman and J. Woodruff, "Intelligibility of reverberant noisy speech with ideal binary masking," J. Acoust. Soc. Amer., vol. 130, pp. 2153-2161, 2011.
    • (2011) J. Acoust. Soc. Amer. , vol.130 , pp. 2153-2161
    • Roman, N.1    Woodruff, J.2
  • 30
    • 0035254668
    • A sound segregation algorithm for reverberant conditions
    • DOI 10.1016/S0167-6393(00)00015-7
    • A. Shamsoddini and P. N. Denbigh, "A sound segregation algorithm for reverberant conditions," Speech Commun., vol. 33, pp. 179-196, 2001. (Pubitemid 32034413)
    • (2001) Speech Communication , vol.33 , Issue.3 , pp. 179-196
    • Shamsoddini, A.1    Denbigh, P.N.2
  • 31
    • 84864578010
    • Influences of spatial cues on grouping and understanding sound
    • B. G. Shinn-Cunningham, "Influences of spatial cues on grouping and understanding sound," in Proc. Forum Acusticum, 2005.
    • (2005) Proc. Forum Acusticum
    • Shinn-Cunningham, B.G.1
  • 33
    • 72949120153
    • On optimal frequency-domain multichannel linear filtering for noise reduction
    • M. Souden, J. Benesty, and S. Affes, "On optimal frequency-domain multichannel linear filtering for noise reduction," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 2, pp. 260-275, Feb. 2010.
    • (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.2 , pp. 260-275
    • Souden, M.1    Benesty, J.2    Affes, S.3
  • 34
    • 84892233308
    • On ideal binary masks as the computational goal of auditory scene analysis
    • D. L. Wang, "On ideal binary masks as the computational goal of auditory scene analysis," in Speech Separation by Humans and Machines, P. Divenyi, Ed. Boston, MA: Kluwer, 2005, pp. 181-197.
    • (2005) Speech Separation by Humans and Machines , pp. 181-197
    • Wang, D.L.1
  • 36
    • 64649103540
    • Speech intelligibility in background noise with ideal binary time-frequency masking
    • D. L. Wang, U. Kjems, M. S. Pedersen, J. B. Boldt, and T. Lunner, "Speech intelligibility in background noise with ideal binary time-frequency masking," J. Acoust. Soc. Amer., vol. 125, pp. 2336-2347, 2009.
    • (2009) J. Acoust. Soc. Amer. , vol.125 , pp. 2336-2347
    • Wang, D.L.1    Kjems, U.2    Pedersen, M.S.3    Boldt, J.B.4    Lunner, T.5
  • 37
    • 79953661463
    • Combining localization cues and source model constraints for binaural source separation
    • R. Weiss, M. Mandel, and D. P. W. Ellis, "Combining localization cues and source model constraints for binaural source separation," Speech Commun., vol. 53, pp. 606-621, 2011.
    • (2011) Speech Commun. , vol.53 , pp. 606-621
    • Weiss, R.1    Mandel, M.2    Ellis, D.P.W.3
  • 38
    • 77955697785
    • Sequential organization of speech in reverberant environments by integrating monaural grouping and binaural localization
    • J. Woodruff and D. L. Wang, "Sequential organization of speech in reverberant environments by integrating monaural grouping and binaural localization," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 7, pp. 1856-1866, Sep. 2010.
    • (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.7 , pp. 1856-1866
    • Woodruff, J.1    Wang, D.L.2
  • 39
    • 84872299752
    • Binaural localization of multiple sources in reverberant and noisy environments
    • J. Woodruff and D. L. Wang, "Binaural localization of multiple sources in reverberant and noisy environments," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 5, pp. 1503-1512, Jul. 2012.
    • (2012) IEEE Trans. Audio, Speech, Lang. Process. , vol.20 , Issue.5 , pp. 1503-1512
    • Woodruff, J.1    Wang, D.L.2
  • 40
    • 0030355093
    • A simple architecture for using multiple cues in sound separation
    • W. S. Woods, M. Hansen, T. Wittkop, and B. Kollmeier, "A simple architecture for using multiple cues in sound separation," in Proc. ICSLP, 1996.
    • (1996) Proc. ICSLP
    • Woods, W.S.1    Hansen, M.2    Wittkop, T.3    Kollmeier, B.4
  • 42
    • 0037767686
    • A multipitch tracking algorithm for noisy speech
    • M. Wu, D. L. Wang, and G. J. Brown, "A multipitch tracking algorithm for noisy speech," IEEE Trans. Speech Audio Process., vol. 11, no. 3, pp. 229-241, May 2003.
    • (2003) IEEE Trans. Speech Audio Process. , vol.11 , Issue.3 , pp. 229-241
    • Wu, M.1    Wang, D.L.2    Brown, G.J.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.