Volume 51, Issue 8, 2009, Pages 657-667

Sequential organization of speech in computational auditory scene analysis

Author keywords

Binary time-frequency mask; Computational auditory scene analysis; Sequential organization; Speaker quantization

Indexed keywords

BACKGROUND MODEL; BINARY TIME-FREQUENCY MASK; COMPUTATIONAL AUDITORY SCENE ANALYSIS; GENERIC MODELS; HUMAN LISTENERS; PERFORMANCE LEVEL; PRIOR INFORMATION; SEQUENTIAL GROUPING; SEQUENTIAL ORGANIZATION; SPEAKER MODEL; SPEAKER QUANTIZATION; SPEECH INTERFERENCE; SYSTEMATIC EVALUATION

EID: 67349134831     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2009.02.003     Document Type: Article
Times cited: 15

References (41)
  • 1
    • 11144316019
    • Decoding speech in the presence of other sources
    • Barker J., Cooke M., and Ellis D. Decoding speech in the presence of other sources. Speech Comm. 45 1 (2005) 5-25
    • (2005) Speech Comm. , vol.45 , Issue.1 , pp. 5-25
    • Barker, J.1    Cooke, M.2    Ellis, D.3
  • 2
    • 0036299275
    • A Monte Carlo method for score normalization in automatic speaker verification using Kullback-Leibler distances
    • Ben, M., Blouet, R., Bimbot, F., 2002. A Monte Carlo method for score normalization in automatic speaker verification using Kullback-Leibler distances. In: Proc. ICASSP, Vol. I, pp. 689-692.
    • (2002) Proc. ICASSP , vol.1 , pp. 689-692
    • Ben, M.1    Blouet, R.2    Bimbot, F.3
  • 5
    • 0035106984
    • Informational and energetic masking effects in the perception of two simultaneous talkers
    • Brungart D.S. Informational and energetic masking effects in the perception of two simultaneous talkers. J. Acoust. Soc. Amer. 109 (2001) 1101-1109
    • (2001) J. Acoust. Soc. Amer. , vol.109 , pp. 1101-1109
    • Brungart, D.S.1
  • 6
    • 80052339383
    • Some experiments on the recognition of speech with one and with two ears
    • Cherry E.C. Some experiments on the recognition of speech with one and with two ears. J. Acoust. Soc. Amer. 25 (1953) 975-979
    • (1953) J. Acoust. Soc. Amer. , vol.25 , pp. 975-979
    • Cherry, E.C.1
  • 8
    • 18744401086
    • Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion
    • Deng L., Droppo J., and Acero A. Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion. IEEE Trans. Speech Audio Process. 13 (2005) 412-421
    • (2005) IEEE Trans. Speech Audio Process. , vol.13 , pp. 412-421
    • Deng, L.1    Droppo, J.2    Acero, A.3
  • 10
    • 0033872977
    • Approaches to speaker detection and tracking in conversational speech
    • Dunn R.B., Reynolds D.A., and Quatieri T.F. Approaches to speaker detection and tracking in conversational speech. Digital Signal Process. 10 (2000) 93-112
    • (2000) Digital Signal Process. , vol.10 , pp. 93-112
    • Dunn, R.B.1    Reynolds, D.A.2    Quatieri, T.F.3
  • 13
    • 67349285580
    • Helmholtz, H., 1863. On the Sensations of Tone (A.J. Ellis, Trans.), Second English ed., Dover Publishers, New York.
  • 14
    • 34547516258
    • Approximating the Kullback-Leibler divergence between Gaussian mixture models
    • Hershey, J.R., Olsen, P.A., 2007. Approximating the Kullback-Leibler divergence between Gaussian mixture models. In: Proc. ICASSP, Vol. IV, pp. 317-320.
    • (2007) Proc. ICASSP , vol.4 , pp. 317-320
    • Hershey, J.R.1    Olsen, P.A.2
  • 16
    • 4644265990
    • Monaural speech segregation based on pitch tracking and amplitude modulation
    • Hu G., and Wang D.L. Monaural speech segregation based on pitch tracking and amplitude modulation. IEEE Transactions on Neural Networks 15 (2004) 1135-1150
    • (2004) IEEE Transactions on Neural Networks , vol.15 , pp. 1135-1150
    • Hu, G.1    Wang, D.L.2
  • 17
    • 46049084696
    • An auditory scene analysis approach to monaural speech separation
    • Hänsler E., and Schmidt G. (Eds), Springer, Heidelberg
    • Hu G., and Wang D.L. An auditory scene analysis approach to monaural speech separation. In: Hänsler E., and Schmidt G. (Eds). Topics in Acoustic Echo and Noise Control (2006), Springer, Heidelberg 485-515
    • (2006) Topics in Acoustic Echo and Noise Control , pp. 485-515
    • Hu, G.1    Wang, D.L.2
  • 18
    • 49249107353
    • Segregation of unvoiced speech from nonspeech interference
    • Hu G., and Wang D.L. Segregation of unvoiced speech from nonspeech interference. J. Acoust. Soc. Amer. 124 (2008) 1306-1319
    • (2008) J. Acoust. Soc. Amer. , vol.124 , pp. 1306-1319
    • Hu, G.1    Wang, D.L.2
  • 21
    • 85009113950
    • Speaker model quantization for unsupervised speaker indexing
    • Kwon, S., Narayanan, S., 2004. Speaker model quantization for unsupervised speaker indexing. In: Proc. ICSLP, pp. 1517-1520.
    • (2004) Proc. ICSLP , pp. 1517-1520
    • Kwon, S.1    Narayanan, S.2
  • 22
    • 27644599375
    • Unsupervised speaker indexing using generic models
    • Kwon S., and Narayanan S. Unsupervised speaker indexing using generic models. IEEE Trans. Speech Audio Process. 13 5 (2005) 1004-1013
    • (2005) IEEE Trans. Speech Audio Process. , vol.13 , Issue.5 , pp. 1004-1013
    • Kwon, S.1    Narayanan, S.2
  • 25
    • 29044450606
    • NIST Speaker Recognition Evaluation Chronicles
    • Przybocki, M.A., Martin, A.F., 2004. NIST Speaker Recognition Evaluation Chronicles. In: Proc. Odyssey 2004.
    • (2004) Proc. Odyssey
    • Przybocki, M.A.1    Martin, A.F.2
  • 26
    • 0025256257
    • An approach to co-channel talker interference suppression using a sinusoidal model for speech
    • Quatieri T.F., and Danisewicz R.G. An approach to co-channel talker interference suppression using a sinusoidal model for speech. IEEE Trans. Acoust. Speech Signal Process. 38 (1990) 56-69
    • (1990) IEEE Trans. Acoust. Speech Signal Process. , vol.38 , pp. 56-69
    • Quatieri, T.F.1    Danisewicz, R.G.2
  • 27
    • 4644336054
    • Reconstruction of missing features for robust speech recognition
    • Raj B., Seltzer M.L., and Stern R.M. Reconstruction of missing features for robust speech recognition. Speech Comm. 43 (2004) 275-296
    • (2004) Speech Comm. , vol.43 , pp. 275-296
    • Raj, B.1    Seltzer, M.L.2    Stern, R.M.3
  • 28
    • 0029355999
    • Speaker identification and verification using Gaussian mixture speaker models
    • Reynolds D.A. Speaker identification and verification using Gaussian mixture speaker models. Speech Comm. 17 (1995) 91-108
    • (1995) Speech Comm. , vol.17 , pp. 91-108
    • Reynolds, D.A.1
  • 29
  • 33
    • 34547499683
    • Incorporating auditory feature uncertainties in robust speaker identification
    • Shao, Y., Srinivasan, S., Wang, D.L., 2007. Incorporating auditory feature uncertainties in robust speaker identification. In: Proc. ICASSP, Vol. IV, pp. 277-280.
    • (2007) Proc. ICASSP , vol.4 , pp. 277-280
    • Shao, Y.1    Srinivasan, S.2    Wang, D.L.3
  • 34
    • 33744996003
    • Model-based sequential organization in cochannel speech
    • Shao Y., and Wang D.L. Model-based sequential organization in cochannel speech. IEEE Trans. Audio Speech Lang. Process. 14 1 (2006) 289-298
    • (2006) IEEE Trans. Audio Speech Lang. Process. , vol.14 , Issue.1 , pp. 289-298
    • Shao, Y.1    Wang, D.L.2
  • 35
    • 34047272127
    • Average divergence distance as a statistical discrimination measure for hidden Markov models
    • Silva J., and Narayanan S. Average divergence distance as a statistical discrimination measure for hidden Markov models. IEEE Trans. Audio Speech Lang. Process. 14 3 (2006) 890-906
    • (2006) IEEE Trans. Audio Speech Lang. Process. , vol.14 , Issue.3 , pp. 890-906
    • Silva, J.1    Narayanan, S.2
  • 36
    • 56249136428
    • Transforming binary uncertainties for robust speech recognition
    • Srinivasan S., and Wang D.L. Transforming binary uncertainties for robust speech recognition. IEEE Trans. Audio Speech Lang. Process. 15 7 (2007) 2130-2140
    • (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , Issue.7 , pp. 2130-2140
    • Srinivasan, S.1    Wang, D.L.2
  • 37
    • 0027623210
    • Assessment for automatic speech recognition: II. NOISEX-92: a database and an experiment to study the effect of additive noise on speech recognition systems
    • Varga A., and Steeneken H.J.M. Assessment for automatic speech recognition: II. NOISEX-92: a database and an experiment to study the effect of additive noise on speech recognition systems. Speech Comm. 12 3 (1993) 247-251
    • (1993) Speech Comm. , vol.12 , Issue.3 , pp. 247-251
    • Varga, A.1    Steeneken, H.J.M.2
  • 38
    • 3042623400
    • On the efficient evaluation of probabilistic similarity functions for image retrieval
    • Vasconcelos N. On the efficient evaluation of probabilistic similarity functions for image retrieval. IEEE Trans. Inform. Theory 50 7 (2004) 1482-1496
    • (2004) IEEE Trans. Inform. Theory , vol.50 , Issue.7 , pp. 1482-1496
    • Vasconcelos, N.1
  • 39
    • 84892233308
    • On ideal binary mask as the computational goal of auditory scene analysis
    • Divenyi P. (Ed), Kluwer Academic, Norwell, MA
    • Wang D.L. On ideal binary mask as the computational goal of auditory scene analysis. In: Divenyi P. (Ed). Speech Separation by Humans and Machines (2005), Kluwer Academic, Norwell, MA 181-197
    • (2005) Speech Separation by Humans and Machines , pp. 181-197
    • Wang, D.L.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.