SCOPUS Information Retrieval Platform

Speech Communication

Volume 51, Issue 8, 2009, Pages 657-667

Sequential organization of speech in computational auditory scene analysis

Authors (2): Shao, Yang (a); Wang, DeLiang (a, b)

a Ohio State University (United States)

b Ohio State University (United States)

Author keywords

Binary time frequency mask; Computational auditory scene analysis; Sequential organization; Speaker quantization

Indexed keywords

BACKGROUND MODEL; BINARY TIME-FREQUENCY MASK; COMPUTATIONAL AUDITORY SCENE ANALYSIS; GENERIC MODELS; HUMAN LISTENERS; PERFORMANCE LEVEL; PRIOR INFORMATION; SEQUENTIAL GROUPING; SEQUENTIAL ORGANIZATION; SPEAKER MODEL; SPEAKER QUANTIZATION; SPEECH INTERFERENCE; SYSTEMATIC EVALUATION;

PATIENT REHABILITATION;

EID: 67349134831 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2009.02.003 Document Type: Article

Times cited: 15

References (41)

1
- 11144316019
- Decoding speech in the presence of other sources
- Barker J., Cooke M., and Ellis D. Decoding speech in the presence of other sources. Speech Comm. 45 1 (2005) 5-25
- (2005) Speech Comm. , vol.45 , Issue.1 , pp. 5-25
- Barker, J.¹ Cooke, M.² Ellis, D.³

2
- 0036299275
- A Monte Carlo method for score normalization in automatic speaker verification using Kullback-Leibler distances
- Ben, M., Blouet, R., Bimbot, F., 2002. A Monte Carlo method for score normalization in automatic speaker verification using Kullback-Leibler distances. In: Proc. ICASSP, Vol. I, pp. 689-692.
- (2002) Proc. ICASSP , vol.1 , pp. 689-692
- Ben, M.¹ Blouet, R.² Bimbot, F.³

3
- 2942594475
- A tutorial on text-independent speaker verification
- Bimbot F., Bonastre J., Fredouille C., Gravier G., Magrin-Chagnolleau I., Meignier S., Merlin T., Ortega-Garcia J., Petrovska-Delacretaz D., and Reynolds D.A. A tutorial on text-independent speaker verification. EURASIP J. Appl. Signal Process. 4 (2004) 430-451
- (2004) EURASIP J. Appl. Signal Process. , Issue.4 , pp. 430-451
- Bimbot, F.¹ Bonastre, J.² Fredouille, C.³ Gravier, G.⁴ Magrin-Chagnolleau, I.⁵ Meignier, S.⁶ Merlin, T.⁷ Ortega-Garcia, J.⁸ Petrovska-Delacretaz, D.⁹ Reynolds, D.A.¹⁰

4
- 0003684441
- MIT Press, Cambridge, MA
- Bregman A.S. Auditory Scene Analysis (1990), MIT Press, Cambridge, MA
- (1990) Auditory Scene Analysis
- Bregman, A.S.¹

5
- 0035106984
- Informational and energetic masking effects in the perception of two simultaneous talkers
- Brungart D.S. Informational and energetic masking effects in the perception of two simultaneous talkers. J. Acoust. Soc. Amer. 109 (2001) 1101-1109
- (2001) J. Acoust. Soc. Amer. , vol.109 , pp. 1101-1109
- Brungart, D.S.¹

6
- 80052339383
- Some experiments on the recognition of speech with one and with two ears
- Cherry E.C. Some experiments on the recognition of speech with one and with two ears. J. Acoust. Soc. Amer. 25 (1953) 975-979
- (1953) J. Acoust. Soc. Amer. , vol.25 , pp. 975-979
- Cherry, E.C.¹

7
- 34547539772
- Available at
- Cooke, M.P., Lee, T.W., 2006. Speech separation and recognition competition. Available at .
- (2006) Speech separation and recognition competition
- Cooke, M.P.¹ Lee, T.W.²

8
- 18744401086
- Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion
- Deng L., Droppo J., and Acero A. Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion. IEEE Trans. Speech Audio Process. 13 (2005) 412-421
- (2005) IEEE Trans. Speech Audio Process. , vol.13 , pp. 412-421
- Deng, L.¹ Droppo, J.² Acero, A.³

9
- 0003922190
- Wiley, New York
- Duda R.O., Hart P.E., and Stork D.G. Pattern Classification. second ed. (2001), Wiley, New York
- (2001) Pattern Classification. second ed.
- Duda, R.O.¹ Hart, P.E.² Stork, D.G.³

10
- 0033872977
- Approaches to speaker detection and tracking in conversational speech
- Dunn R.B., Reynolds D.A., and Quatieri T.F. Approaches to speaker detection and tracking in conversational speech. Digital Signal Process. 10 (2000) 93-112
- (2000) Digital Signal Process. , vol.10 , pp. 93-112
- Dunn, R.B.¹ Reynolds, D.A.² Quatieri, T.F.³

11
- 84873856136
- Model-based scene analysis
- Wang D.L., and Brown G.J. (Eds), Wiley-IEEE Press, Hoboken, NJ
- Ellis D.P.W. Model-based scene analysis. In: Wang D.L., and Brown G.J. (Eds). Computational Auditory Scene Analysis: Principles, Algorithms, and Applications (2006), Wiley-IEEE Press, Hoboken, NJ 115-146
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms, and Applications , pp. 115-146
- Ellis, D.P.W.¹

12
- 0004072715
- Marcel Dekker, New York
- Furui S. Digital Speech Processing, Synthesis, and Recognition (2001), Marcel Dekker, New York
- (2001) Digital Speech Processing, Synthesis, and Recognition
- Furui, S.¹

13
- 67349285580
- Helmholtz, H., 1863. On the Sensations of Tone (A.J. Ellis, Trans.), Second English ed., Dover Publishers, New York.

14
- 34547516258
- Approximating the Kullback-Leibler divergence between Gaussian mixture models
- Hershey, J.R., Olsen, P.A., 2007. Approximating the Kullback-Leibler divergence between Gaussian mixture models. In: Proc. ICASSP, Vol. IV, pp. 317-320.
- (2007) Proc. ICASSP , vol.4 , pp. 317-320
- Hershey, J.R.¹ Olsen, P.A.²

15
- 85045165251
- Ph.D. Dissertation, The Ohio State University
- Hu, G., 2006. Monaural speech organization and segregation. Ph.D. Dissertation, The Ohio State University.
- (2006) Monaural speech organization and segregation
- Hu, G.¹

16
- 4644265990
- Monaural speech segregation based on pitch tracking and amplitude modulation
- Hu G., and Wang D.L. Monaural speech segregation based on pitch tracking and amplitude modulation. IEEE Transactions on Neural Networks 15 (2004) 1135-1150
- (2004) IEEE Transactions on Neural Networks , vol.15 , pp. 1135-1150
- Hu, G.¹ Wang, D.L.²

17
- 46049084696
- An auditory scene analysis approach to monaural speech separation
- Hänsler E., and Schmidt G. (Eds), Springer, Heidelberg
- Hu G., and Wang D.L. An auditory scene analysis approach to monaural speech separation. In: Hänsler E., and Schmidt G. (Eds). Topics in Acoustic Echo and Noise Control (2006), Springer, Heidelberg 485-515
- (2006) Topics in Acoustic Echo and Noise Control , pp. 485-515
- Hu, G.¹ Wang, D.L.²

18
- 49249107353
- Segregation of unvoiced speech from nonspeech interference
- Hu G., and Wang D.L. Segregation of unvoiced speech from nonspeech interference. J. Acoust. Soc. Amer. 124 (2008) 1306-1319
- (2008) J. Acoust. Soc. Amer. , vol.124 , pp. 1306-1319
- Hu, G.¹ Wang, D.L.²

19
- 0004056285
- Prentice Hall, Upper Saddle River
- Huang X., Acero A., and Hon H. Spoken Language Processing (2001), Prentice Hall, Upper Saddle River
- (2001) Spoken Language Processing
- Huang, X.¹ Acero, A.² Hon, H.³

20
- 0004257990
- Dover, New York
- Kullback S. Information Theory and Statistics (1968), Dover, New York
- (1968) Information Theory and Statistics
- Kullback, S.¹

21
- 85009113950
- Speaker model quantization for unsupervised speaker indexing
- Kwon, S., Narayanan, S., 2004. Speaker model quantization for unsupervised speaker indexing. In: Proc. ICSLP, pp. 1517-1520.
- (2004) Proc. ICSLP , pp. 1517-1520
- Kwon, S.¹ Narayanan, S.²

22
- 27644599375
- Unsupervised speaker indexing using generic models
- Kwon S., and Narayanan S. Unsupervised speaker indexing using generic models. IEEE Trans. Speech Audio Process. 13 5 (2005) 1004-1013
- (2005) IEEE Trans. Speech Audio Process. , vol.13 , Issue.5 , pp. 1004-1013
- Kwon, S.¹ Narayanan, S.²

23
- 0003789815
- Academic, San Diego
- Moore B.C.J. An Introduction to the Psychology of Hearing. fifth ed. (2003), Academic, San Diego
- (2003) An Introduction to the Psychology of Hearing. fifth ed.
- Moore, B.C.J.¹

24
- 0142056390
- APU Report 2341, Cambridge, UK, MRC Applied Psychology Unit
- Patterson, R.D., Nimmo-Smith I., Holdsworth J., Rice P., 1988. An efficient auditory filterbank based on the gammatone function. APU Report 2341, Cambridge, UK, MRC Applied Psychology Unit.
- (1988) An efficient auditory filterbank based on the gammatone function
- Patterson, R.D.¹ Nimmo-Smith, I.² Holdsworth, J.³ Rice, P.⁴

25
- 29044450606
- NIST Speaker Recognition Evaluation Chronicles
- Przybocki, M.A., Martin, A.F., 2004. NIST Speaker Recognition Evaluation Chronicles. In: Proc. Odyssey 2004.
- (2004) Proc. Odyssey
- Przybocki, M.A.¹ Martin, A.F.²

26
- 0025256257
- An approach to co-channel talker interference suppression using a sinusoidal model for speech
- Quatieri T.F., and Danisewicz R.G. An approach to co-channel talker interference suppression using a sinusoidal model for speech. IEEE Trans. Acoust. Speech Signal Process. 38 (1990) 56-69
- (1990) IEEE Trans. Acoust. Speech Signal Process. , vol.38 , pp. 56-69
- Quatieri, T.F.¹ Danisewicz, R.G.²

27
- 4644336054
- Reconstruction of missing features for robust speech recognition
- Raj B., Seltzer M.L., and Stern R.M. Reconstruction of missing features for robust speech recognition. Speech Comm. 43 (2004) 275-296
- (2004) Speech Comm. , vol.43 , pp. 275-296
- Raj, B.¹ Seltzer, M.L.² Stern, R.M.³

28
- 0029355999
- Speaker identification and verification using Gaussian mixture speaker models
- Reynolds D.A. Speaker identification and verification using Gaussian mixture speaker models. Speech Comm. 17 (1995) 91-108
- (1995) Speech Comm. , vol.17 , pp. 91-108
- Reynolds, D.A.¹

29
- 0033884858
- Speaker verification using adapted Gaussian mixture models
- Reynolds D.A., Quatieri T.F., and Dunn R.B. Speaker verification using adapted Gaussian mixture models. Digital Signal Process. 10 (2000) 19-41
- (2000) Digital Signal Process. , vol.10 , pp. 19-41
- Reynolds, D.A.¹ Quatieri, T.F.² Dunn, R.B.³

30
- 0003516768
- Duxbury Press, Belmont, CA
- Rice J.A. Mathematical Statistics and Data Analysis (1995), Duxbury Press, Belmont, CA
- (1995) Mathematical Statistics and Data Analysis
- Rice, J.A.¹

31
- 0003584577
- Prentice Hall, Upper Saddle River, NJ
- Russell S., and Norvig P. Artificial Intelligence: A Modern Approach. second ed. (2003), Prentice Hall, Upper Saddle River, NJ
- (2003) Artificial Intelligence: A Modern Approach. second ed.
- Russell, S.¹ Norvig, P.²

32
- 46049084086
- Ph.D. Dissertation, The Ohio State University
- Shao, Y., 2007. Sequential organization in computational auditory scene analysis. Ph.D. Dissertation, The Ohio State University.
- (2007) Sequential organization in computational auditory scene analysis
- Shao, Y.¹

33
- 34547499683
- Incorporating auditory feature uncertainties in robust speaker identification
- Shao, Y., Srinivasan, S., Wang, D.L., 2007. Incorporating auditory feature uncertainties in robust speaker identification. In: Proc. ICASSP, Vol. IV, pp. 277-280.
- (2007) Proc. ICASSP , vol.4 , pp. 277-280
- Shao, Y.¹ Srinivasan, S.² Wang, D.L.³

34
- 33744996003
- Model-based sequential organization in cochannel speech
- Shao Y., and Wang D.L. Model-based sequential organization in cochannel speech. IEEE Trans. Audio Speech Lang. Process. 14 1 (2006) 289-298
- (2006) IEEE Trans. Audio Speech Lang. Process. , vol.14 , Issue.1 , pp. 289-298
- Shao, Y.¹ Wang, D.L.²

35
- 34047272127
- Average divergence distance as a statistical discrimination measure for hidden Markov models
- Silva J., and Narayanan S. Average divergence distance as a statistical discrimination measure for hidden Markov models. IEEE Trans. Audio Speech Lang. Process. 14 3 (2006) 890-906
- (2006) IEEE Trans. Audio Speech Lang. Process. , vol.14 , Issue.3 , pp. 890-906
- Silva, J.¹ Narayanan, S.²

36
- 56249136428
- Transforming binary uncertainties for robust speech recognition
- Srinivasan S., and Wang D.L. Transforming binary uncertainties for robust speech recognition. IEEE Trans. Audio Speech Lang. Process. 15 7 (2007) 2130-2140
- (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , Issue.7 , pp. 2130-2140
- Srinivasan, S.¹ Wang, D.L.²

37
- 0027623210
- Assessment for automatic speech recognition: II. Noisex-92: a database and an experiment to study the effect of additive noise on speech recognition systems
- Varga A., and Steeneken H.J.M. Assessment for automatic speech recognition: II. Noisex-92: a database and an experiment to study the effect of additive noise on speech recognition systems. Speech Comm. 12 3 (1993) 247-251
- (1993) Speech Comm. , vol.12 , Issue.3 , pp. 247-251
- Varga, A.¹ Steeneken, H.J.M.²

38
- 3042623400
- On the efficient evaluation of probabilistic similarity functions for image retrieval
- Vasconcelos N. On the efficient evaluation of probabilistic similarity functions for image retrieval. IEEE Trans. Inform. Theory 50 7 (2004) 1482-1496
- (2004) IEEE Trans. Inform. Theory , vol.50 , Issue.7 , pp. 1482-1496
- Vasconcelos, N.¹

39
- 84892233308
- On ideal binary mask as the computational goal of auditory scene analysis
- Divenyi P. (Ed), Kluwer Academic, Norwell, MA
- Wang D.L. On ideal binary mask as the computational goal of auditory scene analysis. In: Divenyi P. (Ed). Speech Separation by Humans and Machines (2005), Kluwer Academic, Norwell, MA 181-197
- (2005) Speech Separation by Humans and Machines , pp. 181-197
- Wang, D.L.¹

40
- 85011300842
- Feature-based speech segregation
- Wang D.L., and Brown G.J. (Eds), Wiley-IEEE Press, Hoboken, NJ
- Wang D.L. Feature-based speech segregation. In: Wang D.L., and Brown G.J. (Eds). Computational Auditory Scene Analysis: Principles, Algorithms, and Applications (2006), Wiley-IEEE Press, Hoboken, NJ 81-114
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms, and Applications , pp. 81-114
- Wang, D.L.¹

41
- 82255178542
- Wang D.L., and Brown G.J. (Eds), Wiley-IEEE Press, Hoboken, NJ
- In: Wang D.L., and Brown G.J. (Eds). Computational Auditory Scene Analysis: Principles, Algorithms, and Applications (2006), Wiley-IEEE Press, Hoboken, NJ
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.