SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 19, Issue 6, 2011, Pages 1556-1568

Sound event recognition with probabilistic distance SVMs

a INSTITUTE FOR INFOCOMM RESEARCH (Singapore)

Author keywords

Divergence distance; probabilistic distance; sound characterization; sound event recognition; subband temporal envelope (STE); support vector machine (SVM)

Indexed keywords

DIVERGENCE DISTANCE; PROBABILISTIC DISTANCE; SOUND CHARACTERIZATION; SOUND EVENT RECOGNITION; SUB-BANDS; GEARS; PROBABILITY DISTRIBUTIONS; SPEECH RECOGNITION; SUPPORT VECTOR MACHINES; AUDIO ACOUSTICS

EID: 79957687384 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2010.2093519 Document Type: Article

Times cited : (58)

References (40)

1
- 0004244302
- Englewood Cliffs, NJ: Prentice-Hall
- L. Rabiner and B. H. Juang, Fundamentals of Speech Recognition. Englewood Cliffs, NJ: Prentice-Hall, 1993.
- (1993) Fundamentals of Speech Recognition
- Rabiner, L.¹ Juang, B.H.²

2
- 70449615870
- New York: Springer
- H. Beigi, Fundamentals of Speaker Recognition. New York: Springer, 2009.
- (2009) Fundamentals of Speaker Recognition
- Beigi, H.¹

3
- 0030242072
- Content-based classification, search, and retrieval of audio
- E. Wold, T. Blum, D. Keislar, and J. Wheaton, "Content-based classification, search, and retrieval of audio," IEEE Multimedia, vol. 3, no. 3, pp. 27-36, Fall, 1996. (Pubitemid 126571576)
- (1996) IEEE Multimedia , vol.3 , Issue.3 , pp. 27-36
- Wold, E.¹ Blum, T.² Keislar, D.³ Wheaton, J.⁴

4
- 0141855203
- Automatic identification of bird species based on sinusoidal modeling of syllables
- A. Härmä, "Automatic identification of bird species based on sinusoidal modeling of syllables," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2003, pp. 545-548.
- (2003) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 545-548
- Härmä, A.¹

5
- 34347345718
- Parametric representations of bird sounds for automatic species recognition
- P. Somervuo, A. Härmä, and S. Fagerlund, "Parametric representations of bird sounds for automatic species recognition," IEEE Trans. Speech Audio Process., vol. 14, pp. 2252-2263, 2006.
- (2006) IEEE Trans. Speech Audio Process. , vol.14 , pp. 2252-2263
- Somervuo, P.¹ Härmä, A.² Fagerlund, S.³

6
- 34547518478
- Acoustic monitoring of singing insects
- Ganchev et al., "Acoustic monitoring of singing insects," in Proc. Conf. Acoust., Speech, Signal Process. (ICASSP), 2007, pp. 721-724.
- (2007) Proc. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 721-724
- Ganchev¹

7
- 68149163531
- Environmental sound recognition with time-frequency audio features
- Aug.
- S. Chu, S. Narayanan, and C. J. Kuo, "Environmental sound recognition with time-frequency audio features," Trans. Audio, Speech, Lang. Process., vol. 17, no. 6, pp. 1142-1158, Aug. 2009.
- (2009) Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.6 , pp. 1142-1158
- Chu, S.¹ Narayanan, S.² Kuo, C.J.³

8
- 0036648502
- Musical genre classification of audio signals
- DOI 10.1109/TSA.2002.800560, PII 1011092002800560
- G. Tzanetakis and P. Cook, "Musical genre classification of audio signals," IEEE Trans. Speech Audio Process., vol. 10, no. 5, pp. 293-302, Jul. 2002. (Pubitemid 34950067)
- (2002) IEEE Transactions on Speech and Audio Processing , vol.10 , Issue.5 , pp. 293-302
- Tzanetakis, G.¹ Cook, P.²

9
- 17444399233
- Musical instrument timbres classification with spectral features
- DOI 10.1155/S1110865703210118
- G. Agostini, M. Longari, and E. Pollastri, "Musical instrument timbres classification with spectral features," EURASIP J. Appl. Signal Process. , pp. 5-14, Jan. 2003. (Pubitemid 41283787)
- (2003) Eurasip Journal on Applied Signal Processing , vol.2003 , Issue.1 , pp. 5-14
- Agostini, G.¹ Longari, M.² Pollastri, E.³

10
- 76949083398
- Dynamic spectral envelope modeling for timbre analysis of musical instrument sounds
- Mar.
- J. J. Burred, A. Röbel, and T. Sikora, "Dynamic spectral envelope modeling for timbre analysis of musical instrument sounds," Trans. Audio, Speech, Lang. Process., vol. 18, no. 3, pp. 663-674, Mar. 2010.
- (2010) Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.3 , pp. 663-674
- Burred, J.J.¹ Röbel, A.² Sikora, T.³

11
- 44849144388
- Scream and gunshot detection and localization for audio-surveillance systems
- G. Valenzise, L. Gerosa, M. Tagliasacchi, F. Antonacci, and A. Sarti, "Scream and gunshot detection and localization for audio-surveillance systems," in Proc. IEEE Conf. Adv. Video Signal Based Surveill., 2007, pp. 21-26.
- (2007) Proc. IEEE Conf. Adv. Video Signal Based Surveill. , pp. 21-26
- Valenzise, G.¹ Gerosa, L.² Tagliasacchi, M.³ Antonacci, F.⁴ Sarti, A.⁵

12
- 0035758890
- Footstep detection and tracking
- G. Succi, D. Clapp, R. Gampert, and G. Prado, "Footstep detection and tracking," in Proc. SPIE, 2001, vol. 4393, pp. 22-26.
- (2001) Proc. SPIE , vol.4393 , pp. 22-26
- Succi, G.¹ Clapp, D.² Gampert, R.³ Prado, G.⁴

13
- 33749069115
- Audio analysis for surveillance applications
- DOI 10.1109/ASPAA.2005.1540194, 1540194, 2005 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
- R. Radhakrishnan, A. Divakaran, and P. Smaragdis, "Audio analysis for surveillance applications," in Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust., 2005, pp. 158-161. (Pubitemid 44461818)
- (2005) IEEE Workshop on Applications of Signal Processing to Audio and Acoustics , pp. 158-161
- Radhakrishnan, R.¹ Divakaran, A.² Smaragdis, P.³

14
- 79957753290
- Proc. (RT-07) Rich Transcription Meeting Recognition Evaluation Plan, [Online]. Available: http://www.nist.gov/speech/tests/rt/rt2007
- Proc. (RT-07) Rich Transcription Meeting Recognition Evaluation Plan

15
- 79957699455
- Southampton, U.K., Apr. 6-7, 2006, Revised Selected Papers, Lecture Notes in Computer Science Springer Berlin/Heidelberg, 978-3-540-69567-7
- in Proc. Multimodal Technologies for Perception of Humans First International Evaluation Workshop on Classification of Events, Activities and Relationships, CLEAR 2006, Southampton, U.K., Apr. 6-7, 2006, vol. 4122/2007, Revised Selected Papers, Lecture Notes in Computer Science Springer Berlin/Heidelberg, 978-3-540-69567-7.
- (2007) Proc. Multimodal Technologies for Perception of Humans First International Evaluation Workshop on Classification of Events, Activities and Relationships, CLEAR 2006 , vol.4122

16
- 79957754959
- Baltimore, MD, May 8-11, 2007, Revised Selected Papers, Lecture Notes in Computer Science Springer Berlin/Heidelberg, 978-3-540-68584-5
- in Proc. Multimodal Technologies for Perception of Humans International Evaluation Workshops CLEAR 2007 and RT 2007, Baltimore, MD, May 8-11, 2007, vol. 4625/2009, Revised Selected Papers, Lecture Notes in Computer Science Springer Berlin/Heidelberg, 978-3-540-68584-5.
- (2009) Proc. Multimodal Technologies for Perception of Humans International Evaluation Workshops CLEAR 2007 and RT 2007 , vol.4625

17
- 0032828464
- A model of auditory perception as front end for automatic speech recognition
- J. Tchorz and B. Kollmeier, "A model of auditory perception as front end for automatic speech recognition," J. Acoust. Soc. Amer., vol. 106, no. 4, pp. 2040-2050, 1999.
- (1999) J. Acoust. Soc. Amer. , vol.106 , Issue.4 , pp. 2040-2050
- Tchorz, J.¹ Kollmeier, B.²

18
- 70349223037
- An auditory-based feature for robust speech recognition
- Y. Shao, Z. Jin, D. Wang, and S. Srinivasan, "An auditory-based feature for robust speech recognition," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2009, pp. 4625-4628.
- (2009) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 4625-4628
- Shao, Y.¹ Jin, Z.² Wang, D.³ Srinivasan, S.⁴

19
- 84898982939
- Exploiting generative models in discriminative classifiers
- Cambridge, MA: MIT Press
- T. Jaakkola and D. Haussler, "Exploiting generative models in discriminative classifiers," in Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 1998, vol. 11, pp. 487-493.
- (1998) Advances in Neural Information Processing Systems. , vol.11 , pp. 487-493
- Jaakkola, T.¹ Haussler, D.²

20
- 14644412368
- Speaker verification using sequence discriminant support vector machines
- DOI 10.1109/TSA.2004.841042
- V. Wan and S. Renals, "Speaker verification using sequence discriminant support vector machines," IEEE Trans. Speech Audio Process., vol. 13, no. 2, pp. 203-210, Mar. 2005. (Pubitemid 40320239)
- (2005) IEEE Transactions on Speech and Audio Processing , vol.13 , Issue.2 , pp. 203-210
- Wan, V.¹ Renals, S.²

21
- 17644380073
- A Kullback-Leibler divergence based kernel for svm classification in multimedia applications
- Cambridge, MA: MIT Press
- P. J. Moreno, P. P. Ho, and N. Vasconcelos, "A Kullback-Leibler divergence based kernel for svm classification in multimedia applications," in Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 2003, vol. 16.
- (2003) Advances in Neural Information Processing Systems , vol.16
- Moreno, P.J.¹ Ho, P.P.² Vasconcelos, N.³

22
- 0034313871
- Earth mover's distance as a metric for image retrieval
- DOI 10.1023/A:1026543900054
- Y. Rubner, C. Tomasi, and L. Guibas, "The Earth Mover's distance as a metric for image retrieval," Int. J. Comput. Vis., vol. 40, no. 2, pp. 99-121, 2000. (Pubitemid 32136368)
- (2000) International Journal of Computer Vision , vol.40 , Issue.2 , pp. 99-121
- Rubner, Y.¹ Tomasi, C.² Guibas, L.J.³

23
- 0032594951
- Support vector machines for histogram-based image classification
- Sep.
- O. Chapelle, P. Haffner, and V. Vapnik, "Support vector machines for histogram-based image classification," IEEE Trans. Neural Netw., vol. 10, no. 5, pp. 1055-1064, Sep. 1999.
- (1999) IEEE Trans. Neural Netw. , vol.10 , Issue.5 , pp. 1055-1064
- Chapelle, O.¹ Haffner, P.² Vapnik, V.³

24
- 9444269199
- Bhattacharyya and expected likelihood kernels
- Learning Theory and Kernel Machines
- T. Jebara and R. Kondor, "Bhattacharyya and expected likelihood kernels," Lecture Notes in Computer Science, vol. 2777, pp. 57-71, 2003. (Pubitemid 37053195)
- (2003) Lecture Notes in Computer Science , Issue.2777 , pp. 57-71
- Jebara, T.¹ Kondor, R.²

25
- 33645887246
- Support vector machines using GMM supervectors for speaker verification
- May
- W. M. Campbell, D. E. Sturim, and D. A. Reynolds, "Support vector machines using GMM supervectors for speaker verification," IEEE Signal Process. Lett., vol. 13, no. 5, pp. 308-311, May 2006.
- (2006) IEEE Signal Process. Lett. , vol.13 , Issue.5 , pp. 308-311
- Campbell, W.M.¹ Sturim, D.E.² Reynolds, D.A.³

26
- 85008056687
- A SVM Kernel with GMM-supervector based on the Bhattacharyya distance for speaker recognition
- C. H. You, K.-A. Lee, and H. Li, "A SVM Kernel with GMM-supervector based on the Bhattacharyya distance for speaker recognition," IEEE Signal Process. Lett., vol. 16, no. 1, pp. 49-52, 2009.
- (2009) IEEE Signal Process. Lett. , vol.16 , Issue.1 , pp. 49-52
- You, C.H.¹ Lee, K.-A.² Li, H.³

27
- 0003450542
- New York: Springer-Verlag
- V. N. Vapnik, The Nature of Statistical Learning Theory. Information Science and Statistics. New York: Springer-Verlag, 2000.
- (2000) The Nature of Statistical Learning Theory. Information Science and Statistics
- Vapnik, V.N.¹

28
- 0035789613
- Proximal support vector machine classifiers
- San Francisco, CA
- G. Fung and O. L. Mangasarian, "Proximal support vector machine classifiers," in Proc. KDD-2001: Knowledge Discovery and Data Mining, August 26-29, 2001, San Francisco, CA, 2001, pp. 77-86.
- (2001) Proc. KDD-2001: Knowledge Discovery and Data Mining, August 26-29, 2001 , pp. 77-86
- Fung, G.¹ Mangasarian, O.L.²

29
- 0004236492
- Baltimore, MD: Johns Hopkins Univ. Press
- G. H. Golub and C. F. Van Loan, Matrix Computations. Baltimore, MD: Johns Hopkins Univ. Press, 1996.
- (1996) Matrix Computations
- Golub, G.H.¹ Van Loan, C.F.²

30
- 30344461377
- J. R. Movellan, Tutorial on Gabor Filters [Online]. Available: www.mplab.ucsd.edu/tutorials/gabor.pdf
- Tutorial on Gabor Filters
- Movellan, J.R.¹

31
- 0003873095
- Cambridge, MA: Birkhauser
- A. Teolis, Computational Signal Processing with Wavelets. Cambridge, MA: Birkhauser, 1998, p. 65.
- (1998) Computational Signal Processing with Wavelets , pp. 65
- Teolis, A.¹

32
- 33646791902
- Generalized Gamma modeling of speech and its online estimation for speech enhancement
- T. H. Dat, K. Takeda, and F. Itakura, "Generalized Gamma modeling of speech and its online estimation for speech enhancement," in Proc. 30th IEEE Int. Conf. Acoust. Speech Signal Process., ICASSP'05, 2005, vol. 4, pp. 181-184.
- (2005) Proc. 30th IEEE Int. Conf. Acoust. Speech Signal Process., ICASSP'05 , vol.4 , pp. 181-184
- Dat, T.H.¹ Takeda, K.² Itakura, F.³

33
- 33947678069
- Multichannel speech enhancement based on speech spectral magnitude estimation using generalized gamma prior distribution
- T. H. Dat, K. Takeda, and F. Itakura, "Multichannel speech enhancement based on speech spectral magnitude estimation using generalized gamma prior distribution," in Proc. 31st IEEE Int. Conf. Acoust. Speech Signal Process., ICASSP'06, vol. 4, pp. 439-447.
- Proc. 31st IEEE Int. Conf. Acoust. Speech Signal Process., ICASSP'06 , vol.4 , pp. 439-447
- Dat, T.H.¹ Takeda, K.² Itakura, F.³

34
- 0000238336
- A simplex method for function minimization
- J. A. Nelder and R. Mead, "A simplex method for function minimization," Comput. J., vol. 7, pp. 308-313, 1965.
- (1965) Comput. J. , vol.7 , pp. 308-313
- Nelder, J.A.¹ Mead, R.²

35
- 0033884858
- Speaker verification using adapted Gaussian mixture models
- DOI 10.1006/dspr.1999.0361
- D. A. Reynolds, T. F. Quatieri, and R. B. Dunn, "Speaker verification using adapted Gaussian mixture models," Digital Signal Process., vol. 10, pp. 19-41, 2000. (Pubitemid 30592166)
- (2000) Digital Signal Processing: A Review Journal , vol.10 , Issue.1 , pp. 19-41
- Reynolds, D.A.¹ Quatieri, T.F.² Dunn, R.B.³

36
- 70149103919
- "Calculators and the Gamma Function," [Online]. Available: http://www.rskey.org/gamma.htm
- Calculators and the Gamma Function

37
- 85037096112
- "Sound Effect Collections," [Online]. Available: http://www.soundideas.com/
- Sound Effect Collections

38
- 33750380834
- On-line Gaussian mixture modeling in the log-power domain for signal-to-noise ratio estimation and speech enhancement
- DOI 10.1016/j.specom.2006.06.009, PII S016763930600080X
- T. H. Dat, K. Takeda, and F. Itakura, "On-line Gaussian mixture modeling in the log-power domain for signal-to-noise ratio estimation and speech enhancement," Speech Commun., vol. 48, no. 11, pp. 1515-1527, 2006. (Pubitemid 44634771)
- (2006) Speech Communication , vol.48 , Issue.11 , pp. 1515-1527
- Dat, T.H.¹ Takeda, K.² Itakura, F.³

39
- 84921750206
- "Statistical Pattern Recognition Toolbox," [Online]. Available: http://cmp.felk.cvut.cz/cmp/software/stprtool/
- Statistical Pattern Recognition Toolbox

40
- 0028430538
- Fast Gabor-like windowed Fourier and continuous wavelet transforms
- May
- M. Unser, "Fast Gabor-like windowed Fourier and continuous wavelet transforms," IEEE Signal Process. Lett., vol. 1, no. 5, pp. 76-79, May 1994.
- (1994) IEEE Signal Process. Lett. , vol.1 , Issue.5 , pp. 76-79
- Unser, M.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.