SCOPUS Information Search Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 21, Issue 2, 2013, Pages 367-377

Image feature representation of the subband power distribution for robust sound event classification

(3) Dennis, Jonathan (a); Tran, Huy Dat (a); Chng, Eng Siong (b)

(a) INSTITUTE FOR INFOCOMM RESEARCH (Singapore)

(b) NANYANG TECHNOLOGICAL UNIVERSITY (Singapore)

Author keywords

missing feature theory; Sound event classification; spectrogram; subband power distribution (SPD)

Indexed keywords

ACOUSTIC SURVEILLANCE; ENVIRONMENTAL SOUNDS; FEATURE CLASSIFICATION; IMAGE FEATURE REPRESENTATION; IMAGE FEATURES; MISSING FEATURE THEORIES; NEAREST NEIGHBOR CLASSIFIER; NOISE CONDITIONS; NONSTATIONARY NOISE; POWER DISTRIBUTIONS; SOUND EVENT CLASSIFICATION; SPECTRAL POWER DISTRIBUTION; SPECTROGRAMS; SUBBANDS;

ACOUSTIC NOISE; AUDITION; IMAGE PROCESSING; SPECTROGRAPHS; TWO DIMENSIONAL;

AUDIO ACOUSTICS;

EID: 84871391219 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2012.2226160 Document Type: Article

Times cited : (82)

References (35)

1
- 84863730848
- Scream and gunshot detection in noisy environments
- Poznan, Poland Sep. 3-7
- L. Gerosa, G. Valenzise, F. Antonacci, M. Tagliasacchi, and A. Sarti, "Scream and gunshot detection in noisy environments," in Proc. 15th Eur. Signal Process. Conf., Poznan, Poland, Sep. 3-7, 2007.
- (2007) Proc. 15th Eur. Signal Process. Conf.
- Gerosa, L.¹ Valenzise, G.² Antonacci, F.³ Tagliasacchi, M.⁴ Sarti, A.⁵

2
- 80051605016
- Audio recognition in the wild: Static and dynamic classification on a real-world database of animal vocalizations
- F. Weninger and B. Schuller, "Audio recognition in the wild: Static and dynamic classification on a real-world database of animal vocalizations," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2011, pp. 337-340.
- (2011) Proc IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 337-340
- Weninger, F.¹ Schuller, B.²

3
- 68149163531
- Environmental sound recognition with time-frequency audio features
- Aug
- S. Chu, S. Narayanan, and C. Kuo, "Environmental sound recognition with time-frequency audio features," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 6, pp. 1142-1158, Aug. 2009.
- (2009) IEEE Trans. Audio, Speech, Lang. Process , vol.17 , Issue.6 , pp. 1142-1158
- Chu, S.¹ Narayanan, S.² Kuo, C.³

4
- 85008548582
- Time-frequency matrix feature extraction and classification of environmental audio signals
- Sep.
- B. Ghoraani and S. Krishnan, "Time-frequency matrix feature extraction and classification of environmental audio signals," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 7, pp. 2197-2209, Sep. 2011.
- (2011) IEEE Trans. Audio, Speech, Lang. Process , vol.19 , Issue.7 , pp. 2197-2209
- Ghoraani, B.¹ Krishnan, S.²

5
- 85032753469
- Machine hearing: An emerging field
- Sep.
- R. Lyon, "Machine hearing: An emerging field," IEEE Signal Process. Mag., vol. 27, no. 5, pp. 131-139, Sep. 2010.
- (2010) IEEE Signal Process. Mag , vol.27 , Issue.5 , pp. 131-139
- Lyon, R.¹

6
- 78650982481
- Spectrogram image feature for sound event classification in mismatched conditions
- J. Dennis, H. Tran, and H. Li, "Spectrogram image feature for sound event classification in mismatched conditions," IEEE Signal Process. Lett., vol. 18, no. 2, pp. 130-133, 2011.
- (2011) IEEE Signal Process. Lett , vol.18 , Issue.2 , pp. 130-133
- Dennis, J.¹ Tran, H.² Li, H.³

7
- 84871375628
- Notes on spectrogram reading
- V. Zue, "Notes on spectrogram reading," Mass. Inst. Tech. Course, 1985.
- (1985) Mass. Inst. Tech. Course
- Zue, V.¹

8
- 85032752225
- Missing-feature approaches in speech recognition
- DOI 10.1109/MSP.2005.1511828
- B. Raj and R. Stern, "Missing-feature approaches in speech recognition," IEEE Signal Process. Mag., vol. 22, no. 5, pp. 101-116, Sep. 2005. (Pubitemid 41488524)
- (2005) IEEE Signal Processing Magazine , vol.22 , Issue.5 , pp. 101-116
- Raj, B.¹ Stern, R.M.²

9
- 84865804537
- Image representation of the subband power distribution for robust sound classification
- Aug
- J. Dennis, H. Tran, and H. Li, "Image representation of the subband power distribution for robust sound classification," in Proc. 12 Annu. Conf. Int. Speech Commun. Assoc., Aug. 2011, pp. 2437-2440.
- (2011) Proc. 12 Annu. Conf. Int. Speech Commun. Assoc. , pp. 2437-2440
- Dennis, J.¹ Tran, H.² Li, H.³

10
- 0042830801
- Comparison of techniques for environmental sound recognition
- DOI 10.1016/S0167-8655(03)00147-8
- M. Cowling and R. Sitte, "Comparison of techniques for environmental sound recognition," Pattern Recognit. Lett., vol. 24, no. 15, pp. 2895-2907, 2003. (Pubitemid 37027809)
- (2003) Pattern Recognition Letters , vol.24 , Issue.15 , pp. 2895-2907
- Cowling, M.¹ Sitte, R.²

11
- 77950315879
- Environment recognition from audio using mpeg-7 features
- G. Muhammad and K. Alghathbar, "Environment recognition from audio using mpeg-7 features," in Proc. 4th IEEE Int. Conf. Embedded and Multimedia Comput. (EM-Com '09), 2009, pp. 1-6.
- (2009) Proc. 4th IEEE Int. Conf. Embedded and Multimedia Comput. (EM-Com '09) , pp. 1-6
- Muhammad, G.¹ Alghathbar, K.²

12
- 34347345718
- Parametric representations of bird sounds for automatic species recognition
- Nov
- P. Somervuo, A. Harma, and S. Fagerlund, "Parametric representations of bird sounds for automatic species recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 6, pp. 2252-2263, Nov. 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process , vol.14 , Issue.6 , pp. 2252-2263
- Somervuo, P.¹ Harma, A.² Fagerlund, S.³

13
- 76949107820
- Sound indexing using morphological description
- Mar.
- G. Peeters and E. Deruty, "Sound indexing using morphological description," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 3, pp. 675-687, Mar. 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process , vol.18 , Issue.3 , pp. 675-687
- Peeters, G.¹ Deruty, E.²

14
- 79957687384
- Sound event recognition with probabilistic distance SVMs
- Aug.
- H. Tran and L. Haizhou, "Sound event recognition with probabilistic distance SVMs," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 6, pp. 1556-1568, Aug. 2011.
- (2011) IEEE Trans. Audio, Speech, Lang. Process , vol.19 , Issue.6 , pp. 1556-1568
- Tran, H.¹ Haizhou, L.²

15
- 14244272507
- Methods for capturing spectro-temporal modulations in automatic speech recognition
- M. Kleinschmidt, "Methods for capturing spectro-temporal modulations in automatic speech recognition," Acta Acustica United With Acustica, vol. 88, no. 3, pp. 416-422, 2002. (Pubitemid 34732124)
- (2002) Acta Acustica united with Acustica , vol.88 , Issue.3 , pp. 416-422
- Kleinschmidt, M.¹

16
- 0033709098
- Tandem connectionist feature extraction for conventional hmm systems
- H. Hermansky, D. Ellis, and S. Sharma, "Tandem connectionist feature extraction for conventional hmm systems," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP '00), 2000, vol. 3, pp. 1635-1638.
- (2000) Proc IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP '00) , vol.3 , pp. 1635-1638
- Hermansky, H.¹ Ellis, D.² Sharma, S.³

17
- 85085181752
- Classification of music signals in the visual domain
- H. Deshpande, R. Singh, and U. Nam, "Classification of music signals in the visual domain," in Proc. COST-G6 Conf. Digital Audio Effects, 2001.
- (2001) Proc. COST-G6 Conf. Digital Audio Effects
- Deshpande, H.¹ Singh, R.² Nam, U.³

18
- 84871378577
- Audio classification using acoustic images for retrieval from multimedia databases
- I. Paraskevas and E. Chilton, "Audio classification using acoustic images for retrieval from multimedia databases," in Proc. 4th EURASIP Conf. Focused Video/Image Process. and Multimedia Commun., 2003, vol. 1, pp. 187-192.
- (2003) Proc. 4th EURASIP Conf. Focused Video/Image Process. and Multimedia Commun , vol.1 , pp. 187-192
- Paraskevas, I.¹ Chilton, E.²

19
- 70349205535
- Audio classification from time-frequency texture
- G. Yu and J. Slotine, "Audio classification from time-frequency texture," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., (ICASSP '09), 2009, pp. 1677-1680.
- (2009) Proc IEEE Int. Conf. Acoust., Speech, Signal Process., (ICASSP '09) , pp. 1677-1680
- Yu, G.¹ Slotine, J.²

20
- 84863744672
- Gradient-based musical feature extraction based on scale-invariant feature transform
- T. Matsui, M. Goto, J. Vert, and Y. Uchiyama, "Gradient-based musical feature extraction based on scale-invariant feature transform," in Proc. 19th Eur. Signal Process. Conf., 2011, pp. 724-728.
- (2011) Proc. 19th Eur. Signal Process. Conf , pp. 724-728
- Matsui, T.¹ Goto, M.² Vert, J.³ Uchiyama, Y.⁴

21
- 0035342414
- Robust automatic speech recognition with missing and unreliable acoustic data
- DOI 10.1016/S0167-6393(00)00034-0, PII S0167639300000340
- M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data," Speech Commun., vol. 34, no. 3, pp. 267-285, 2001. (Pubitemid 32284867)
- (2001) Speech Communication , vol.34 , Issue.3 , pp. 267-285
- Cooke, M.¹ Green, P.² Josifovski, L.³ Vizinho, A.⁴

22
- 4644317224
- A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition
- M. Seltzer, B. Raj, and R. Stern, "A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition," Speech Commun., vol. 43, no. 4, pp. 379-393, 2004.
- (2004) Speech Commun , vol.43 , Issue.4 , pp. 379-393
- Seltzer, M.¹ Raj, B.² Stern, R.³

23
- 0141624530
- An efficient auditory filterbank based on the gammatone function
- R. Patterson, I. Nimmo-Smith, J. Holdsworth, and P. Rice, "An efficient auditory filterbank based on the gammatone function," APU Rep., 1988, vol. 2341.
- (1988) APU Rep , pp. 2341
- Patterson, R.¹ Nimmo-Smith, I.² Holdsworth, J.³ Rice, P.⁴

24
- 0003913694
- An efficient implementation of the Patterson-Holdsworth auditory filter bank
- Tech. Rep.
- M. Slaney, "An efficient implementation of the Patterson-Holdsworth auditory filter bank," Apple Computer, 1993, Tech. Rep.
- (1993) Apple Computer
- Slaney, M.¹

25
- 0003626435
- Upper Saddle River NJ Prentice-Hall ISBN 0-201-18075-8
- R. Gonzalez and R. Woods, Digital Image Processing. Upper Saddle River, NJ: Prentice-Hall, 2002, ISBN 0-201-18075-8.
- (2002) Digital Image Processing
- Gonzalez, R.¹ Woods, R.²

26
- 3042535216
- Distinctive image features from scale-invariant keypoints
- D. Lowe, "Distinctive image features from scale-invariant keypoints," Int. J. Comput. Vis., vol. 60, no. 2, pp. 91-110, 2004.
- (2004) Int. J. Comput. Vis , vol.60 , Issue.2 , pp. 91-110
- Lowe, D.¹

27
- 0018455310
- Suppression of acoustic noise in speech using spectral subtraction
- S. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, no. 2, pp. 113-120, Apr. 1979. (Pubitemid 9467471)
- (1979) IEEE Trans Acoust Speech Signal Process , vol.ASSP-27 , Issue.2 , pp. 113-120
- Boll Steven, F.¹

28
- 0024753593
- Speech recognition using noise-adaptive prototypes
- DOI 10.1109/29.35387
- A. Nádas, D. Nahamoo, and M. Picheny, "Speech recognition using noise-adaptive prototypes," IEEE Trans. Acoust., Speech, Signal Process., vol. 37, no. 10, pp. 1495-1503, Oct. 1989. (Pubitemid 20617876)
- (1989) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.37 , Issue.10 , pp. 1495-1503
- Nadas Arthur¹ Nahamoo David² Picheny Michael, A.³

29
- 51949090223
- In defense of nearest-neighbor based image classification
- O. Boiman, E. Shechtman, and M. Irani, "In defense of nearest-neighbor based image classification," in Proc. IEEE Comput. Vision, Pattern Recognit. (CVPR), 2008, pp. 1-8.
- (2008) Proc IEEE Comput. Vision, Pattern Recognit. (CVPR) , pp. 1-8
- Boiman, O.¹ Shechtman, E.² Irani, M.³

30
- 84856621489
- Hellinger distance decision trees are robust and skew-insensitive
- D. A. Cieslak, T. R. Hoens, N. V. Chawla, and W. P. Kegelmeyer, "Hellinger distance decision trees are robust and skew-insensitive," Data Min. Knowl. Discov., pp. 136-158, 2012.
- (2012) Data Min. Knowl. Discov , pp. 136-158
- Cieslak, D.A.¹ Hoens, T.R.² Chawla, N.V.³ Kegelmeyer, W.P.⁴

31
- 78049391669
- Acoustical sound database in real environments for sound scene understanding and hands-free speech recognition
- S. Nakamura, K. Hiyane, F. Asano, T. Nishiura, and T. Yamada, "Acoustical sound database in real environments for sound scene understanding and hands-free speech recognition," in Proc. ICLRE, 2000, pp. 965-968.
- (2000) Proc. ICLRE , pp. 965-968
- Nakamura, S.¹ Hiyane, K.² Asano, F.³ Nishiura, T.⁴ Yamada, T.⁵

32
- 33745185408
- version 1.1 ETSI STQ Aurora DSR Working Group Tech. Rep. ES
- A. Sorin and T. Ramabadran, "Extended advanced front end algorithm description, version 1.1," ETSI STQ Aurora DSR Working Group, 2003, Tech. Rep. ES 202 212.
- (2003) Extended Advanced Front End Algorithm Description, Tech. Rep. ES 202 212
- Sorin, A.¹ Ramabadran, T.²

33
- 0003822743
- S. Young, G. Evermann, D. Kershaw, G. Moore, J. Odell, D. Ollason, V. Valtchev, and P. Woodland, The HTK Book.
- The HTK Book
- Young, S.¹ Evermann, G.² Kershaw, D.³ Moore, G.⁴ Odell, J.⁵ Ollason, D.⁶ Valtchev, V.⁷ Woodland, P.⁸

34
- 0027623210
- Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems
- A. Varga and H. Steeneken, "Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems," Speech Commun., vol. 12, no. 3, pp. 247-251, 1993.
- (1993) Speech Commun , vol.12 , Issue.3 , pp. 247-251
- Varga, A.¹ Steeneken, H.²

35
- 84871381773
- [Online]. Available
- J. Barker, M. Cooke, and D. P. Ellis, The RESPITE CASA Toolkit v1.3.5 [Online]. Available: http://staffwww.dcs.shef.ac.uk/people/J.Barker/ctk.html, 2002
- (2002) The RESPITE CASA Toolkit v1.3.5
- Barker, J.¹ Cooke, M.² Ellis, D.P.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.