SCOPUS Information Retrieval Platform

IEEE Signal Processing Letters

Volume 24, Issue 3, 2017, Pages 279-283

Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification

(2) Salamon, Justin (a); Bello, Juan Pablo (b)

a NEW YORK UNIVERSITY (United States)

b NEW YORK UNIVERSITY (United States)

Author keywords

Deep convolutional neural networks (CNNs); deep learning; environmental sound classification; urban sound dataset

Indexed keywords

ACOUSTIC SIGNAL PROCESSING; CLASSIFICATION (OF INFORMATION); CONVOLUTION; DEEP LEARNING; NETWORK ARCHITECTURE; NEURAL NETWORKS;

CLASSIFICATION ACCURACY; CONVOLUTIONAL NEURAL NETWORK; DATA AUGMENTATION; DICTIONARY LEARNING; ENVIRONMENTAL SOUND CLASSIFICATIONS; PRIMARY CONTRIBUTION; TEMPORAL PATTERN; URBAN SOUND DATASET;

DEEP NEURAL NETWORKS;

EID: 85015238568 PISSN: 10709908 EISSN: None Source Type: Journal
DOI: 10.1109/LSP.2017.2657381 Document Type: Article

Times cited : (1333)

References (32)

1
- 68149163531
- Environmental sound recognition with time-frequency audio features
- Aug.
- S. Chu, S. Narayanan, and C.-C. Kuo, "Environmental sound recognition with time-frequency audio features," IEEE Trans. Audio, Speech, Language Process., vol. 17, no. 6, pp. 1142-1158, Aug. 2009
- (2009) IEEE Trans. Audio, Speech, Language Process. , vol.17 , Issue.6 , pp. 1142-1158
- Chu, S.¹ Narayanan, S.² Kuo, C.-C.³

2
- 33749069115
- Audio analysis for surveillance applications
- New Paltz, NY, USA, Oct.
- R. Radhakrishnan, A. Divakaran, and P. Smaragdis, "Audio analysis for surveillance applications," in Proc. IEEE Workshop Appl. Signal Process. Audio Acoust., New Paltz, NY, USA, Oct. 2005, pp. 158-161
- (2005) Proc. IEEE Workshop Appl. Signal Process. Audio Acoust. , pp. 158-161
- Radhakrishnan, R.¹ Divakaran, A.² Smaragdis, P.³

3
- 84997327948
- The implementation of low-cost urban acoustic monitoring devices
- C. Mydlarz, J. Salamon, and J. P. Bello, "The implementation of low-cost urban acoustic monitoring devices," Appl. Acoust., vol. 117, pp. 207-218, 2016
- (2016) Appl. Acoust. , vol.117 , pp. 207-218
- Mydlarz, C.¹ Salamon, J.² Bello, J.P.³

4
- 84946032854
- Sound event detection in real life recordings using coupled matrix factorization of spectral representations and class activity annotations
- Brisbane, Australia, Apr.
- A. Mesaros, T. Heittola, O. Dikmen, and T. Virtanen, "Sound event detection in real life recordings using coupled matrix factorization of spectral representations and class activity annotations," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Brisbane, Australia, Apr. 2015, pp. 151-155
- (2015) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 151-155
- Mesaros, A.¹ Heittola, T.² Dikmen, O.³ Virtanen, T.⁴

5
- 84973400179
- Detection of overlapping acoustic events using a temporally-constrained probabilistic model
- Shanghai, China, Mar.
- E. Benetos, G. Lafay, M. Lagrange, and M. D. Plumbley, "Detection of overlapping acoustic events using a temporally-constrained probabilistic model," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Shanghai, China, Mar. 2016, pp. 6450-6454
- (2016) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 6450-6454
- Benetos, E.¹ Lafay, G.² Lagrange, M.³ Plumbley, M.D.⁴

6
- 84973279159
- Acoustic scene classification with matrix factorization for unsupervised feature learning
- Shanghai, China, Mar.
- V. Bisot, R. Serizel, S. Essid, and G. Richard, "Acoustic scene classification with matrix factorization for unsupervised feature learning," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Shanghai, China, Mar. 2016, pp. 6445-6449
- (2016) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 6445-6449
- Bisot, V.¹ Serizel, R.² Essid, S.³ Richard, G.⁴

7
- 84946051287
- Unsupervised feature learning for urban sound classification
- Brisbane, Australia, Apr.
- J. Salamon and J. P. Bello, "Unsupervised feature learning for urban sound classification," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Brisbane, Australia, Apr. 2015, pp. 171-175
- (2015) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 171-175
- Salamon, J.¹ Bello, J.P.²

8
- 84963983684
- Feature learning with deep scattering for urban sound analysis
- Nice, France, Aug.
- J. Salamon and J. P. Bello, "Feature learning with deep scattering for urban sound analysis," in Proc. 2015 23rd Eur. Signal Process. Conf., Nice, France, Aug. 2015, pp. 724-728
- (2015) Proc. 2015 23rd Eur. Signal Process. Conf. , pp. 724-728
- Salamon, J.¹ Bello, J.P.²

9
- 84963954566
- Improving event detection for audio surveillance using Gabor filterbank features
- Nice, France, Aug.
- J. T. Geiger and K. Helwani, "Improving event detection for audio surveillance using Gabor filterbank features," in Proc. 23rd Eur. Signal Process. Conf., Nice, France, Aug. 2015, pp. 714-718
- (2015) Proc. 23rd Eur. Signal Process. Conf. , pp. 714-718
- Geiger, J.T.¹ Helwani, K.²

10
- 84951103511
- Polyphonic sound event detection using multi label deep neural networks
- Jul.
- E. Cakir, T. Heittola, H. Huttunen, and T. Virtanen, "Polyphonic sound event detection using multi label deep neural networks," in Proc. 2015 Int. Joint Conf. Neural Netw., Jul. 2015, pp. 1-7
- (2015) Proc. 2015 Int. Joint Conf. Neural Netw. , pp. 1-7
- Cakir, E.¹ Heittola, T.² Huttunen, H.³ Virtanen, T.⁴

11
- 84960857918
- Environmental sound classification with convolutional neural networks
- Boston, MA, USA, Sep.
- K. J. Piczak, "Environmental sound classification with convolutional neural networks," in Proc. 25th Int. Workshop Mach. Learning Signal Process., Boston, MA, USA, Sep. 2015, pp. 1-6
- (2015) Proc. 25th Int. Workshop Mach. Learning Signal Process. , pp. 1-6
- Piczak, K.J.¹

12
- 84893548504
- Detection and classification of acoustic scenes and events: An IEEE AASP challenge
- New Paltz, NY, USA, Oct.
- D. Giannoulis, E. Benetos, D. Stowell, M. Rossignol, M. Lagrange, and M. D. Plumbley, "Detection and classification of acoustic scenes and events: An IEEE AASP challenge," in Proc. IEEE Workshop Appl. Signal Process. Audio Acoust., New Paltz, NY, USA, Oct. 2013, pp. 1-4
- (2013) Proc. IEEE Workshop Appl. Signal Process. Audio Acoust. , pp. 1-4
- Giannoulis, D.¹ Benetos, E.² Stowell, D.³ Rossignol, M.⁴ Lagrange, M.⁵ Plumbley, M.D.⁶

13
- 84960474809
- Detection and classification of acoustic scenes and events
- Oct.
- D. Stowell, D. Giannoulis, E. Benetos, M. Lagrange, and M. D. Plumbley, "Detection and classification of acoustic scenes and events," IEEE Trans. Multimedia, vol. 17, no. 10, pp. 1733-1746, Oct. 2015
- (2015) IEEE Trans. Multimedia , vol.17 , Issue.10 , pp. 1733-1746
- Stowell, D.¹ Giannoulis, D.² Benetos, E.³ Lagrange, M.⁴ Plumbley, M.D.⁵

14
- 84990862424
- Automatic environmental sound recognition: Performance versus computational cost
- Nov.
- S. Sigtia, A. Stark, S. Krstulovic, and M. Plumbley, "Automatic environmental sound recognition: Performance versus computational cost," IEEE/ACM Trans. Audio, Speech, Language Process., vol. 24, no. 11, pp. 2096-2107, Nov. 2016
- (2016) IEEE/ACM Trans. Audio, Speech, Language Process. , vol.24 , Issue.11 , pp. 2096-2107
- Sigtia, S.¹ Stark, A.² Krstulovic, S.³ Plumbley, M.⁴

15
- 0032203257
- Gradient-based learning applied to document recognition
- Nov.
- Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998
- (1998) Proc. IEEE , vol.86 , Issue.11 , pp. 2278-2324
- Lecun, Y.¹ Bottou, L.² Bengio, Y.³ Haffner, P.⁴

16
- 83455255740
- Spectral vs. Spectro-temporal features for acoustic event detection
- New Paltz, NY, USA, Oct.
- C. V. Cotton and D. P.W. Ellis, "Spectral vs. spectro-temporal features for acoustic event detection," in Proc. IEEE Workshop Appl. Signal Process. Audio Acoust., New Paltz, NY, USA, Oct. 2011, pp. 69-72
- (2011) Proc. IEEE Workshop Appl. Signal Process. Audio Acoust. , pp. 69-72
- Cotton, C.V.¹ Ellis, D.P.W.²

17
- 84913580340
- A dataset and taxonomy for urban sound research
- Multimedia, Orlando, FL, USA, Nov.
- J. Salamon, C. Jacoby, and J. P. Bello, "A dataset and taxonomy for urban sound research," in Proc. 22nd ACM Int. Conf. Multimedia, Orlando, FL, USA, Nov. 2014, pp. 1041-1044
- (2014) Proc. 22nd ACM Int. Conf. Multimedia , pp. 1041-1044
- Salamon, J.¹ Jacoby, C.² Bello, J.P.³

18
- 84962792164
- ESC: Dataset for environmental sound classification
- Brisbane, Australia, Oct.
- K. J. Piczak, "ESC: Dataset for environmental sound classification," in Proc. 23rd ACM Int. Conf. Multimedia, Brisbane, Australia, Oct. 2015, pp. 1015-1018
- (2015) Proc. 23rd ACM Int. Conf. Multimedia , pp. 1015-1018
- Piczak, K.J.¹

19
- 85043724952
- Accessed on: Aug. 10
- A. Mesaros, E. Fagerlund, A. Hiltunen, T. Heittola, and T. Virtanen, "TUT sound events 2016, development dataset," 2016. [Online]. Available: http://dx.doi.org/10.5281/zenodo.45759. Accessed on: Aug. 10, 2016
- (2016) TUT Sound Events 2016, Development Dataset , vol.2016
- Mesaros, A.¹ Fagerlund, E.² Hiltunen, A.³ Heittola, T.⁴ Virtanen, T.⁵

20
- 84876231242
- ImageNet classification with deep convolutional neural networks
- A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet classification with deep convolutional neural networks," in Proc. Adv. Neural Inform. Process. Syst., 2012, pp. 1097-1105
- (2012) Proc. Adv. Neural Inform. Process. Syst. , pp. 1097-1105
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.³

21
- 84945900998
- Best practices for convolutional neural networks applied to visual document analysis
- Edinburgh, U.K., Aug.
- P. Y. Simard, D. Steinkraus, and J. C. Platt, "Best practices for convolutional neural networks applied to visual document analysis," in Proc. Int. Conf. Document Anal. Recognit., Edinburgh, U.K., Aug. 2003, vol. 3, pp. 958-962
- (2003) Proc. Int. Conf. Document Anal. Recognit. , vol.3 , pp. 958-962
- Simard, P.Y.¹ Steinkraus, D.² Platt, J.C.³

22
- 84996516893
- A software framework for musical data augmentation
- Malaga, Spain, Oct.
- B. McFee, E. Humphrey, and J. Bello, "A software framework for musical data augmentation," in Proc. 16th Int. Soc. Music Inf. Retrieval Conf., Malaga, Spain, Oct. 2015, pp. 248-254
- (2015) Proc. 16th Int. Soc. Music Inf. Retrieval Conf. , pp. 248-254
- McFee, B.¹ Humphrey, E.² Bello, J.³

23
- 84973278436
- Recurrent neural networks for polyphonic sound event detection in real life recordings
- Shanghai, China, Mar.
- G. Parascandolo, H. Huttunen, and T. Virtanen, "Recurrent neural networks for polyphonic sound event detection in real life recordings," in Proc. Int. Conf. Acoust., Speech, Signal Process., Shanghai, China, Mar. 2016, pp. 6440-6444
- (2016) Proc. Int. Conf. Acoust., Speech, Signal Process. , pp. 6440-6444
- Parascandolo, G.¹ Huttunen, H.² Virtanen, T.³

24
- 85019537868
- ESSENTIA: An audio analysis library for music information retrieval
- Curitiba, Brazil, Nov.
- D. Bogdanov et al. "ESSENTIA: An audio analysis library for music information retrieval," in Proc. 14th Int. Soc. Music Inf. Retrieval Conf., Curitiba, Brazil, Nov. 2013, pp. 493-498
- (2013) Proc. 14th Int. Soc. Music Inf. Retrieval Conf. , pp. 493-498
- Bogdanov, D.¹

25
- 84904136037
- Large-scale machine learning with stochastic gradient descent
- Paris, France, Aug.
- L. Bottou, "Large-scale machine learning with stochastic gradient descent," in Proc. 19th Int. Conf. Comput. Statist., Paris, France, Aug. 2010, pp. 177-186. [Online]. Available: http://dx.doi.org/10.1007/978-3-7908-2604-3-16
- (2010) Proc. 19th Int. Conf. Comput. Statist. , pp. 177-186
- Bottou, L.¹

26
- 84904163933
- Dropout: A simple way to prevent neural networks from overfitting
- N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," J. Mach. Learning Res., vol. 15, no. 1, pp. 1929-1958, 2014
- (2014) J. Mach. Learning Res. , vol.15 , Issue.1 , pp. 1929-1958
- Srivastava, N.¹ Hinton, G.E.² Krizhevsky, A.³ Sutskever, I.⁴ Salakhutdinov, R.⁵

27
- 84973384984
- S. Dieleman et al. "Lasagne: First release," 2015. [Online]. Available: https://github.com/Lasagne/Lasagne
- (2015) Lasagne: First Release
- Dieleman, S.¹

28
- 85015182280
- B. McFee and E. J. Humphrey, "Pescador: 0.1.0," 2015. [Online]. Available: http://dx.doi.org/10.5281/zenodo.32468
- (2015) Pescador: 0.1.0
- McFee, B.¹ Humphrey, E.J.²

29
- 85015249132
- Dolby Laboratories, Inc
- Dolby Laboratories, Inc., "Standards and practices for authoring Dolby Digital and Dolby E bitstreams," 2002
- (2002) Standards and Practices for Authoring Dolby Digital and Dolby E Bitstreams

30
- 85015218858
- Accessed on: Aug. 12
- "Icecast streaming media server forum," [Online]. Available: http://icecast.imux.net/viewtopic.php?t=3462. Accessed on: Aug. 12, 2016
- (2016) Icecast Streaming Media Server Forum

31
- 85066080965
- JAMS: A JSON annotated music specification for reproducible MIR research
- Taipei, Taiwan, Oct.
- E. J. Humphrey, J. Salamon, O. Nieto, J. Forsyth, R. Bittner, and J. P. Bello, "JAMS: A JSON annotated music specification for reproducible MIR research," in Proc. 15th Int. Soc. Music Inf. Retrieval Conf., Taipei, Taiwan, Oct. 2014, pp. 591-596
- (2014) Proc. 15th Int. Soc. Music Inf. Retrieval Conf. , pp. 591-596
- Humphrey, E.J.¹ Salamon, J.² Nieto, O.³ Forsyth, J.⁴ Bittner, R.⁵ Bello, J.P.⁶

32
- 85015250879
- Pump up the JAMS: V0.2 and beyond
- New York University, New York, NY, USA, Oct. unpublished
- B. McFee et al., "Pump up the JAMS: V0.2 and beyond," Music and Audio Research Laboratory, New York University, New York, NY, USA, Oct. 2015, unpublished.
- (2015) Music and Audio Research Laboratory
- McFee, B.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.