SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume 2016-May, 2016, Pages 6440-6444

Recurrent neural networks for polyphonic sound event detection in real life recordings

(3) Parascandolo, Giambattista (a); Huttunen, Heikki (a); Virtanen, Tuomas (a)

(a) TAMPERE UNIVERSITY OF TECHNOLOGY (Finland)

Author keywords

bidirectional LSTM; deep learning; polyphonic sound event detection; Recurrent neural network

Indexed keywords

EID: 84973278436 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2016.7472917 Document Type: Conference Paper

Times cited : (309)

References (27)

1
- 33750550452
- Automatic surveillance of the acoustic activity in our living environment
- Aki Härmä, Martin F McKinney, and Janto Skowronek, "Automatic surveillance of the acoustic activity in our living environment," in IEEE International Conference on Multimedia and Expo (ICME), 2005.
- (2005) IEEE International Conference on Multimedia and Expo (ICME)
- Härmä, A.¹ McKinney, M.F.² Skowronek, J.³

2
- 68149163531
- Environmental sound recognition with time-frequency audio features
- Selina Chu, Shrikanth Narayanan, and CC Jay Kuo, "Environmental sound recognition with time-frequency audio features," IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 6, pp. 1142-1158, 2009.
- (2009) IEEE Transactions on Audio, Speech, and Language Processing , vol.17 , Issue.6 , pp. 1142-1158
- Chu, S.¹ Narayanan, S.² Jay Kuo, C.C.³

3
- 44249121489
- Audio keywords generation for sports video analysis
- Min Xu, Changsheng Xu, Lingyu Duan, Jesse S Jin, and Suhuai Luo, "Audio keywords generation for sports video analysis," ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), vol. 4, no. 2, pp. 11, 2008.
- (2008) ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) , vol.4 , Issue.2 , pp. 11
- Xu, M.¹ Xu, C.² Duan, L.³ Jin, J.S.⁴ Luo, S.⁵

4
- 84887056523
- Context-dependent sound event detection
- Toni Heittola, Annamaria Mesaros, Antti Eronen, and Tuomas Virtanen, "Context-dependent sound event detection," EURASIP Journal on Audio, Speech, and Music Processing, vol. 2013, no. 1, pp. 1-13, 2013.
- (2013) EURASIP Journal on Audio, Speech, and Music Processing , vol.2013 , Issue.1 , pp. 1-13
- Heittola, T.¹ Mesaros, A.² Eronen, A.³ Virtanen, T.⁴

5
- 79959754926
- Acoustic event detection in real life recordings
- Annamaria Mesaros, Toni Heittola, Antti Eronen, and Tuomas Virtanen, "Acoustic event detection in real life recordings," in 18th European Signal Processing Conference, 2010, pp. 1267-1271.
- (2010) 18th European Signal Processing Conference , pp. 1267-1271
- Mesaros, A.¹ Heittola, T.² Eronen, A.³ Virtanen, T.⁴

6
- 84890450206
- Supervised model training for overlapping sound events based on unsupervised source separation
- Toni Heittola, Annamaria Mesaros, Tuomas Virtanen, and Moncef Gabbouj, "Supervised model training for overlapping sound events based on unsupervised source separation," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013, pp. 8677-8681.
- (2013) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 8677-8681
- Heittola, T.¹ Mesaros, A.² Virtanen, T.³ Gabbouj, M.⁴

7
- 84865687106
- NMF-based environmental sound source separation using time-variant gain features
- Satoshi Innami and Hiroyuki Kasai, "NMF-based environmental sound source separation using time-variant gain features," Computers & Mathematics with Applications, vol. 64, no. 5, pp. 1333-1342, 2012.
- (2012) Computers & Mathematics with Applications , vol.64 , Issue.5 , pp. 1333-1342
- Innami, S.¹ Kasai, H.²

8
- 84880541040
- Realtime detection of overlapping sound events with non-negative matrix factorization
- Springer
- Arnaud Dessein, Arshia Cont, and Guillaume Lemaitre, "Realtime detection of overlapping sound events with non-negative matrix factorization," in Matrix Information Geometry, pp. 341-371. Springer, 2013.
- (2013) Matrix Information Geometry , pp. 341-371
- Dessein, A.¹ Cont, A.² Lemaitre, G.³

9
- 84893521081
- Sound event detection using non-negative dictionaries learned from annotated overlapping events
- Onur Dikmen and Annamaria Mesaros, "Sound event detection using non-negative dictionaries learned from annotated overlapping events," in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2013.
- (2013) IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
- Dikmen, O.¹ Mesaros, A.²

10
- 84946032854
- Sound event detection in real life recordings using coupled matrix factorization of spectral representations and class activity annotations
- Annamaria Mesaros, Toni Heittola, Onur Dikmen, and Tuomas Virtanen, "Sound event detection in real life recordings using coupled matrix factorization of spectral representations and class activity annotations," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015, pp. 606-618.
- (2015) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 606-618
- Mesaros, A.¹ Heittola, T.² Dikmen, O.³ Virtanen, T.⁴

11
- 84876157594
- Overlapping sound event recognition using local spectrogram features and the generalised hough transform
- Jonathan Dennis, Huy Dat Tran, and Eng Siong Chng, "Overlapping sound event recognition using local spectrogram features and the generalised hough transform," Pattern Recognition Letters, vol. 34, no. 9, pp. 1085-1093, 2013.
- (2013) Pattern Recognition Letters , vol.34 , Issue.9 , pp. 1085-1093
- Dennis, J.¹ Dat Tran, H.² Siong Chng, E.³

12
- 84951103511
- Polyphonic sound event detection using multi label deep neural networks
- Emre Cakir, Toni Heittola, Heikki Huttunen, and Tuomas Virtanen, "Polyphonic sound event detection using multi label deep neural networks," in IEEE International Joint Conference on Neural Networks (IJCNN), 2015.
- (2015) IEEE International Joint Conference on Neural Networks (IJCNN)
- Cakir, E.¹ Heittola, T.² Huttunen, H.³ Virtanen, T.⁴

13
- 0031573117
- Long short-term memory
- Sepp Hochreiter and Jürgen Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

14
- 27744588611
- Framewise phoneme classification with bidirectional LSTM and other neural network architectures
- Alex Graves and Jürgen Schmidhuber, "Framewise phoneme classification with bidirectional LSTM and other neural network architectures," Neural Networks, vol. 18, no. 5, pp. 602-610, 2005.
- (2005) Neural Networks , vol.18 , Issue.5 , pp. 602-610
- Graves, A.¹ Schmidhuber, J.²

15
- 84890543083
- Speech recognition with deep recurrent neural networks
- Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton, "Speech recognition with deep recurrent neural networks," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013, pp. 6645-6649.
- (2013) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 6645-6649
- Graves, A.¹ Mohamed, A.² Hinton, G.³

16
- 84863771261
- Universal onset detection with bidirectional long short-term memory neural networks
- Florian Eyben, Sebastian Böck, Björn Schuller, and Alex Graves, "Universal onset detection with bidirectional long short-term memory neural networks," in International Society for Music Information Retrieval Conference (ISMIR), 2010, pp. 589-594.
- (2010) International Society for Music Information Retrieval Conference (ISMIR) , pp. 589-594
- Eyben, F.¹ Böck, S.² Schuller, B.³ Graves, A.⁴

17
- 84867593805
- Polyphonic piano note transcription with recurrent neural networks
- Sebastian Böck and Markus Schedl, "Polyphonic piano note transcription with recurrent neural networks," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, pp. 121-124.
- (2012) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 121-124
- Böck, S.¹ Schedl, M.²

18
- 0031268931
- Bidirectional recurrent neural networks
- Mike Schuster and Kuldip K Paliwal, "Bidirectional recurrent neural networks," IEEE Transactions on Signal Processing, vol. 45, no. 11, pp. 2673-2681, 1997.
- (1997) IEEE Transactions on Signal Processing , vol.45 , Issue.11 , pp. 2673-2681
- Schuster, M.¹ Paliwal, K.K.²

19
- 0028392483
- Learning long-term dependencies with gradient descent is difficult
- Yoshua Bengio, Patrice Simard, and Paolo Frasconi, "Learning long-term dependencies with gradient descent is difficult," IEEE Transactions on Neural Networks, vol. 5, no. 2, pp. 157-166, 1994.
- (1994) IEEE Transactions on Neural Networks , vol.5 , Issue.2 , pp. 157-166
- Bengio, Y.¹ Simard, P.² Frasconi, P.³

20
- 64849110608
- A novel connectionist system for unconstrained handwriting recognition
- Alex Graves, Marcus Liwicki, Santiago Fernández, Roman Bertolami, Horst Bunke, and Jürgen Schmidhuber, "A novel connectionist system for unconstrained handwriting recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 5, pp. 855-868, 2009.
- (2009) IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.31 , Issue.5 , pp. 855-868
- Graves, A.¹ Liwicki, M.² Fernández, S.³ Bertolami, R.⁴ Bunke, H.⁵ Schmidhuber, J.⁶

21
- 0024753593
- Speech recognition using noise-adaptive prototypes
- Arthur Nádas, David Nahamoo, Michael Picheny, et al., "Speech recognition using noise-adaptive prototypes," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 37, no. 10, pp. 1495-1503, 1989.
- (1989) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.37 , Issue.10 , pp. 1495-1503
- Nádas, A.¹ Nahamoo, D.² Picheny, M.³

22
- 84973376414
- Exploring data augmentation for improved singing voice detection with neural networks
- Jan Schlüter and Thomas Grill, "Exploring data augmentation for improved singing voice detection with neural networks," in International Society for Music Information Retrieval Conference (ISMIR), 2015.
- (2015) International Society for Music Information Retrieval Conference (ISMIR)
- Schlüter, J.¹ Grill, T.²

23
- 84996516893
- A software framework for musical data augmentation
- Brian McFee, Eric J. Humphrey, and Juan P. Bello, "A software framework for musical data augmentation," in International Society for Music Information Retrieval Conference (ISMIR), 2015.
- (2015) International Society for Music Information Retrieval Conference (ISMIR)
- McFee, B.¹ Humphrey, E.J.² Bello, J.P.³

24
- 84863806337
- Audio context recognition using audio event histograms
- Toni Heittola, Annamaria Mesaros, Antti Eronen, and Tuomas Virtanen, "Audio context recognition using audio event histograms," in Proc. of the 18th European Signal Processing Conference (EUSIPCO), 2010, pp. 1272-1276.
- (2010) Proc. of the 18th European Signal Processing Conference (EUSIPCO) , pp. 1272-1276
- Heittola, T.¹ Mesaros, A.² Eronen, A.³ Virtanen, T.⁴

25
- 0025503558
- Backpropagation through time: What it does and how to do it
- Paul J Werbos, "Backpropagation through time: what it does and how to do it," Proceedings of the IEEE, vol. 78, no. 10, pp. 1550-1560, 1990.
- (1990) Proceedings of the IEEE , vol.78 , Issue.10 , pp. 1550-1560
- Werbos, P.J.¹

26
- 84893343292
- Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude
- Tijmen Tieleman and Geoffrey Hinton, "Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude," COURSERA: Neural Networks for Machine Learning, vol. 4, 2012.
- (2012) COURSERA: Neural Networks for Machine Learning , vol.4
- Tieleman, T.¹ Hinton, G.²

27
- 84930639546
- Introducing currennt: The Munich opensource CUDA recurrent neural network toolkit
- Felix Weninger, "Introducing currennt: The Munich opensource CUDA recurrent neural network toolkit," Journal of Machine Learning Research, vol. 16, pp. 547-551, 2015.
- (2015) Journal of Machine Learning Research , vol.16 , pp. 547-551
- Weninger, F.¹

* This information was extracted by KISTI through its analysis of Elsevier's SCOPUS database.