Volume 25, Issue 1, 2017, Pages 208-221

Deep Convolutional Neural Networks for Predominant Instrument Recognition in Polyphonic Music

Author keywords

Convolutional neural networks; deep learning; instrument recognition; multi layer neural network; music information retrieval

Indexed keywords

ACTIVATION ANALYSIS; AUDIO RECORDINGS; CHEMICAL ACTIVATION; CONVOLUTION; FACTOR ANALYSIS; INFORMATION RETRIEVAL; NETWORK ARCHITECTURE; NETWORK LAYERS; NEURAL NETWORKS; SIGNAL ANALYSIS; SOURCE SEPARATION

EID: 85007475072     PISSN: 23299290     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASLP.2016.2632307     Document Type: Article
Times cited : (210)

References (52)
  • 1
    • 0033690881
    • Musical instrument recognition using cepstral coefficients and temporal features
    • A. Eronen and A. Klapuri, "Musical instrument recognition using cepstral coefficients and temporal features," in Proc. 2000 IEEE Int. Conf. Acoust., Speech Signal Process., 2000, vol. 2, pp. II753-II756.
    • (2000) Proc. 2000 IEEE Int. Conf. Acoust., Speech Signal Process , vol.2 , pp. II753-II756
    • Eronen, A.1    Klapuri, A.2
  • 6
    • 84873616077
    • Musical instrument recognition in polyphonic audio using source-filter model for sound separation
    • T. Heittola, A. Klapuri, and T. Virtanen, "Musical instrument recognition in polyphonic audio using source-filter model for sound separation," in Proc. Int. Soc. Music Inf. Retrieval Conf., 2009, pp. 327-332.
    • (2009) Proc. Int. Soc. Music Inf. Retrieval Conf , pp. 327-332
    • Heittola, T.1    Klapuri, A.2    Virtanen, T.3
  • 7
    • 33846220762
    • Instrument identification in polyphonic music: Feature weighting to minimize influence of sound overlaps
    • T. Kitahara, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "Instrument identification in polyphonic music: Feature weighting to minimize influence of sound overlaps," EURASIP J. Appl. Signal Process., vol. 2007, no. 1, pp. 155-155, 2007.
    • (2007) EURASIP J. Appl. Signal Process. , vol.2007 , Issue.1 , pp. 155
    • Kitahara, T.1    Goto, M.2    Komatani, K.3    Ogata, T.4    Okuno, H.G.5
  • 8
    • 84905230347
    • A novel cepstral representation for timbre modeling of sound sources in polyphonic mixtures
    • Z. Duan, B. Pardo, and L. Daudet, "A novel cepstral representation for timbre modeling of sound sources in polyphonic mixtures," in Proc. 2014 IEEE Int. Conf. Acoust., Speech Signal Process., 2014, pp. 7495-7499.
    • (2014) Proc. 2014 IEEE Int. Conf. Acoust., Speech Signal Process , pp. 7495-7499
    • Duan, Z.1    Pardo, B.2    Daudet, L.3
  • 10
    • 84930630277
    • Deep learning
    • Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436-444, 2015.
    • (2015) Nature , vol.521 , Issue.7553 , pp. 436-444
    • LeCun, Y.1    Bengio, Y.2    Hinton, G.3
  • 11
    • 84903724014
    • Deep learning: Methods and applications
    • L. Deng and D. Yu, "Deep learning: Methods and applications," Found. Trends Signal Process., vol. 7, no. 3-4, pp. 197-387, 2014.
    • (2014) Found. Trends Signal Process. , vol.7 , Issue.3-4 , pp. 197-387
    • Deng, L.1    Yu, D.2
  • 13
    • 84906237242
    • Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding
    • G. Mesnil, X. He, L. Deng, and Y. Bengio, "Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding," in Proc. Annu. Conf. Int. Speech Commun. Assoc., 2013, pp. 3771-3775.
    • (2013) Proc. Annu. Conf. Int. Speech Commun. Assoc , pp. 3771-3775
    • Mesnil, G.1    He, X.2    Deng, L.3    Bengio, Y.4
  • 15
    • 0002859310
    • Learning algorithms for classification: A comparison on handwritten digit recognition
    • Y. LeCun et al., "Learning algorithms for classification: A comparison on handwritten digit recognition," Neural Netw. : Stat. Mech. Perspect., vol. 261, pp. 261-276, 1995.
    • (1995) Neural Netw. : Stat. Mech. Perspect. , vol.261 , pp. 261-276
    • LeCun, Y.1
  • 16
    • 40849143505
    • Handwritten digit recognition using convolutional neural networks and Gabor filters
    • A. Calderón, S. Roa, and J. Victorino, "Handwritten digit recognition using convolutional neural networks and Gabor filters," in Proc. Int. Congr. Comput. Intell, 2003.
    • (2003) Proc. Int. Congr. Comput. Intell
    • Calderón, A.1    Roa, S.2    Victorino, J.3
  • 17
    • 83655163714
    • A novel hybrid CNN-SVM classifier for recognizing handwritten digits
    • X.-X. Niu and C. Y. Suen, "A novel hybrid CNN-SVM classifier for recognizing handwritten digits," Pattern Recog., vol. 45, no. 4, pp. 1318-1325, 2012.
    • (2012) Pattern Recog. , vol.45 , Issue.4 , pp. 1318-1325
    • Niu, X.-X.1    Suen, C.Y.2
  • 20
    • 85032751458
    • Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
    • G. Hinton et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Process. Mag., vol. 29, no. 6, pp. 82-97, Nov. 2012.
    • (2012) IEEE Signal Process. Mag. , vol.29 , Issue.6 , pp. 82-97
    • Hinton, G.1
  • 21
    • 84864146684
    • Temporal pooling and multiscale learning for automatic annotation and ranking of music audio
    • P. Hamel, S. Lemieux, Y. Bengio, and D. Eck, "Temporal pooling and multiscale learning for automatic annotation and ranking of music audio," in Proc. Int. Soc. Music Inf. Retrieval Conf., 2011, pp. 729-734.
    • (2011) Proc. Int. Soc. Music Inf. Retrieval Conf , pp. 729-734
    • Hamel, P.1    Lemieux, S.2    Bengio, Y.3    Eck, D.4
  • 23
    • 84873577775
    • Rethinking automatic chord recognition with convolutional neural networks
    • E. J. Humphrey and J. P. Bello, "Rethinking automatic chord recognition with convolutional neural networks," in Proc. 2012 11th Int. Conf. Mach. Learn. Appl., 2012, vol. 2, pp. 357-362.
    • (2012) Proc. 2012 11th Int. Conf. Mach. Learn. Appl , vol.2 , pp. 357-362
    • Humphrey, E.J.1    Bello, J.P.2
  • 25
    • 85007420949
    • Boundary detection in music structure analysis using convolutional neural networks
    • K. Ullrich, J. Schlüter, and T. Grill, "Boundary detection in music structure analysis using convolutional neural networks," in Proc. Int. Soc. Music Inf. Retrieval Conf., 2014, pp. 417-422.
    • (2014) Proc. Int. Soc. Music Inf. Retrieval Conf , pp. 417-422
    • Ullrich, K.1    Schlüter, J.2    Grill, T.3
  • 26
    • 85009059267
    • Music boundary detection using neural networks on combined features and two-level annotations
    • T. Grill and J. Schlüter, "Music boundary detection using neural networks on combined features and two-level annotations," in Proc. 16th Int. Soc. Music Inf. Retr. Conf., Malaga, Spain, 2015.
    • (2015) Proc. 16th Int. Soc. Music Inf. Retr. Conf
    • Grill, T.1    Schlüter, J.2
  • 33
    • 84906489074
    • Visualizing and understanding convolutional networks
    • M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in Proc. Eur. Conf. Comput. Vis., 2014, pp. 818-833.
    • (2014) Proc. Eur. Conf. Comput. Vis , pp. 818-833
    • Zeiler, M.D.1    Fergus, R.2
  • 38
    • 84992462621
    • 2D fake fingerprint detection based on improved CNN and local descriptors for smart phone
    • Y. Zhang, B. Zhou, H. Wu, and C. Wen, "2D fake fingerprint detection based on improved CNN and local descriptors for smart phone," in Proc. Chin. Conf. Biometric Recog., 2016, pp. 655-662.
    • (2016) Proc. Chin. Conf. Biometric Recog , pp. 655-662
    • Zhang, Y.1    Zhou, B.2    Wu, H.3    Wen, C.4
  • 39
    • 84862277874
    • Understanding the difficulty of training deep feedforward neural networks
    • X. Glorot and Y. Bengio, "Understanding the difficulty of training deep feedforward neural networks," in Proc. Int. Conf. Artif. Intell. Stat., 2010, pp. 249-256.
    • (2010) Proc. Int. Conf. Artif. Intell. Stat , pp. 249-256
    • Glorot, X.1    Bengio, Y.2
  • 41
    • 77956509090
    • Rectified linear units improve restricted Boltzmann machines
    • V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proc. 27th Int. Conf. Mach. Learn., 2010, pp. 807-814.
    • (2010) Proc. 27th Int. Conf. Mach. Learn , pp. 807-814
    • Nair, V.1    Hinton, G.E.2
  • 42
    • 84893676344
    • Rectifier nonlinearities improve neural network acoustic models
    • A. L. Maas, A. Y. Hannun, and A. Y. Ng, "Rectifier nonlinearities improve neural network acoustic models," in Proc. Int. Conf. Mach. Learn., 2013, vol. 30, p. 1.
    • (2013) Proc. Int. Conf. Mach. Learn , vol.30 , pp. 1
    • Maas, A.L.1    Hannun, A.Y.2    Ng, A.Y.3
  • 43
    • 84973911419
    • Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification
    • K. He, X. Zhang, S. Ren, and J. Sun, "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification," in Proc. IEEE Int. Conf. Comput. Vis., 2015, pp. 1026-1034.
    • (2015) Proc. IEEE Int. Conf. Comput. Vis , pp. 1026-1034
    • He, K.1    Zhang, X.2    Ren, S.3    Sun, J.4
  • 45
    • 84873436256
    • A comparison of sound segregation techniques for predominant instrument recognition in musical audio signals
    • J. J. Bosch, J. Janer, F. Fuhrmann, and P. Herrera, "A comparison of sound segregation techniques for predominant instrument recognition in musical audio signals," in Proc. Int. Soc. Music Inf. Retrieval Conf., 2012, pp. 559-564.
    • (2012) Proc. Int. Soc. Music Inf. Retrieval Conf , pp. 559-564
    • Bosch, J.J.1    Janer, J.2    Fuhrmann, F.3    Herrera, P.4
  • 46
    • 85139270516
    • Polyphonic instrument recognition for exploring semantic similarities in music
    • F. Fuhrmann and P. Herrera, "Polyphonic instrument recognition for exploring semantic similarities in music," in Proc. 13th Int. Conf. Digit. Audio Effects, 2010, pp. 1-8.
    • (2010) Proc. 13th Int. Conf. Digit. Audio Effects , pp. 1-8
    • Fuhrmann, F.1    Herrera, P.2
  • 47
    • 85054435084
    • Neural network ensembles, cross validation, and active learning
    • A. Krogh et al., "Neural network ensembles, cross validation, and active learning," Adv. Neural Inf. Process. Syst., vol. 7, pp. 231-238, 1995.
    • (1995) Adv. Neural Inf. Process. Syst. , vol.7 , pp. 231-238
    • Krogh, A.1
  • 48
    • 77949581963
    • Harmonic and percussive sound separation and its application to MIR-related tasks
    • N. Ono et al., "Harmonic and percussive sound separation and its application to MIR-related tasks," in Advances in Music Information Retrieval. Berlin, Germany: Springer, 2010, pp. 213-236.
    • (2010) Advances in Music Information Retrieval , pp. 213-236
    • Ono, N.1
  • 49
    • 70350477898
    • Music onset detection combining energy-based and pitch-based approaches
    • R. Zhou and J. D. Reiss, "Music onset detection combining energy-based and pitch-based approaches," in Proc. MIREX Audio Onset Detect. Contest, 2007.
    • (2007) Proc. MIREX Audio Onset Detect. Contest
    • Zhou, R.1    Reiss, J.D.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.