SCOPUS Information Retrieval Platform

IEEE Journal on Selected Topics in Signal Processing

Volume 5, Issue 6, 2011, Pages 1159-1169

Transcribing multi-instrument polyphonic music with hierarchical eigeninstruments

(2) Grindlay, Graham ᵃ; Ellis, Daniel P. W. ᵃ

ᵃ Columbia University * (United States)

Author keywords

Eigeninstruments; Music; Non-negative matrix factorization (NMF); Polyphonic transcription; Subspace

Indexed keywords

EIGENINSTRUMENTS; LEVELS OF ABSTRACTION; LINEAR MANIFOLD; MODEL PARAMETERS; MUSIC; MUSIC RECORDING; NON-NEGATIVE MATRIX FACTORIZATION (NMF); NON-NEGATIVE MATRIX FACTORIZATION ALGORITHMS; POLYPHONIC MUSIC; PRIOR KNOWLEDGE; PROBABILISTIC MODELS; SINGLE-CHANNEL; SUBSPACE;

BLIND SOURCE SEPARATION; FACTORIZATION; MATRIX ALGEBRA; MIXTURES; TRANSCRIPTION;

INSTRUMENTS;

EID: 80053031254 PISSN: 19324553 EISSN: None Source Type: Journal
DOI: 10.1109/JSTSP.2011.2162395 Document Type: Article

Times cited : (59)

References (44)

1
- 0028561099
- Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values
- P. Paatero and U. Tapper, "Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values," Environmetrics, vol. 5, no. 2, pp. 111-126, 1994.
- (1994) Environmetrics , vol.5 , Issue.2 , pp. 111-126
- Paatero, P.¹ Tapper, U.²

2
- 0033592606
- Learning the parts of objects by nonnegative matrix factorization
- D. D. Lee and H. S. Seung, "Learning the parts of objects by nonnegative matrix factorization," Nature, vol. 401, no. 6755, pp. 788-791, 1999.
- (1999) Nature , vol.401 , Issue.6755 , pp. 788-791
- Lee, D.D.¹ Seung, H.S.²

3
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc., vol. 39, no. 1, pp. 1-38, 1977.
- (1977) J. R. Statist. Soc. , vol.39 , Issue.1 , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

4
- 84898964201
- Algorithms for non-negative matrix factorization
- D. D. Lee and H. S. Seung, "Algorithms for non-negative matrix factorization," in Proc. Adv. Neural Inf. Process. Syst., 2001, pp. 556-562.
- (2001) Proc. Adv. Neural Inf. Process. Syst. , pp. 556-562
- Lee, D.D.¹ Seung, H.S.²

5
- 84900510076
- Non-negative matrix factorization with sparseness constraints
- P. O. Hoyer, "Non-negative matrix factorization with sparseness constraints," J. Mach. Learn. Res., vol. 5, pp. 1457-1469, 2004.
- (2004) J. Mach. Learn. Res. , vol.5 , pp. 1457-1469
- Hoyer, P.O.¹

6
- 10944227316
- Sparse coding and NMF
- 2004 IEEE International Joint Conference on Neural Networks - Proceedings
- J. Eggert and E. Körner, "Sparse coding and NMF," in Proc. IEEE Int. Joint Conf. Neural Netw., 2004, vol. 4, pp. 2529-2533. (Pubitemid 40011434)
- (2004) IEEE International Conference on Neural Networks - Conference Proceedings , vol.4 , pp. 2529-2533
- Eggert, J.¹ Korner, E.²

7
- 47649133016
- Probabilistic latent variable models as non-negative factorizations
- Article ID 947438
- M. Shashanka, B. Raj, and P. Smaragdis, "Probabilistic latent variable models as non-negative factorizations," Comput. Intell. Neurosci., vol. 2008, 2008, Article ID 947438.
- (2008) Comput. Intell. Neurosci. , vol.2008
- Shashanka, M.¹ Raj, B.² Smaragdis, P.³

8
- 50249152311
- Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria
- T. Virtanen, "Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 3, pp. 1066-1074, Mar. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.3 , pp. 1066-1074
- Virtanen, T.¹

9
- 51449111646
- Bayesian extensions to nonnegative matrix factorization for audio signal modeling
- T. Virtanen, A. T. Cemgil, and S. Godsill, "Bayesian extensions to nonnegative matrix factorization for audio signal modeling," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2008, pp. 1825-1828.
- (2008) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 1825-1828
- Virtanen, T.¹ Cemgil, A.T.² Godsill, S.³

10
- 63249085556
- Nonnegative matrix factorization with the Itakura-Saito divergence. With application to music analysis
- C. Févotte, N. Bertin, and J. L. Durrieu, "Nonnegative matrix factorization with the Itakura-Saito divergence. With application to music analysis," Neural Comput., vol. 21, no. 3, pp. 793-830, 2009.
- (2009) Neural Comput. , vol.21 , Issue.3 , pp. 793-830
- Févotte, C.¹ Bertin, N.² Durrieu, J.L.³

11
- 44949110218
- Single-channel speech separation using sparse non-negative matrix factorization
- M. N. Schmidt and R. K. Olsson, "Single-channel speech separation using sparse non-negative matrix factorization," in Proc. Int. Conf. Spoken Lang. Process., 2006.
- (2006) Proc. Int. Conf. Spoken Lang. Process.
- Schmidt, M.N.¹ Olsson, R.K.²

12
- 84858719009
- A sparse non-parametric approach for single channel separation of known sounds
- P. Smaragdis, M. Shashanka, and B. Raj, "A sparse non-parametric approach for single channel separation of known sounds," in Proc. Adv. Neural Inf. Process. Syst., 2009, pp. 1705-1713.
- (2009) Proc. Adv. Neural Inf. Process. Syst. , pp. 1705-1713
- Smaragdis, P.¹ Shashanka, M.² Raj, B.³

13
- 84945116938
- Non-negative matrix factorization for polyphonic music transcription
- P. Smaragdis and J. C. Brown, "Non-negative matrix factorization for polyphonic music transcription," in Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust., 2003, pp. 177-180.
- (2003) Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust. , pp. 177-180
- Smaragdis, P.¹ Brown, J.C.²

14
- 51449099910
- Harmonic and inharmonic non-negative matrix factorization for polyphonic pitch transcription
- E. Vincent, N. Bertin, and R. Badeau, "Harmonic and inharmonic non-negative matrix factorization for polyphonic pitch transcription," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2008, pp. 109-112.
- (2008) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 109-112
- Vincent, E.¹ Bertin, N.² Badeau, R.³

15
- 77950138969
- Multi-voice polyphonic music transcription using eigeninstruments
- G. Grindlay and D. P. W. Ellis, "Multi-voice polyphonic music transcription using eigeninstruments," in Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust., 2009, pp. 53-56.
- (2009) Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust. , pp. 53-56
- Grindlay, G.¹ Ellis, D.P.W.²

16
- 80053048831
- A probabilistic subspace model for polyphonic music transcription
- G. Grindlay and D. P. W. Ellis, "A probabilistic subspace model for polyphonic music transcription," in Int. Conf. Music Inf. Retrieval, 2010, pp. 21-26.
- (2010) Int. Conf. Music Inf. Retrieval , pp. 21-26
- Grindlay, G.¹ Ellis, D.P.W.²

17
- 80555136186
- Automatic relevance determination in nonnegative matrix factorization
- V. Y. F. Tan and C. Févotte, "Automatic relevance determination in nonnegative matrix factorization," in Proc. Workshop Signal Process. with Adaptive Sparse Structured Represent., 2009.
- (2009) Proc. Workshop Signal Process. with Adaptive Sparse Structured Represent.
- Tan, V.Y.F.¹ Févotte, C.²

18
- 84863690059
- Separation of drums from polyphonic music using non-negative matrix factorization and support vector machine
- M. Helén and T. Virtanen, "Separation of drums from polyphonic music using non-negative matrix factorization and support vector machine," in Proc. Eur. Signal Process. Conf., 2005.
- (2005) Proc. Eur. Signal Process. Conf.
- Helén, M.¹ Virtanen, T.²

19
- 80052993673
- Monophonic instrument sound segregation by clustering NMF components based on basis similarity and gain disjointness
- K. Murao, M. Nakano, Y. Kitano, N. Ono, and S. Sagayama, "Monophonic instrument sound segregation by clustering NMF components based on basis similarity and gain disjointness," in Proc. Int. Soc. Music Inf. Retrieval Conf., 2010, pp. 375-380.
- (2010) Proc. Int. Soc. Music Inf. Retrieval Conf. , pp. 375-380
- Murao, K.¹ Nakano, M.² Kitano, Y.³ Ono, N.⁴ Sagayama, S.⁵

20
- 77952407197
- Analysis of polyphonic audio using source-filter model and non-negative matrix factorization
- T. Virtanen and A. Klapuri, "Analysis of polyphonic audio using source-filter model and non-negative matrix factorization," in Proc. Adv. Neural Inf. Process. Syst., 2006.
- (2006) Proc. Adv. Neural Inf. Process. Syst.
- Virtanen, T.¹ Klapuri, A.²

21
- 84873616077
- Musical instrument recognition in polyphonic audio using source-filter model for sound separation
- T. Heittola, A. Klapuri, and T. Virtanen, "Musical instrument recognition in polyphonic audio using source-filter model for sound separation," in Proc. Int. Conf. Music Inf. Retrieval, 2009, pp. 327-332.
- (2009) Proc. Int. Conf. Music Inf. Retrieval , pp. 327-332
- Heittola, T.¹ Klapuri, A.² Virtanen, T.³

22
- 76949108729
- Adaptive harmonic spectral decomposition for multiple pitch estimation
- E. Vincent, N. Bertin, and R. Badeau, "Adaptive harmonic spectral decomposition for multiple pitch estimation," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 3, pp. 528-537, Mar. 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.3 , pp. 528-537
- Vincent, E.¹ Bertin, N.² Badeau, R.³

23
- 76949083547
- Enforcing harmonicity and smoothness in Bayesian non-negative matrix factorization applied to polyphonic music transcription
- N. Bertin, R. Badeau, and E. Vincent, "Enforcing harmonicity and smoothness in Bayesian non-negative matrix factorization applied to polyphonic music transcription," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 3, pp. 538-549, Mar. 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.3 , pp. 538-549
- Bertin, N.¹ Badeau, R.² Vincent, E.³

24
- 0036214787
- YIN, a fundamental frequency estimator for speech and music
- DOI 10.1121/1.1458024
- A. de Cheveigné and H. Kawahara, "YIN, a fundamental frequency estimator for speech and music," J. Acoust. Soc. Amer., vol. 111, no. 4, pp. 1917-1930, 2002. (Pubitemid 34297247)
- (2002) Journal of the Acoustical Society of America , vol.111 , Issue.4 , pp. 1917-1930
- De Cheveigne, A.¹ Kawahara, H.²

25
- 0347337997
- Multiple fundamental frequency estimation based on harmonicity and spectral smoothness
- A. Klapuri, "Multiple fundamental frequency estimation based on harmonicity and spectral smoothness," IEEE Trans. Speech Audio Process., vol. 11, no. 6, pp. 804-816, Nov. 2003.
- (2003) IEEE Trans. Speech Audio Process. , vol.11 , Issue.6 , pp. 804-816
- Klapuri, A.¹

26
- 30844456955
- Polyphonic music transcription by non-negative sparse coding of power spectra
- S. A. Abdallah and M. D. Plumbley, "Polyphonic music transcription by non-negative sparse coding of power spectra," in Proc. Int. Conf. Music Inf. Retrieval, 2004, pp. 318-325.
- (2004) Proc. Int. Conf. Music Inf. Retrieval , pp. 318-325
- Abdallah, S.A.¹ Plumbley, M.D.²

27
- 4644242508
- A real-time music-scene-description system: Predominant-F0 estimation for detecting melody and bass lines in real-world audio signals
- M. Goto, "A real-time music-scene-description system: Predominant-F0 estimation for detecting melody and bass lines in real-world audio signals," Speech Commun., vol. 43, no. 4, pp. 311-329, 2004.
- (2004) Speech Commun. , vol.43 , Issue.4 , pp. 311-329
- Goto, M.¹

28
- 33846199251
- A discriminative model for polyphonic piano transcription
- Article ID 48317
- G. Poliner and D. P. W. Ellis, "A discriminative model for polyphonic piano transcription," EURASIP J. Adv. Signal Process., 2007, Article ID 48317.
- (2007) EURASIP J. Adv. Signal Process.
- Poliner, G.¹ Ellis, D.P.W.²

29
- 0003182324
- Organization of hierarchical perceptual sounds: Music scene analysis with autonomous processing modules and a quantitative information integration mechanism
- K. Kashino, K. Nakadai, T. Kinoshita, and H. Tanaka, "Organization of hierarchical perceptual sounds: Music scene analysis with autonomous processing modules and a quantitative information integration mechanism," in Proc. Int. Joint Conf. Artif. Intell., 1995, pp. 158-164.
- (1995) Proc. Int. Joint Conf. Artif. Intell. , pp. 158-164
- Kashino, K.¹ Nakadai, K.² Kinoshita, T.³ Tanaka, H.⁴

30
- 35048886535
- Music transcription with ISA and HMM
- E. Vincent and X. Rodet, "Music transcription with ISA and HMM," in Proc. Int. Symp. Ind. Compon. Anal. Blind Signal Separat., 2004, pp. 1197-1204. (Pubitemid 39751157)
- (2004) Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) , vol.3195 , pp. 1197-1204
- Vincent, E.¹ Rodet, X.²

31
- 57849142765
- Instrument-specific harmonic atoms for mid-level music representation
- P. Leveau, E. Vincent, G. Richard, and L. Daudet, "Instrument-specific harmonic atoms for mid-level music representation," IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 1, pp. 116-128, Jan. 2008.
- (2008) IEEE Trans. Audio, Speech, Lang. Process. , vol.16 , Issue.1 , pp. 116-128
- Leveau, P.¹ Vincent, E.² Richard, G.³ Daudet, L.⁴

32
- 50249173884
- A multipitch analyzer based on harmonic temporal structured clustering
- H. Kameoka, T. Nishimoto, and S. Sagayama, "A multipitch analyzer based on harmonic temporal structured clustering," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 3, pp. 982-994, Mar. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.3 , pp. 982-994
- Kameoka, H.¹ Nishimoto, T.² Sagayama, S.³

33
- 77955816771
- Multiple-F0 tracking based on a high-order HMM model
- W. C. Chang, W. Y. Su, C. Yeh, A. Roebel, and X. Rodet, "Multiple-F0 tracking based on a high-order HMM model," in Proc. Int. Conf. Digital Audio Effects, 2008, pp. 379-386.
- (2008) Proc. Int. Conf. Digital Audio Effects , pp. 379-386
- Chang, W.C.¹ Su, W.Y.² Yeh, C.³ Roebel, A.⁴ Rodet, X.⁵

34
- 80052983176
- Harmonically informed multi-pitch tracking
- Z. Duan, J. Han, and B. Pardo, "Harmonically informed multi-pitch tracking," in Proc. Int. Soc. Music Inf. Retrieval Conf., 2009, pp. 333-338.
- (2009) Proc. Int. Soc. Music Inf. Retrieval Conf. , pp. 333-338
- Duan, Z.¹ Han, J.² Pardo, B.³

35
- 78049397081
- Song-level multi-pitch tracking by heavily constrained clustering
- Z. Duan, J. Han, and B. Pardo, "Song-level multi-pitch tracking by heavily constrained clustering," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2010, pp. 57-60.
- (2010) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 57-60
- Duan, Z.¹ Han, J.² Pardo, B.³

36
- 84885621082
- Relation between PLSA and NMF and implications
- E. Gaussier and C. Goutte, "Relation between PLSA and NMF and implications," in Proc. Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval, 2005, pp. 601-602.
- (2005) Proc. Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval , pp. 601-602
- Gaussier, E.¹ Goutte, C.²

37
- 0001509519
- Probabilistic latent semantic analysis
- T. Hofmann, "Probabilistic latent semantic analysis," in Proc. Conf. Uncertainty Artif. Intell., 1999, pp. 289-296.
- (1999) Proc. Conf. Uncertainty Artif. Intell. , pp. 289-296
- Hofmann, T.¹

38
- 0034320005
- Rapid speaker adaptation in eigenvoice space
- DOI 10.1109/89.876308
- R. Kuhn, J.-C. Junqua, P. Nguyen, and N. Niedzielski, "Rapid speaker adaptation in eigenvoice space," IEEE Trans. Speech Audio Process., vol. 8, no. 6, pp. 695-707, Nov. 2000. (Pubitemid 32025317)
- (2000) IEEE Transactions on Speech and Audio Processing , vol.8 , Issue.6 , pp. 695-707
- Kuhn, R.¹ Junqua, J.-C.² Nguyen, P.³ Niedzielski, N.⁴

39
- 69249151355
- Speech separation using speaker-adapted eigenvoice speech models
- R. J. Weiss and D. P. W. Ellis, "Speech separation using speaker-adapted eigenvoice speech models," Comput. Speech Lang., vol. 24, no. 1, pp. 16-29, 2010.
- (2010) Comput. Speech Lang. , vol.24 , Issue.1 , pp. 16-29
- Weiss, R.J.¹ Ellis, D.P.W.²

40
- 0030737323
- Modeling the manifolds of images of handwritten digits
- PII S1045922797002373
- G. E. Hinton, P. Dayan, and M. Revow, "Modeling the manifolds of images of handwritten digits," IEEE Trans. Neural Netw., vol. 8, no. 1, pp. 65-74, Jan. 1997. (Pubitemid 127767781)
- (1997) IEEE Transactions on Neural Networks , vol.8 , Issue.1 , pp. 65-74
- Hinton, G.E.¹ Dayan, P.² Revow, M.³

41
- 33745478821
- Learning from incomplete ratings using non-negative matrix factorization
- Proceedings of the Sixth SIAM International Conference on Data Mining
- S. Zhang, W. Wang, J. Ford, and F. Makedon, "Learning from incomplete ratings using non-negative matrix factorization," in Proc. SIAM Int. Conf. Data Mining, 2006, pp. 549-553. (Pubitemid 43955577)
- (2006) Proceedings of the Sixth SIAM International Conference on Data Mining , vol.2006 , pp. 549-553
- Zhang, S.¹ Wang, W.² Ford, J.³ Makedon, F.⁴

42
- 0004217877
- London, U.K.: Butterworths
- C. J. van Rijsbergen, Information Retrieval, 2nd ed. London, U.K.: Butterworths, 1979.
- (1979) Information Retrieval, 2nd Ed.
- Van Rijsbergen, C.J.¹

43
- 33749074089
- Polyphonic music transcription using note event modeling
- DOI 10.1109/ASPAA.2005.1540233, 1540233, 2005 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
- M. Ryynänen and A. Klapuri, "Polyphonic music transcription using note event modeling," in Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust., 2005, pp. 319-322. (Pubitemid 44461857)
- (2005) IEEE Workshop on Applications of Signal Processing to Audio and Acoustics , pp. 319-322
- Ryynanen, M.P.¹ Klapuri, A.²

44
- 33646023117
- An introduction to ROC analysis
- T. Fawcett, "An introduction to ROC analysis," Pattern Recognit. Lett., vol. 27, no. 8, pp. 861-874, 2006.
- (2006) Pattern Recognit. Lett. , vol.27 , Issue.8 , pp. 861-874
- Fawcett, T.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.