



Volume 32, Issue 2, 2015, Pages 125-144

Compositional models for audio processing: Uncovering the structure of sound mixtures

Author keywords

[No Author keywords available]

Indexed keywords

AUDIO ACOUSTICS;

EID: 85032751297     PISSN: 10535888     EISSN: 15580792     Source Type: Journal    
DOI: 10.1109/MSP.2013.2288990     Document Type: Article
Times cited : (68)

References (64)
  • 2
    • 38049021850
    • Convolutive speech bases and their application to supervised speech separation
    • P. Smaragdis, "Convolutive speech bases and their application to supervised speech separation," IEEE Trans. Audio, Speech, Lang. Processing, vol. 15, no. 1, pp. 1-12, 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Processing , vol.15 , Issue.1 , pp. 1-12
    • Smaragdis, P.1
  • 3
    • 50249152311
    • Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria
    • T. Virtanen, "Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria," IEEE Trans. Audio, Speech, Lang. Processing, vol. 15, no. 3, pp. 1066-1074, 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Processing , vol.15 , Issue.3 , pp. 1066-1074
    • Virtanen, T.1
  • 4
    • 79960657803
    • Exemplar-based sparse representations for noise robust automatic speech recognition
    • J. Gemmeke, T. Virtanen, and A. Hurmalainen, "Exemplar-based sparse representations for noise robust automatic speech recognition," IEEE Trans. Audio, Speech, Lang. Processing, vol. 19, no. 7, pp. 2067-2080, 2011.
    • (2011) IEEE Trans. Audio, Speech, Lang. Processing , vol.19 , Issue.7 , pp. 2067-2080
    • Gemmeke, J.1    Virtanen, T.2    Hurmalainen, A.3
  • 5
    • 84873616077
    • Musical instrument recognition in polyphonic audio using source-filter model for sound separation
    • Kobe, Japan
    • T. Heittola, A. Klapuri, and T. Virtanen, "Musical instrument recognition in polyphonic audio using source-filter model for sound separation," in Proc. Int. Conf. Music Information Retrieval, Kobe, Japan, 2009, pp. 327-332.
    • (2009) Proc. Int. Conf. Music Information Retrieval , pp. 327-332
    • Heittola, T.1    Klapuri, A.2    Virtanen, T.3
  • 7
    • 18444370569
    • Nonnegative features of spectro-temporal sounds for classification
    • Y.-C. Cho and S. Choi, "Nonnegative features of spectro-temporal sounds for classification," Pattern Recognit. Lett., vol. 26, no. 9, pp. 1327-1336, 2005.
    • (2005) Pattern Recognit. Lett. , vol.26 , Issue.9 , pp. 1327-1336
    • Cho, Y.-C.1    Choi, S.2
  • 8
    • 76949083547
    • Enforcing harmonicity and smoothness in Bayesian nonnegative matrix factorization applied to polyphonic music transcription
    • N. Bertin, R. Badeau, and E. Vincent, "Enforcing harmonicity and smoothness in Bayesian nonnegative matrix factorization applied to polyphonic music transcription," IEEE Trans. Audio, Speech, Lang. Processing, vol. 18, no. 3, pp. 538-549, 2010.
    • (2010) IEEE Trans. Audio, Speech, Lang. Processing , vol.18 , Issue.3 , pp. 538-549
    • Bertin, N.1    Badeau, R.2    Vincent, E.3
  • 9
    • 33745219863
    • Bandwidth expansion of narrowband speech using non-negative matrix factorization
    • Lisbon, Portugal
    • D. Bansal, B. Raj, and P. Smaragdis, "Bandwidth expansion of narrowband speech using non-negative matrix factorization," in Proc. EUROSPEECH, Lisbon, Portugal, 2005, pp. 1505-1508.
    • (2005) Proc. EUROSPEECH , pp. 1505-1508
    • Bansal, D.1    Raj, B.2    Smaragdis, P.3
  • 10
    • 76949094445
    • Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation
    • A. Ozerov and C. Févotte, "Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation," IEEE Trans. Audio, Speech, Lang. Processing, vol. 18, no. 3, pp. 550-563, 2010.
    • (2010) IEEE Trans. Audio, Speech, Lang. Processing , vol.18 , Issue.3 , pp. 550-563
    • Ozerov, A.1    Févotte, C.2
  • 11
    • 77952744810
    • Sparse representations in audio & music: From coding to source separation
    • M. D. Plumbley, T. Blumensath, L. Daudet, R. Gribonval, and M. E. Davies, "Sparse representations in audio & music: From coding to source separation," Proc. IEEE, vol. 98, no. 6, pp. 995-1005, 2009.
    • (2009) Proc. IEEE , vol.98 , Issue.6 , pp. 995-1005
    • Plumbley, M.D.1    Blumensath, T.2    Daudet, L.3    Gribonval, R.4    Davies, M.E.5
  • 12
  • 13
    • 63249085556
    • Nonnegative matrix factorization with the Itakura-Saito divergence. With application to music analysis
    • C. Févotte, N. Bertin, and J.-L. Durrieu, "Nonnegative matrix factorization with the Itakura-Saito divergence. With application to music analysis," Neural Computat., vol. 21, no. 3, pp. 793-830, 2009.
    • (2009) Neural Computat. , vol.21 , Issue.3 , pp. 793-830
    • Févotte, C.1    Bertin, N.2    Durrieu, J.-L.3
  • 15
    • 41249089920
    • On the equivalence between nonnegative matrix factorization and probabilistic latent semantic indexing
    • C. Ding, T. Li, and W. Ping, "On the equivalence between nonnegative matrix factorization and probabilistic latent semantic indexing," Computat. Stat. Data Anal., vol. 52, no. 8, pp. 3913-3927, 2008.
    • (2008) Computat. Stat. Data Anal. , vol.52 , Issue.8 , pp. 3913-3927
    • Ding, C.1    Li, T.2    Ping, W.3
  • 20
    • 84866042020
    • Optimization and parallelization of monaural source separation algorithms in the openBliSSART toolkit
    • F. Weninger and B. Schuller, "Optimization and parallelization of monaural source separation algorithms in the openBliSSART toolkit," J. Signal Process. Syst., vol. 69, no. 3, pp. 267-277, 2012.
    • (2012) J. Signal Process. Syst. , vol.69 , Issue.3 , pp. 267-277
    • Weninger, F.1    Schuller, B.2
  • 21
    • 80051594038
    • Algorithms for nonnegative matrix factorization with the beta-divergence
    • C. Févotte and J. Idier, "Algorithms for nonnegative matrix factorization with the beta-divergence," Neural Computat., vol. 23, no. 9, pp. 2421-2456, 2011.
    • (2011) Neural Computat. , vol.23 , Issue.9 , pp. 2421-2456
    • Févotte, C.1    Idier, J.2
  • 23
    • 34247173538
    • Nonnegative matrix factorization with constrained second-order optimization
    • R. Zdunek and A. Cichocki, "Nonnegative matrix factorization with constrained second-order optimization," Signal Process., vol. 87, no. 8, pp. 1904-1916, 2007.
    • (2007) Signal Process. , vol.87 , Issue.8 , pp. 1904-1916
    • Zdunek, R.1    Cichocki, A.2
  • 24
    • 84863012243
    • Fast nonnegative matrix factorization: An active-set-like method and comparisons
    • J. Kim and H. Park, "Fast nonnegative matrix factorization: An active-set-like method and comparisons," SIAM J. Sci. Comput., vol. 33, no. 6, pp. 3261-3281, 2011.
    • (2011) SIAM J. Sci. Comput. , vol.33 , Issue.6 , pp. 3261-3281
    • Kim, J.1    Park, H.2
  • 25
    • 84886818613
    • Active-set Newton algorithm for overcomplete nonnegative representations of audio
    • T. Virtanen, J. Gemmeke, and B. Raj, "Active-set Newton algorithm for overcomplete nonnegative representations of audio," IEEE Trans. Audio, Speech, Lang. Processing, vol. 21, no. 11, 2013.
    • (2013) IEEE Trans. Audio, Speech, Lang. Processing , vol.21 , Issue.11
    • Virtanen, T.1    Gemmeke, J.2    Raj, B.3
  • 26
    • 0034818212
    • Unsupervised learning by probabilistic latent semantic analysis
    • T. Hofmann, "Unsupervised learning by probabilistic latent semantic analysis," Mach. Learn., vol. 42, no. 1-2, pp. 177-196, 2001.
    • (2001) Mach. Learn. , vol.42 , Issue.1-2 , pp. 177-196
    • Hofmann, T.1
  • 27
    • 47649133016
    • Probabilistic latent variable models as nonnegative factorizations
    • M. Shashanka, B. Raj, and P. Smaragdis, "Probabilistic latent variable models as nonnegative factorizations," Computat. Intell. Neurosci., vol. 2008, 2008.
    • (2008) Computat. Intell. Neurosci. , vol.2008
    • Shashanka, M.1    Raj, B.2    Smaragdis, P.3
  • 29
    • 81855166765
    • Missing data imputation for time-frequency representations of audio signals
    • P. Smaragdis, B. Raj, and M. Shashanka, "Missing data imputation for time-frequency representations of audio signals," J. Signal Process. Syst., vol. 11, no. 3, pp. 361-370, 2011.
    • (2011) J. Signal Process. Syst. , vol.11 , Issue.3 , pp. 361-370
    • Smaragdis, P.1    Raj, B.2    Shashanka, M.3
  • 32
    • 84900510076
    • Nonnegative matrix factorization with sparseness constraints
    • P. O. Hoyer, "Nonnegative matrix factorization with sparseness constraints," J. Mach. Learn. Res., vol. 5, pp. 1457-1469, 2004.
    • (2004) J. Mach. Learn. Res. , vol.5 , pp. 1457-1469
    • Hoyer, P.O.1
  • 34
    • 84863746770
    • Spectral covariance in prior distributions of nonnegative matrix factorization based speech separation
    • Glasgow, Scotland
    • T. Virtanen, "Spectral covariance in prior distributions of nonnegative matrix factorization based speech separation," in Proc. European Signal Processing Conf., Glasgow, Scotland, 2009, pp. 1933-1937.
    • (2009) Proc. European Signal Processing Conf. , pp. 1933-1937
    • Virtanen, T.1
  • 35
    • 84858719009
    • A sparse non-parametric approach for single channel separation of known sounds
    • Vancouver, Canada
    • P. Smaragdis, M. Shashanka, and B. Raj, "A sparse non-parametric approach for single channel separation of known sounds," in Proc. Neural Information Processing Systems, Vancouver, Canada, 2009, pp. 1705-1713.
    • (2009) Proc. Neural Information Processing Systems , pp. 1705-1713
    • Smaragdis, P.1    Shashanka, M.2    Raj, B.3
  • 36
    • 0021407831
    • Signal estimation from modified short-time Fourier transform
    • D. Griffin and J. Lim, "Signal estimation from modified short-time Fourier transform," IEEE Trans. Acoustics, Speech, Signal Processing, vol. 32, no. 2, pp. 236-242, 1984.
    • (1984) IEEE Trans. Acoustics, Speech, Signal Processing , vol.32 , Issue.2 , pp. 236-242
    • Griffin, D.1    Lim, J.2
  • 37
    • 84873346243
    • Consistent Wiener filtering for audio source separation
    • J. Le Roux and E. Vincent, "Consistent Wiener filtering for audio source separation," IEEE Signal Processing Lett., vol. 20, no. 3, pp. 217-220, 2013.
    • (2013) IEEE Signal Processing Lett. , vol.20 , Issue.3 , pp. 217-220
    • Le Roux, J.1    Vincent, E.2
  • 39
    • 85032751965
    • Compressive sensing
    • R. G. Baraniuk, "Compressive sensing," IEEE Signal Processing Mag., vol. 24, no. 4, pp. 118-121, 2007.
    • (2007) IEEE Signal Processing Mag. , vol.24 , Issue.4 , pp. 118-121
    • Baraniuk, R.G.1
  • 41
    • 67650927380
    • Bayesian inference for nonnegative matrix factorisation models
    • A. T. Cemgil, "Bayesian inference for nonnegative matrix factorisation models," Computat. Intell. Neurosci., vol. 2009, 2009.
    • (2009) Computat. Intell. Neurosci. , vol.2009
    • Cemgil, A.T.1
  • 44
    • 84878609401
    • Group sparsity for speaker identity discrimination in factorisation-based speech recognition
    • Portland, OR
    • A. Hurmalainen, R. Saeidi, and T. Virtanen, "Group sparsity for speaker identity discrimination in factorisation-based speech recognition," in Proc. Interspeech 2012, Portland, OR.
    • (2012) Proc. Interspeech
    • Hurmalainen, A.1    Saeidi, R.2    Virtanen, T.3
  • 46
    • 79959843124
    • Using sparse representations for exemplar based continuous digit recognition
    • Glasgow, Scotland
    • J. Gemmeke, L. ten Bosch, L. Boves, and B. Cranen, "Using sparse representations for exemplar based continuous digit recognition," in Proc. European Signal Processing Conf., Glasgow, Scotland, 2009, pp. 24-28.
    • (2009) Proc. European Signal Processing Conf. , pp. 24-28
    • Gemmeke, J.1    Ten Bosch, L.2    Boves, L.3    Cranen, B.4
  • 47
    • 84865759533
    • Mapping sparse representation to state likelihoods in noise-robust automatic speech recognition
    • Florence, Italy
    • K. Mahkonen, A. Hurmalainen, T. Virtanen, and J. F. Gemmeke, "Mapping sparse representation to state likelihoods in noise-robust automatic speech recognition," in Proc. Interspeech 2011, Florence, Italy, pp. 465-468.
    • (2011) Proc. Interspeech , pp. 465-468
    • Mahkonen, K.1    Hurmalainen, A.2    Virtanen, T.3    Gemmeke, J.F.4
  • 48
    • 84878576404
    • Using sparse classification outputs as feature observations for noise-robust ASR
    • Portland, OR
    • Y. Sun, B. Cranen, J. F. Gemmeke, L. Boves, L. ten Bosch, and M. M. Doss, "Using sparse classification outputs as feature observations for noise-robust ASR," in Proc. Interspeech 2012, Portland, OR.
    • (2012) Proc. Interspeech
    • Sun, Y.1    Cranen, B.2    Gemmeke, J.F.3    Boves, L.4    Ten Bosch, L.5    Doss, M.M.6
  • 52
    • 84901803470
    • Exemplar-based voice conversion using nonnegative spectrogram deconvolution
    • Barcelona, Spain
    • Z. Wu, T. Virtanen, T. Kinnunen, E. S. Chng, and H. Li, "Exemplar-based voice conversion using nonnegative spectrogram deconvolution," in Proc. 8th ISCA Speech Synthesis Workshop, Barcelona, Spain, 2013, pp. 201-206.
    • (2013) Proc. 8th ISCA Speech Synthesis Workshop , pp. 201-206
    • Wu, Z.1    Virtanen, T.2    Kinnunen, T.3    Chng, E.S.4    Li, H.5
  • 53
    • 77949695902
    • Compressive sensing for missing data imputation in noise robust speech recognition
    • J. F. Gemmeke, H. Vanhamme, B. Cranen, and L. Boves, "Compressive sensing for missing data imputation in noise robust speech recognition," IEEE J. Sel. Top. Signal Processing, vol. 4, no. 2, pp. 272-287, 2010.
    • (2010) IEEE J. Sel. Top. Signal Processing , vol.4 , Issue.2 , pp. 272-287
    • Gemmeke, J.F.1    Vanhamme, H.2    Cranen, B.3    Boves, L.4
  • 54
    • 79953665879
    • Computational auditory induction as a missing-data model-fitting problem with Bregman divergence
    • J. Le Roux, H. Kameoka, N. Ono, A. de Cheveigné, and S. Sagayama, "Computational auditory induction as a missing-data model-fitting problem with Bregman divergence," SIAM J. Sci. Comput., vol. 54, no. 5, pp. 658-676, 2011.
    • (2011) SIAM J. Sci. Comput. , vol.54 , Issue.5 , pp. 658-676
    • Le Roux, J.1    Kameoka, H.2    Ono, N.3    De Cheveigné, A.4    Sagayama, S.5
  • 55
    • 80052984197
    • A musically motivated mid-level representation for pitch estimation and musical audio source separation
    • J.-L. Durrieu, B. David, and G. Richard, "A musically motivated mid-level representation for pitch estimation and musical audio source separation," IEEE J. Sel. Top. Signal Processing, vol. 5, no. 6, pp. 1180-1191, 2011.
    • (2011) IEEE J. Sel. Top. Signal Processing , vol.5 , Issue.6 , pp. 1180-1191
    • Durrieu, J.-L.1    David, B.2    Richard, G.3
  • 58
    • 84897584695
    • A general flexible framework for the handling of prior information in audio source separation
    • A. Ozerov, E. Vincent, and F. Bimbot, "A general flexible framework for the handling of prior information in audio source separation," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 4, pp. 1118-1133, 2012.
    • (2012) IEEE Trans. Audio, Speech, Lang. Process. , vol.20 , Issue.4 , pp. 1118-1133
    • Ozerov, A.1    Vincent, E.2    Bimbot, F.3
  • 60
    • 78049380787
    • Latent-variable decomposition based dereverberation of monaural and multi-channel signals
    • Dallas, TX
    • R. Singh, B. Raj, and P. Smaragdis, "Latent-variable decomposition based dereverberation of monaural and multi-channel signals," in Proc. IEEE Int. Conf. Audio, Speech and Signal Processing, Dallas, TX, 2010, pp. 1914-1917.
    • (2010) Proc. IEEE Int. Conf. Audio, Speech and Signal Processing , pp. 1914-1917
    • Singh, R.1    Raj, B.2    Smaragdis, P.3
  • 62
    • 47649088496
    • Extended nonnegative tensor factorisation models for musical source separation
    • D. FitzGerald, M. Cranitch, and E. Coyle, "Extended nonnegative tensor factorisation models for musical source separation," Computat. Intell. Neurosci., vol. 2008, 2008.
    • (2008) Computat. Intell. Neurosci. , vol.2008
    • Fitzgerald, D.1    Cranitch, M.2    Coyle, E.3
  • 63
    • 0002740437
    • Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multimodal factor analysis
    • R. A. Harshman, "Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multimodal factor analysis," in UCLA Working Papers in Phonetics, vol. 16, pp. 1-84, 1970.
    • (1970) UCLA Working Papers in Phonetics , vol.16 , pp. 1-84
    • Harshman, R.A.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.