SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 15, Issue 5, 2007, Pages 1564-1578

Adaptation of Bayesian models for single-channel source separation and its application to voice/music separation in popular songs

(4) Ozerov, Alexey a,b,c Philippe, Pierrick a Bimbot, Frédéric b Gribonval, Rémi b

a ORANGE LABS (France)

b INRIA (France)

c ROYAL INSTITUTE OF TECHNOLOGY (Sweden)

Author keywords

Adaptive Wiener filtering; Bayesian model; Expectation maximization (EM); Gaussian mixture model (GMM); Maximum a posteriori (MAP); Model adaptation; Single channel source separation; Time frequency masking

Indexed keywords

ADAPTIVE WIENER FILTERING; BAYESIAN MODEL; EXPECTATION MAXIMIZATION (EM); GAUSSIAN MIXTURE MODEL (GMM); MAXIMUM A POSTERIORI (MAP); MODEL ADAPTATION; SINGLE-CHANNEL SOURCE SEPARATION; TIME-FREQUENCY MASKING;

ADAPTIVE FILTERING; CHANNEL CAPACITY; COMMUNICATION CHANNELS (INFORMATION THEORY); IMAGE SEGMENTATION; MAGNETOSTRICTIVE DEVICES; MAXIMUM PRINCIPLE; OBJECT RECOGNITION; SEPARATION; SIGNAL ANALYSIS; TRELLIS CODES;

BAYESIAN NETWORKS;

EID: 51449094735 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2007.899291 Document Type: Article

Times cited : (163)

References (42)

1
- 0032682770
- Separation of speech from interfering sounds based on oscillatory correlation
- May
- D. L. Wang and G. J. Brown, "Separation of speech from interfering sounds based on oscillatory correlation," IEEE Trans. Neural Netw., vol. 10, no. 3, pp. 684-697, May 1999.
- (1999) IEEE Trans. Neural Netw , vol.10 , Issue.3 , pp. 684-697
- Wang, D.L.¹ Brown, G.J.²

2
- 8344232372
- A maximum likelihood approach to single-channel source separation
- G.-J. Jang and T.-W. Lee, "A maximum likelihood approach to single-channel source separation," J. Mach. Learning Res., no. 4, pp. 1365-1392, 2003.
- (2003) J. Mach. Learning Res , Issue.4 , pp. 1365-1392
- Jang, G.-J.¹ Lee, T.-W.²

3
- 4544247508
- Multiband audio modeling for single-channel acoustic source separation
- May
- M. Reyes-Gomez, D. Ellis, and N. Jojic, "Multiband audio modeling for single-channel acoustic source separation," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Proc. (ICASSP'04), May 2004, vol. 5, pp. 641-644.
- (2004) Proc. IEEE Int. Conf. Acoust., Speech, Signal Proc. (ICASSP'04) , vol.5 , pp. 641-644
- Reyes-Gomez, M.¹ Ellis, D.² Jojic, N.³

4
- 25444438090
- Monaural source separation using spectral cues
- B. Pearlmutter and A. Zador, "Monaural source separation using spectral cues," in Proc. 5th Int. Conf. Ind. Compon. Anal. (ICA'04), 2004, pp. 478-485.
- (2004) Proc. 5th Int. Conf. Ind. Compon. Anal. (ICA'04) , pp. 478-485
- Pearlmutter, B.¹ Zador, A.²

5
- 84898946024
- One microphone source separation
- Cambridge, MA: MIT Press
- S. T. Roweis, "One microphone source separation," in Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 2001, vol. 13, pp. 793-799.
- (2001) Advances in Neural Information Processing Systems , vol.13 , pp. 793-799
- Roweis, S.T.¹

6
- 35048894844
- Wiener based source separation with HMM/GMM using a single sensor
- Nara, Japan, Apr
- L. Benaroya and F. Bimbot, "Wiener based source separation with HMM/GMM using a single sensor," in Proc. Int. Conf. Ind. Compon. Anal. Blind Source Separation (ICA'03), Nara, Japan, Apr. 2003, pp. 957-961.
- (2003) Proc. Int. Conf. Ind. Compon. Anal. Blind Source Separation (ICA'03) , pp. 957-961
- Benaroya, L.¹ Bimbot, F.²

7
- 4644257621
- Single microphone source separation using high resolution signal reconstruction
- T. Kristjansson, H. Attias, and J. Hershey, "Single microphone source separation using high resolution signal reconstruction," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'04), 2004, vol. 2, pp. 817-820.
- (2004) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'04) , vol.2 , pp. 817-820
- Kristjansson, T.¹ Attias, H.² Hershey, J.³

8
- 85159664446
- SINOLA: A new analysis/synthesis method using spectrum peak shape distortion, phase and reassigned spectrum
- Oct
- G. Peeters and X. Rodet, "SINOLA: A new analysis/synthesis method using spectrum peak shape distortion, phase and reassigned spectrum," in Proc. Int. Comput. Music Conf. (ICMC'99), Oct. 1999, pp. 153-156.
- (1999) Proc. Int. Comput. Music Conf. (ICMC'99) , pp. 153-156
- Peeters, G.¹ Rodet, X.²

9
- 0036293936
- On the approximate W-disjoint orthogonality of speech
- Orlando, FL, May
- S. Rickard and O. Yilmaz, "On the approximate W-disjoint orthogonality of speech," in IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'02), Orlando, FL, May 2002, vol. 3, pp. 3049-3052.
- (2002) IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'02) , vol.3 , pp. 3049-3052
- Rickard, S.¹ Yilmaz, O.²

10
- 64149129931
- Fast monaural separation of speech
- N. H. Pontoppidan and M. Dyrholm, "Fast monaural separation of speech," in Proc. 23rd Conf. Signal Process. Audio Recording Reproduction Audio Eng. Soc. (AES) , 2003.
- (2003) Proc. 23rd Conf. Signal Process. Audio Recording Reproduction Audio Eng. Soc. (AES)
- Pontoppidan, N.H.¹ Dyrholm, M.²

11
- 35048837133
- Underdetermined source separation with structured source priors
- Granada, Spain, Sep
- E. Vincent and X. Rodet, "Underdetermined source separation with structured source priors," in Int. Conf. Ind. Compon. Anal. Blind Source Separation (ICA'04), Granada, Spain, Sep. 2004, pp. 327-334.
- (2004) Int. Conf. Ind. Compon. Anal. Blind Source Separation (ICA'04) , pp. 327-334
- Vincent, E.¹ Rodet, X.²

12
- 4544386386
- Low complexity Bayesian single channel source separation
- T. Beierholm, B. D. Pedersen, and O. Winther, "Low complexity Bayesian single channel source separation," in IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'04), 2004, vol. 5, pp. 529-532.
- (2004) IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'04) , vol.5 , pp. 529-532
- Beierholm, T.¹ Pedersen, B.D.² Winther, O.³

13
- 33947659500
- Model-based monaural source separation using a vector-quantized phase-vocoder representation
- Toulouse, France, May
- D. Ellis and R. Weiss, "Model-based monaural source separation using a vector-quantized phase-vocoder representation," in IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'06), Toulouse, France, May 2006, vol. 5, pp. 957-960.
- (2006) IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'06) , vol.5 , pp. 957-960
- Ellis, D.¹ Weiss, R.²

14
- 0003837293
- Englewood Cliffs, NJ: Prentice-Hall
- S. M. Kay, Fundamentals of Statistical Signal Processing, Estimation Theory. Englewood Cliffs, NJ: Prentice-Hall, 1993.
- (1993) Fundamentals of Statistical Signal Processing, Estimation Theory
- Kay, S.M.¹

15
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc., vol. 39, pp. 1-38, 1977.
- (1977) J. R. Statist. Soc , vol.39 , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

16
- 0141743693
- New EM algorithms for source separation and deconvolution
- H. Attias, "New EM algorithms for source separation and deconvolution," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'03), 2003, vol. 5, pp. 297-300.
- (2003) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'03) , vol.5 , pp. 297-300
- Attias, H.¹

17
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- Apr
- J. Gauvain and C. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Process., vol. 2, no. 2, pp. 291-298, Apr. 1994.
- (1994) IEEE Trans. Speech Audio Process , vol.2 , Issue.2 , pp. 291-298
- Gauvain, J.¹ Lee, C.²

18
- 0000159105
- On adaptive decision rules and decision parameter adaptation for automatic speech recognition
- Aug
- C.-H. Lee and Q. Huo, "On adaptive decision rules and decision parameter adaptation for automatic speech recognition," Proc. IEEE, vol. 88, no. 8, pp. 1241-1269, Aug. 2000.
- (2000) Proc. IEEE , vol.88 , Issue.8 , pp. 1241-1269
- Lee, C.-H.¹ Huo, Q.²

19
- 0033884858
- Speaker verification using adapted Gaussian mixture models
- A. Reynolds, T. Quatieri, and R. Dunn, "Speaker verification using adapted Gaussian mixture models," Digital Signal Process., no. 10, pp. 19-41, 2000.
- (2000) Digital Signal Process , Issue.10 , pp. 19-41
- Reynolds, A.¹ Quatieri, T.² Dunn, R.³

20
- 0013288412
- Dynamic Bayesian networks: Representation, inference and learning
- Ph.D. dissertation, Univ. California Berkeley, Berkeley, CA, Jul
- K. P. Murphy, "Dynamic Bayesian networks: Representation, inference and learning," Ph.D. dissertation, Univ. California Berkeley, Berkeley, CA, Jul. 2002.
- (2002)
- Murphy, K.P.¹

21
- 0033225865
- An introduction to variational methods for graphical models
- M. I. Jordan, Z. Ghahramani, T. S. Jaakkola, and L. K. Saul, "An introduction to variational methods for graphical models," Learning in Graphical Models, vol. 37, no. 2, pp. 183-233, 1999.
- (1999) Learning in Graphical Models , vol.37 , Issue.2 , pp. 183-233
- Jordan, M.I.¹ Ghahramani, Z.² Jaakkola, T.S.³ Saul, L.K.⁴

22
- 0009623939
- Flexible speaker adaptation using maximum likelihood linear regression
- C. Leggetter and P. Woodland, "Flexible speaker adaptation using maximum likelihood linear regression," in ARPA Spoken Lang. Technol. Workshop, 1995, pp. 104-109.
- (1995) ARPA Spoken Lang. Technol. Workshop , pp. 104-109
- Leggetter, C.¹ Woodland, P.²

23
- 0030359637
- Variance compensation within the MLLR framework for robust speech recognition and speaker adaptation
- Philadelphia, PA
- M. Gales, D. Pye, and P. Woodland, "Variance compensation within the MLLR framework for robust speech recognition and speaker adaptation," in Proc. Int. Conf. Spoken Lang. Process. (ICSLP'96), Philadelphia, PA, 1996, vol. 3, pp. 1832-1835.
- (1996) Proc. Int. Conf. Spoken Lang. Process. (ICSLP'96) , vol.3 , pp. 1832-1835
- Gales, M.¹ Pye, D.² Woodland, P.³

24
- 0030640789
- Structural MAP speaker adaptation using hierarchical priors
- Santa Barbara, CA, Dec
- K. Shinoda and C.-H. Lee, "Structural MAP speaker adaptation using hierarchical priors," in Proc. IEEE Workshop Speech Recognition Understanding, Santa Barbara, CA, Dec. 1997, pp. 381-388.
- (1997) Proc. IEEE Workshop Speech Recognition Understanding , pp. 381-388
- Shinoda, K.¹ Lee, C.-H.²

25
- 85009097035
- Fast speaker adaptation using eigenspace-based maximum likelihood linear regression
- Beijing, China, Oct
- K.-T. Chen, W.-W. Liau, H.-M. Wang, and L.-S. Lee, "Fast speaker adaptation using eigenspace-based maximum likelihood linear regression," in Proc. Int. Conf. Spoken Lang. Process. (ICSLP'00), Beijing, China, Oct. 2000, pp. 742-745.
- (2000) Proc. Int. Conf. Spoken Lang. Process. (ICSLP'00) , pp. 742-745
- Chen, K.-T.¹ Liau, W.-W.² Wang, H.-M.³ Lee, L.-S.⁴

26
- 33744968614
- Audio source separation with a single sensor
- Jan
- L. Benaroya, F. Bimbot, and R. Gribonval, "Audio source separation with a single sensor," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 1, pp. 191-199, Jan. 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process , vol.14 , Issue.1 , pp. 191-199
- Benaroya, L.¹ Bimbot, F.² Gribonval, R.³

27
- 0028420014
- Integrated models of signal and background with application to speaker identification in noise
- Apr
- R. C. Rose, E. M. Hofstetter, and D. A. Reynolds, "Integrated models of signal and background with application to speaker identification in noise," IEEE Trans. Speech Audio Process., vol. 2, no. 2, pp. 245-257, Apr. 1994.
- (1994) IEEE Trans. Speech Audio Process , vol.2 , Issue.2 , pp. 245-257
- Rose, R.C.¹ Hofstetter, E.M.² Reynolds, D.A.³

28
- 4444245782
- Blind clustering of popular music recordings based on singer voice characteristics
- W.-H. Tsai, D. Rogers, and H.-M. Wang, "Blind clustering of popular music recordings based on singer voice characteristics," Comput. Music J., vol. 28, no. 3, pp. 68-78, 2004.
- (2004) Comput. Music J , vol.28 , Issue.3 , pp. 68-78
- Tsai, W.-H.¹ Rogers, D.² Wang, H.-M.³

29
- 0035688686
- Locating singing voice segments within music signals
- A. Berenzweig and D. P. W. Ellis, "Locating singing voice segments within music signals," in Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust. (WASPAA'01), 2001, pp. 119-122.
- (2001) Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust. (WASPAA'01) , pp. 119-122
- Berenzweig, A.¹ Ellis, D.P.W.²

30
- 4444229791
- Singer identification in popular music recordings using voice coding features
- Oct
- Y. E. Kim and B. Whitman, "Singer identification in popular music recordings using voice coding features," in Proc. Int. Symp. Music Inf. Retrieval (ISMIR'02), Oct. 2002, pp. 164-169.
- (2002) Proc. Int. Symp. Music Inf. Retrieval (ISMIR'02) , pp. 164-169
- Kim, Y.E.¹ Whitman, B.²

31
- 13444291977
- Singing voice detection in popular music
- New York, Oct
- T. L. Nwe, A. Shenoy, and Y. Wang, "Singing voice detection in popular music," in Proc. ACM Multimedia Conf., New York, Oct. 2004, pp. 324-327.
- (2004) Proc. ACM Multimedia Conf , pp. 324-327
- Nwe, T.L.¹ Shenoy, A.² Wang, Y.³

32
- 4544255234
- Automatic detection and tracking of target singer in multi-singer music recordings
- Montreal, QC, Canada
- W. H. Tsai and H. M. Wang, "Automatic detection and tracking of target singer in multi-singer music recordings," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'04), Montreal, QC, Canada, 2004, vol. 4, pp. 221-224.
- (2004) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'04) , vol.4 , pp. 221-224
- Tsai, W.H.¹ Wang, H.M.²

33
- 0032595188
- Generalized mel frequency cepstral coefficients for large-vocabulary speaker-independent continuous-speech recognition
- Sep
- R. Vergin, D. O'Shaughnessy, and A. Farhat, "Generalized mel frequency cepstral coefficients for large-vocabulary speaker-independent continuous-speech recognition," IEEE Trans. Speech Audio Process., vol. 7, no. 5, pp. 525-532, Sep. 1999.
- (1999) IEEE Trans. Speech Audio Process , vol.7 , Issue.5 , pp. 525-532
- Vergin, R.¹ O'Shaughnessy, D.² Farhat, A.³

34
- 33745686986
- One microphone singing voice separation using source-adapted models
- Mohonk, NY, Oct
- A. Ozerov, P. Philippe, R. Gribonval, and F. Bimbot, "One microphone singing voice separation using source-adapted models," in IEEE Workshop Applicat. Signal Process. Audio Acoust. (WASPAA'05), Mohonk, NY, Oct. 2005, pp. 90-93.
- (2005) IEEE Workshop Applicat. Signal Process. Audio Acoust. (WASPAA'05) , pp. 90-93
- Ozerov, A.¹ Philippe, P.² Gribonval, R.³ Bimbot, F.⁴

35
- 0028517016
- Space-alternating generalized expectation-maximization algorithm
- Oct
- J. A. Fessler and A. O. Hero, "Space-alternating generalized expectation-maximization algorithm," IEEE Trans. Signal Process., vol. 42, no. 10, pp. 2664-2677, Oct. 1994.
- (1994) IEEE Trans. Signal Process , vol.42 , Issue.10 , pp. 2664-2677
- Fessler, J.A.¹ Hero, A.O.²

36
- 84948109721
- New York: Wiley
- G. McLachlan and T. Krishnan, The EM Algorithm and Extensions. New York: Wiley, 1997.
- (1997) The EM Algorithm and Extensions
- McLachlan, G.¹ Krishnan, T.²

37
- 84867608170
- Low-resource noise-robust feature post-processing on Aurora 2.0
- C.-P. Chen, J. Bilmes, and K. Kirchhoff, "Low-resource noise-robust feature post-processing on Aurora 2.0," in Proc. Int. Conf. Spoken Lang. Process. (ICSLP'02), 2002, pp. 2445-2448.
- (2002) Proc. Int. Conf. Spoken Lang. Process. (ICSLP'02) , pp. 2445-2448
- Chen, C.-P.¹ Bilmes, J.² Kirchhoff, K.³

38
- 85046873967
- The DET curve in assessment of detection task performance
- A. Martin, G. Doddington, T. Kamm, M. Ordowski, and M. Przybocki, "The DET curve in assessment of detection task performance," in Proc. Eur. Conf. Speech Commun. Technol. (EuroSpeech'97), 1997, pp. 1895-1898.
- (1997) Proc. Eur. Conf. Speech Commun. Technol. (EuroSpeech'97) , pp. 1895-1898
- Martin, A.¹ Doddington, G.² Kamm, T.³ Ordowski, M.⁴ Przybocki, M.⁵

39
- 0020102027
- Least squares quantization in PCM
- Mar
- S. P. Lloyd, "Least squares quantization in PCM," IEEE Trans. Inf. Theory, vol. IT-28, no. 2, pp. 129-137, Mar. 1982.
- (1982) IEEE Trans. Inf. Theory , vol.IT-28 , Issue.2 , pp. 129-137
- Lloyd, S.P.¹

40
- 0348196088
- Proposals for performance measurement in source separation
- Apr
- R. Gribonval, L. Benaroya, E. Vincent, and C. Févotte, "Proposals for performance measurement in source separation," in Proc. Int. Conf. Ind. Compon. Anal. Blind Source Separation (ICA'03), Apr. 2003, pp. 763-768.
- (2003) Proc. Int. Conf. Ind. Compon. Anal. Blind Source Separation (ICA'03) , pp. 763-768
- Gribonval, R.¹ Benaroya, L.² Vincent, E.³ Févotte, C.⁴

41
- 84873538214
- Separation of vocals from polyphonic audio recordings
- S. Vembu and S. Baumann, "Separation of vocals from polyphonic audio recordings," in Proc. Int. Symp. Music Inf. Retrieval (ISMIR'05), 2005, pp. 337-344.
- (2005) Proc. Int. Symp. Music Inf. Retrieval (ISMIR'05) , pp. 337-344
- Vembu, S.¹ Baumann, S.²

42
- 41649099242
- Singing voice separation from monaural recordings
- Y. Li and D. L. Wang, "Singing voice separation from monaural recordings," in Proc. Int. Symp. Music Inf. Retrieval (ISMIR'06), 2006, pp. 176-179.
- (2006) Proc. Int. Symp. Music Inf. Retrieval (ISMIR'06) , pp. 176-179
- Li, Y.¹ Wang, D.L.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.