SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 19, Issue 2, 2011, Pages 242-255

Source-filter-based single-channel speech separation using pitch information

(3) Stark, Michael (a); Wohlmayr, Michael (a); Pernkopf, Franz (a)

(a) GRAZ UNIVERSITY OF TECHNOLOGY (Austria)

Author keywords

Multi-pitch estimation; single-channel speech separation (SCSS); source-filter representation

Indexed keywords

FAST APPROXIMATION; FILTER MODEL; FILTER-BASED; GAIN ESTIMATION; LIKELIHOOD COMPUTATION; LINEAR RELATIONSHIPS; MODEL-DRIVEN METHOD; NONNEGATIVE MATRIX FACTORIZATION; PITCH ESTIMATION; PITCH-TRACKING; SINGLE-CHANNEL; SOURCE SEPARATION; SOURCE-FILTER REPRESENTATION; SPEECH SEPARATION; VOCAL-TRACTS;

ACOUSTIC VARIABLES MEASUREMENT; BLIND SOURCE SEPARATION; ESTIMATION; FACTORIZATION; SPEECH ANALYSIS; VECTOR QUANTIZATION;

CONTINUOUS SPEECH RECOGNITION;

EID: 78049306672 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2010.2047419 Document Type: Article

Times cited: (64)

References (34)

1
- 82255178542
- ser. IEEE Press, D. Wang and G. J. Brown, Eds. New York: Wiley
- Computational Auditory Scene Analysis: Principles, Algorithms, and Applications, ser. IEEE Press, D. Wang and G. J. Brown, Eds. New York: Wiley, 2006.
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

2
- 4644265990
- Monaural speech segregation based on pitch tracking and amplitude modulation
- Sep
- G. Hu and D. Wang, "Monaural speech segregation based on pitch tracking and amplitude modulation", IEEE Trans. Neural Netw., vol. 15, no. 5, pp. 1135-1150, Sep. 2004.
- (2004) IEEE Trans. Neural. Netw. , vol.15 , Issue.5 , pp. 1135-1150
- Hu, G.¹ Wang, D.²

3
- 0032682770
- Separation of speech from interfering sounds based on oscillatory correlation
- May
- D. L. Wang and G. J. Brown, "Separation of speech from interfering sounds based on oscillatory correlation", IEEE Trans. Neural Netw., vol. 10, no. 3, pp. 684-697, May 1999.
- (1999) IEEE Trans. Neural. Netw. , vol.10 , Issue.3 , pp. 684-697
- Wang, D.L.¹ Brown, G.J.²

4
- 0035681924
- Speech segregation based on pitch tracking and amplitude modulation
- G. Hu and D. Wang, "Speech segregation based on pitch tracking and amplitude modulation", in Proc. IEEE Workshop Applicat. Signal Process. to Audio Acoust., 2001, pp. 79-82.
- (2001) Proc. IEEE Workshop Applicat. Signal Process. to Audio Acoust. , pp. 79-82
- Hu, G.¹ Wang, D.²

5
- 84892233308
- On ideal binary mask as the computational goal of auditory scene analysis
- 1st ed. New York: Springer, Nov
- D. Wang, "On ideal binary mask as the computational goal of auditory scene analysis", in Speech Separation by Humans and Machines, 1st ed. New York: Springer, Nov. 2005, p. 319.
- (2005) Speech Separation by Humans and Machines , pp. 319
- Wang, D.¹

6
- 0024753593
- Speech recognition using noise-adaptive prototypes
- Oct
- A. Nadas, D. Nahamoo, and M. A. Picheny, "Speech recognition using noise-adaptive prototypes", IEEE Trans. Acoust. Speech Signal Process., vol. 37, no. 10, pp. 1495-1503, Oct. 1989.
- (1989) IEEE Trans. Acoust. Speech Signal Process. , vol.37 , Issue.10 , pp. 1495-1503
- Nadas, A.¹ Nahamoo, D.² Picheny, M.A.³

7
- 85009230793
- Factorial models and refiltering for speech separation and denoising
- Sep
- S. T. Roweis, "Factorial models and refiltering for speech separation and denoising", in Proc. Eurospeech, Sep. 2003, pp. 1009-1012.
- (2003) Proc. Eurospeech , pp. 1009-1012
- Roweis, S.T.¹

8
- 0038705102
- One microphone source separation
- S. T. Roweis, "One microphone source separation", Neural Inf. Process. Syst., vol. 13, pp. 793-799, 2000.
- (2000) Neural. Inf. Process. Syst. , vol.13 , pp. 793-799
- Roweis, S.T.¹

9
- 0033592606
- Learning the parts of objects by nonnegative matrix factorization
- D. D. Lee and H. S. Seung, "Learning the parts of objects by nonnegative matrix factorization", Nature, vol. 401, p. 788, 1999.
- (1999) Nature , vol.401 , pp. 788
- Lee, D.D.¹ Seung, H.S.²

10
- 84945116938
- Non-negative matrix factorization for polyphonic music transcription
- P. Smaragdis and J. Brown, "Non-negative matrix factorization for polyphonic music transcription", in Proc. IEEE Workshop Applicat. Signal Process. to Audio Acoust., 2003, pp. 177-180.
- (2003) Proc. IEEE Workshop Applicat. Signal Process. to Audio Acoust. , pp. 177-180
- Smaragdis, P.¹ Brown, J.²

11
- 44949258898
- Super-human multi-talker speech recognition: The IBM 2006 speech separation challenge system
- T. Kristjansson, J. Hershey, P. Olsen, S. Rennie, and R. Gopinath, "Super-human multi-talker speech recognition: The IBM 2006 speech separation challenge system", in Proc. Interspeech, 2006, no. 1775.
- (2006) Proc. Interspeech , Issue.1775
- Kristjansson, T.¹ Hershey, J.² Olsen, P.³ Rennie, S.⁴ Gopinath, R.⁵

12
- 33750368310
- An audiovisual corpus for speech perception and automatic speech recognition
- M. P. Cooke, J. Barker, S. P. Cunningham, and X. Shao, "An audiovisual corpus for speech perception and automatic speech recognition", J. Acoust. Soc. Amer., vol. 120, no. 5, pp. 2421-2424, 2006.
- (2006) J. Acoust. Soc. Amer. , vol.120 , Issue.5 , pp. 2421-2424
- Cooke, M.P.¹ Barker, J.² Cunningham, S.P.³ Shao, X.⁴

13
- 33845940172
- A maximum likelihood estimation of vocal-tract-related filter characteristics for single channel speech separation
- M. H. Radfar, R. M. Dansereau, and A. Sayadiyan, "A maximum likelihood estimation of vocal-tract-related filter characteristics for single channel speech separation", J. Audio, Speech, Music Process., vol. 1, p. 15, 2007.
- (2007) J. Audio, Speech, Music Process. , vol.1 , pp. 15
- Radfar, M.H.¹ Dansereau, R.M.² Sayadiyan, A.³

14
- 0037767686
- A multipitch tracking algorithm for noisy speech
- Mar
- M. Wu, D. Wang, and G. Brown, "A multipitch tracking algorithm for noisy speech", IEEE Trans. Speech Audio Process, vol. 11, no. 3, pp. 229-241, Mar. 2003.
- (2003) IEEE Trans. Speech Audio Process , vol.11 , Issue.3 , pp. 229-241
- Wu, M.¹ Wang, D.² Brown, G.³

15
- 0030846123
- A unitary model of pitch perception
- R. Meddis and L. O'Mard, "A unitary model of pitch perception", J. Acoust Soc. Amer., vol. 102, no. 3, pp. 1811-1820, 1997.
- (1997) J. Acoust Soc. Amer. , vol.102 , Issue.3 , pp. 1811-1820
- Meddis, R.¹ O'Mard, L.²

16
- 0031268341
- Factorial hidden Markov models
- Z. Ghahramani and M. Jordan, "Factorial hidden Markov models", Mach. Learn., vol. 29, no. 2-3, pp. 245-273, 1997.
- (1997) Mach. Learn. , vol.29 , Issue.2-3 , pp. 245-273
- Ghahramani, Z.¹ Jordan, M.²

17
- 33646773610
- Discriminative training of hidden Markov models for multiple pitch tracking
- F. Bach and M. Jordan, "Discriminative training of hidden Markov models for multiple pitch tracking", in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2005, pp. 489-492.
- (2005) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 489-492
- Bach, F.¹ Jordan, M.²

18
- 84867209792
- Multipitch tracking using a factorial hidden Markov model
- M. Wohlmayr and F. Pernkopf, "Multipitch tracking using a factorial hidden Markov model", in Proc. Interspeech, 2008.
- (2008) Proc. Interspeech
- Wohlmayr, M.¹ Pernkopf, F.²

19
- 0001455934
- A robust algorithm for pitch tracking
- Amsterdam, The Netherlands: Elsevier
- D. Talkin, "A robust algorithm for pitch tracking", in Speech Coding and Synthesis. Amsterdam, The Netherlands: Elsevier, 1995, pp. 495-518.
- (1995) Speech Coding and Synthesis , pp. 495-518
- Talkin, D.¹

20
- 24344483148
- Genetic-based EM algorithm for learning Gaussian mixture models
- Aug
- F. Pernkopf and D. Bouchaffra, "Genetic-based EM algorithm for learning Gaussian mixture models", IEEE Trans. Pattern Anal Mach. Intell., vol. 27, no. 8, pp. 1344-1348, Aug. 2005.
- (2005) IEEE Trans. Pattern Anal. Mach. Intell. , vol.27 , Issue.8 , pp. 1344-1348
- Pernkopf, F.¹ Bouchaffra, D.²

21
- 70450177302
- Finite mixture spectrogram modeling for multipitch tracking using a factorial hidden Markov model
- M. Wohlmayr and F. Pernkopf, "Finite mixture spectrogram modeling for multipitch tracking using a factorial hidden Markov model", in Proc. Interspeech, 2009.
- (2009) Proc. Interspeech
- Wohlmayr, M.¹ Pernkopf, F.²

22
- 0003786003
- Cambridge, MA: MIT Press
- F. Jelinek Statistical Methods for Speech Recognition. Cambridge, MA: MIT Press, 1998.
- (1998) Statistical Methods for Speech Recognition
- Jelinek, F.¹

23
- 0035246564
- Factor graphs and the sum-product algorithm
- Feb
- F. Kschischang, B. Frey, and H.-A. Loeliger, "Factor graphs and the sum-product algorithm", IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 498-519, Feb. 2001.
- (2001) IEEE Trans. Inf. Theory , vol.47 , Issue.2 , pp. 498-519
- Kschischang, F.¹ Frey, B.² Loeliger, H.-A.³

24
- 0002629270
- Maximum likelihood estimation from incomplete data via the EM algorithm
- A. Dempster, N. Laird, and D. Rubin, "Maximum likelihood estimation from incomplete data via the EM algorithm", J. R. Statist. Soc, vol. B39, no. B, pp. 1-38, 1977.
- (1977) J. R. Statist. Soc. , vol.B39 , Issue.B , pp. 1-38
- Dempster, A.¹ Laird, N.² Rubin, D.³

25
- 0004283231
- Cambridge, MA: MIT Press
- M. Jordan, Learning in Graphical Models. Cambridge, MA: MIT Press, 1999.
- (1999) Learning in Graphical Models
- Jordan, M.¹

26
- 33947651485
- Divergence measures and message passing
- T. Minka, "Divergence measures and message passing", Microsoft Research Cambridge, Tech. Rep. MSR-TR-2005-173, 2005.
- (2005) Microsoft Research Cambridge, Tech. Rep. MSR-TR-2005-173
- Minka, T.¹

27
- 0027268967
- HNS: Speech modification based on a harmonic+noise model
- Apr. 27-30, IEEE
- J. Laroche, Y. Stylianou, and E. Moulines, "HNS: Speech modification based on a harmonic+noise model", in IEEE Int. Conf. Acoust., Speech, Signal Process., Apr. 27-30, 1993, vol. 2, pp. 550-553, IEEE.
- (1993) IEEE Int. Conf. Acoust., Speech, Signal Process. , vol.2 , pp. 550-553
- Laroche, J.¹ Stylianou, Y.² Moulines, E.³

28
- 84889302357
- New York: Wiley, Mar
- P. Vary and R. Martin, Digital Speech Transmission, Enhancement, Coding and Error Concealment. New York: Wiley, Mar. 2006.
- (2006) Digital Speech Transmission, Enhancement, Coding and Error Concealment
- Vary, P.¹ Martin, R.²

29
- 0001935942
- Berlin, Germany: Elsevier, ch. 4, Sinusoidal Coding
- R. McAulay and T. Quatieri, Speech Coding and Synthesis. Berlin, Germany: Elsevier, 1995, ch. 4, pp. 121-173, Sinusoidal Coding.
- (1995) Speech Coding and Synthesis , pp. 121-173
- McAulay, R.¹ Quatieri, T.²

30
- 34447100796
- 1st ed. Boca Raton, FL: CRC
- P. C. Loizou, Speech Enhancement: Theory and Practice, 1st ed. Boca Raton, FL: CRC, 2007.
- (2007) Speech Enhancement: Theory and Practice
- Loizou, P.C.¹

31
- 38049021850
- Convolutive speech bases and their application to supervised speech separation
- Jan
- P. Smaragdis, "Convolutive speech bases and their application to supervised speech separation", IEEE Trans. Audio Speech Lang. Process., vol. 15, no. 1, pp. 1-12, Jan. 2007.
- (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , Issue.1 , pp. 1-12
- Smaragdis, P.¹

32
- 67349134831
- Sequential organization of speech in computational auditory scene analysis
- Aug
- Y. Shao and D. Wang, "Sequential organization of speech in computational auditory scene analysis", Speech Commun., vol. 51, no. 8, pp. 657-667, Aug. 2009.
- (2009) Speech Commun. , vol.51 , Issue.8 , pp. 657-667
- Shao, Y.¹ Wang, D.²

33
- 0027297381
- Vector quantization for the efficient computation of continuous density likelihoods
- E. Bocchieri, "Vector quantization for the efficient computation of continuous density likelihoods", in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 1993, vol. 2, pp. 692-695.
- (1993) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , vol.2 , pp. 692-695
- Bocchieri, E.¹

34
- 78049277624
- On optimizing the computational complexity for VQ-based single channel source separation
- Dallas, TX
- M. Stark and F. Pernkopf, "On optimizing the computational complexity for VQ-based single channel source separation", in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Dallas, TX, 2010, pp. 237-240.
- (2010) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 237-240
- Stark, M.¹ Pernkopf, F.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.