SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 19, Issue 7, 2011, Pages 2067-2080

Exemplar-based sparse representations for noise robust automatic speech recognition

Authors (3): Gemmeke, Jort F. (a); Virtanen, Tuomas (b); Hurmalainen, Antti (b)

a RADBOUD UNIVERSITY NIJMEGEN (Netherlands)

b TAMPERE UNIVERSITY OF TECHNOLOGY (Finland)

Author keywords

Exemplar based; noise robustness; non negative matrix factorization; sparse representations; speech recognition

Indexed keywords

CONNECTED DIGITS; EXEMPLAR-BASED; FEATURE ENHANCEMENT; LINEAR COMBINATIONS; MISSING DATA; NOISE ROBUSTNESS; NOISE-ROBUST AUTOMATIC SPEECH RECOGNITION; NOISY SPEECH; NONNEGATIVE MATRIX FACTORIZATION; PHONETIC INFORMATION; SIGNAL TO NOISE; SOURCE SEPARATION; SPARSE REPRESENTATION; TIME FRAME; TIME FREQUENCY;

BLIND SOURCE SEPARATION; FACTORIZATION; FEATURE EXTRACTION; HYBRID SYSTEMS; SIGNAL TO NOISE RATIO;

SPEECH RECOGNITION;

EID: 79960657803 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2011.2112350 Document Type: Article

Times cited: 298

References (47)

1
- 0030142722
- Towards increasing speech recognition error rates
- DOI 10.1016/0167-6393(96)00003-9, PII S0167639396000039
- H. Bourlard, H. Hermansky, and N. Morgan, "Towards increasing speech recognition error rates," Speech Commun., vol. 18, pp. 205-231, 1996. (Pubitemid 126362800)
- (1996) Speech Communication , vol.18 , Issue.3 , pp. 205-231
- Bourlard, H.¹ Hermansky, H.² Morgan, N.³

2
- 0029725301
- A vector Taylor series approach for environment-independent speech recognition
- Atlanta, GA
- P. Moreno, B. Raj, and R. Stern, "A vector Taylor series approach for environment-independent speech recognition," in Proc. Int. Conf. Audio, Speech, Signal Process., Atlanta, GA, 1996, pp. 733-736.
- (1996) Proc. Int. Conf. Audio, Speech, Signal Process. , pp. 733-736
- Moreno, P.¹ Raj, B.² Stern, R.³

3
- 0030245128
- Robust continuous speech recognition using parallel model combination
- PII S1063667696067120
- M. J. F. Gales and S. J. Young, "Robust continuous speech recognition using parallel model combination," IEEE Trans. Speech Audio Process., vol. 4, no. 5, pp. 352-359, Sep. 1996. (Pubitemid 126753023)
- (1996) IEEE Transactions on Speech and Audio Processing , vol.4 , Issue.5 , pp. 352-359
- Gales, M.J.F.¹ Young, S.J.²

4
- 85032752225
- Missing-feature approaches in speech recognition
- DOI 10.1109/MSP.2005.1511828
- B. Raj and R. M. Stern, "Missing-feature approaches in speech recognition," IEEE Signal Process. Mag., vol. 22, no. 5, pp. 101-116, Sep. 2005. (Pubitemid 41488524)
- (2005) IEEE Signal Processing Magazine , vol.22 , Issue.5 , pp. 101-116
- Raj, B.¹ Stern, R.M.²

5
- 84869001637
- Handling missing data in speech recognition
- M. Cooke, P. Green, and M. Crawford, "Handling missing data in speech recognition," in Proc. Int. Conf. Speech Lang. Process., 1994, pp. 1555-1558.
- (1994) Proc. Int. Conf. Speech Lang. Process. , pp. 1555-1558
- Cooke, M.¹ Green, P.² Crawford, M.³

6
- 0030635327
- Application of sequential estimation to time-varying environment compensation in speech recognition
- N. S. Kim, D. K. Kim, and S. R. Kim, "Application of sequential estimation to time-varying environment compensation in speech recognition," in IEEE Workshop Autom. Speech Recognition Understanding, 1997, pp. 389-395.
- (1997) IEEE Workshop Autom. Speech Recognition Understanding , pp. 389-395
- Kim, N.S.¹ Kim, D.K.² Kim, S.R.³

7
- 85009074657
- ALGONQUIN: Iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition
- B. J. Frey, L. Deng, A. Acero, and T. Kristjansson, "ALGONQUIN: Iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition," in Proc. Eurospeech, 2001, pp. 901-904.
- (2001) Proc. Eurospeech , pp. 901-904
- Frey, B.J.¹ Deng, L.² Acero, A.³ Kristjansson, T.⁴

8
- 84898993440
- Sequential noise compensation by sequential Monte Carlo method
- K. Yao and S. Nakamura, "Sequential noise compensation by sequential Monte Carlo method," in Proc. Neural Inf. Process. Syst., 2002, pp. 1205-1212.
- (2002) Proc. Neural Inf. Process. Syst. , pp. 1205-1212
- Yao, K.¹ Nakamura, S.²

9
- 84898964201
- Algorithms for non-negative matrix factorization
- Apr
- D. D. Lee and H. S. Seung, "Algorithms for non-negative matrix factorization," in Proc. Neural Inf. Process. Syst., Apr. 2001, pp. 556-562.
- (2001) Proc. Neural Inf. Process. Syst. , pp. 556-562
- Lee, D.D.¹ Seung, H.S.²

10
- 85032750937
- An introduction to compressive sampling
- Mar
- E. J. Candès and M. B. Wakin, "An introduction to compressive sampling," IEEE Signal Process. Mag., vol. 25, no. 2, pp. 21-30, Mar. 2008.
- (2008) IEEE Signal Process. Mag. , vol.25 , Issue.2 , pp. 21-30
- Candès, E.J.¹ Wakin, M.B.²

11
- 84945116938
- Non-negative matrix factorization for polyphonic music transcription
- P. Smaragdis and J. C. Brown, "Non-negative matrix factorization for polyphonic music transcription," in Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust., 2003, pp. 177-180.
- (2003) Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust. , pp. 177-180
- Smaragdis, P.¹ Brown, J.C.²

12
- 50249152311
- Monaural sound source separation by non-negative matrix factorization with temporal continuity and sparseness criteria
- Mar
- T. Virtanen, "Monaural sound source separation by non-negative matrix factorization with temporal continuity and sparseness criteria," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 3, pp. 1066-1074, Mar. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.3 , pp. 1066-1074
- Virtanen, T.¹

13
- 50249173994
- Linear regression on sparse features for single-channel speech separation
- M. N. Schmidt and R. K. Olsson, "Linear regression on sparse features for single-channel speech separation," in Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust. (WASPAA), 2007, pp. 26-29.
- (2007) Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust. (WASPAA) , pp. 26-29
- Schmidt, M.N.¹ Olsson, R.K.²

14
- 67149096066
- Mixtures of gamma priors for nonnegative matrix factorization based speech separation
- T. Virtanen and A. T. Cemgil, "Mixtures of gamma priors for nonnegative matrix factorization based speech separation," in Proc. ICA, 2009, pp. 646-653.
- (2009) Proc. ICA , pp. 646-653
- Virtanen, T.¹ Cemgil, A.T.²

15
- 79959818117
- Non-negative matrix factorization based compensation of music for automatic speech recognition
- B. Raj, T. Virtanen, S. Chaudhuri, and R. Singh, "Non-negative matrix factorization based compensation of music for automatic speech recognition," in Proc. Int. Conf. Speech, Lang. Process., 2010, pp. 717-720.
- (2010) Proc. Int. Conf. Speech, Lang. Process. , pp. 717-720
- Raj, B.¹ Virtanen, T.² Chaudhuri, S.³ Singh, R.⁴

16
- 61549128441
- Robust face recognition via sparse representation
- Feb
- J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 2, pp. 210-227, Feb. 2009.
- (2009) IEEE Trans. Pattern Anal. Mach. Intell. , vol.31 , Issue.2 , pp. 210-227
- Wright, J.¹ Yang, A.Y.² Ganesh, A.³ Sastry, S.S.⁴ Ma, Y.⁵

17
- 78049392891
- Bayesian compressive sensing for phonetic classification
- T. N. Sainath, A. Carmi, D. Kanevsky, and B. Ramabhadran, "Bayesian compressive sensing for phonetic classification," in Proc. Int. Conf. Audio, Speech, Signal Process., 2010, pp. 4370-4373.
- (2010) Proc. Int. Conf. Audio, Speech, Signal Process. , pp. 4370-4373
- Sainath, T.N.¹ Carmi, A.² Kanevsky, D.³ Ramabhadran, B.⁴

18
- 84863733079
- Using sparse representations for exemplar based continuous digit recognition
- Glasgow, Scotland, Aug. 24-28
- J. F. Gemmeke, L. ten Bosch, L. Boves, and B. Cranen, "Using sparse representations for exemplar based continuous digit recognition," in Proc. EUSIPCO, Glasgow, Scotland, Aug. 24-28, 2009, pp. 1755-1759.
- (2009) Proc. EUSIPCO , pp. 1755-1759
- Gemmeke, J.F.¹ Ten Bosch, L.² Boves, L.³ Cranen, B.⁴

19
- 78049412911
- Noise robust exemplar-based connected digit recognition
- J. F. Gemmeke and T. Virtanen, "Noise robust exemplar-based connected digit recognition," in Proc. Int. Conf. Audio, Speech, Signal Process., 2010, pp. 4546-4549.
- (2010) Proc. Int. Conf. Audio, Speech, Signal Process. , pp. 4546-4549
- Gemmeke, J.F.¹ Virtanen, T.²

20
- 79960693807
- Noise robust digit recognition using sparse representations
- J. F. Gemmeke and B. Cranen, "Noise robust digit recognition using sparse representations," in Proc. ISCA 2008 ITRW Speech Anal. Process. Knowl. Discov., 2008.
- (2008) Proc. ISCA 2008 ITRW Speech Anal. Process. Knowl. Discov.
- Gemmeke, J.F.¹ Cranen, B.²

21
- 78049398611
- Sparse coding for speech recognition
- G. S. V. S. Sivaram, S. K. Nemala, M. Elhilali, T. D. Tran, and H. Hermansky, "Sparse coding for speech recognition," in Proc. Int. Conf. Audio, Speech, Signal Process., 2010, pp. 4346-4349.
- (2010) Proc. Int. Conf. Audio, Speech, Signal Process. , pp. 4346-4349
- Sivaram, G.S.V.S.¹ Nemala, S.K.² Elhilali, M.³ Tran, T.D.⁴ Hermansky, H.⁵

22
- 77949695902
- Compressive sensing for missing data imputation in noise robust speech recognition
- Apr.
- J. F. Gemmeke, H. Van Hamme, B. Cranen, and L. Boves, "Compressive sensing for missing data imputation in noise robust speech recognition," IEEE J. Sel. Topics Signal Process., vol. 4, no. 2, pp. 272-287, Apr. 2010.
- (2010) IEEE J. Sel. Topics Signal Process. , vol.4 , Issue.2 , pp. 272-287
- Gemmeke, J.F.¹ Van Hamme, H.² Cranen, B.³ Boves, L.⁴

23
- 45549086638
- Template based continuous speech recognition
- May
- M. De Wachter, M. Matton, K. Demuynck, P. Wambacq, R. Cools, and D. Van Compernolle, "Template based continuous speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 4, pp. 1377-1390, May 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.4 , pp. 1377-1390
- De Wachter, M.¹ Matton, M.² Demuynck, K.³ Wambacq, P.⁴ Cools, R.⁵ Van Compernolle, D.⁶

24
- 70450188400
- Applying non-negative matrix factorization on time-frequency reassignment spectra for missing data mask estimation
- Brighton, U.K., Sep. 6-10
- M. Van Segbroeck and H. Van Hamme, "Applying non-negative matrix factorization on time-frequency reassignment spectra for missing data mask estimation," in Proc. Interspeech, Brighton, U.K., Sep. 6-10, 2009, pp. 2511-2514.
- (2009) Proc. Interspeech , pp. 2511-2514
- Van Segbroeck, M.¹ Van Hamme, H.²

25
- 4544315110
- Robust speech recognition using cepstral domain missing data techniques and noisy masks
- H. Van Hamme, "Robust speech recognition using cepstral domain missing data techniques and noisy masks," in Proc. Int. Conf. Audio, Speech, Signal Process., 2004, vol. 1, pp. 213-216.
- (2004) Proc. Int. Conf. Audio, Speech, Signal Process. , vol.1 , pp. 213-216
- Van Hamme, H.¹

26
- 51449111646
- Bayesian extensions to nonnegative matrix factorization for audio signal modelling
- T. Virtanen, A. T. Cemgil, and S. Godsill, "Bayesian extensions to nonnegative matrix factorization for audio signal modelling," in Proc. Int. Conf. Audio, Speech, Signal Process., 2008, pp. 1825-1828.
- (2008) Proc. Int. Conf. Audio, Speech, Signal Process. , pp. 1825-1828
- Virtanen, T.¹ Cemgil, A.T.² Godsill, S.³

27
- 79959837544
- State-based labeling for a sparse representation of speech and its application to robust speech recognition
- T. Virtanen, J. F. Gemmeke, and A. Hurmalainen, "State-based labeling for a sparse representation of speech and its application to robust speech recognition," in Proc. Interspeech, 2010, pp. 893-896.
- (2010) Proc. Interspeech , pp. 893-896
- Virtanen, T.¹ Gemmeke, J.F.² Hurmalainen, A.³

28
- 84858719009
- A sparse non-parametric approach for single channel separation of known sounds
- P. Smaragdis, M. Shashanka, and B. Raj, "A sparse non-parametric approach for single channel separation of known sounds," in Proc. Neural Inf. Process. Syst., 2009, pp. 1705-1713.
- (2009) Proc. Neural Inf. Process. Syst. , pp. 1705-1713
- Smaragdis, P.¹ Shashanka, M.² Raj, B.³

29
- 33744968614
- Audio source separation with a single sensor
- DOI 10.1109/TSA.2005.854110
- L. Benaroya, F. Bimbot, and R. Gribonval, "Audio source separation with a single sensor," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 1, pp. 191-199, Jan. 2006. (Pubitemid 43863465)
- (2006) IEEE Transactions on Audio, Speech and Language Processing , vol.14 , Issue.1 , pp. 191-199
- Benaroya, L.¹ Bimbot, F.² Gribonval, R.³

30
- 85128375707
- Inference of missing spectrographic features for robust automatic speech recognition
- Sydney, Australia, Nov. 4
- B. Raj, R. Singh, and R. Stern, "Inference of missing spectrographic features for robust automatic speech recognition," in Proc. Int. Conf. Speech Lang. Process., Sydney, Australia, Nov. 4, 1998, pp. 1491-1494.
- (1998) Proc. Int. Conf. Speech Lang. Process. , pp. 1491-1494
- Raj, B.¹ Singh, R.² Stern, R.³

31
- 85009128803
- Prospect features and their application to missing data techniques for robust speech recognition
- H. Van Hamme, "Prospect features and their application to missing data techniques for robust speech recognition," in Proc. Interspeech, 2004, pp. 101-104.
- (2004) Proc. Interspeech , pp. 101-104
- Van Hamme, H.¹

32
- 42549139762
- MVA processing of speech features
- Jan
- C.-P. Chen and J. A. Bilmes, "MVA processing of speech features," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 1, pp. 257-270, Jan. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.1 , pp. 257-270
- Chen, C.-P.¹ Bilmes, J.A.²

33
- 0038669544
- The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
- Paris, France, Sep. 18-20
- H. Hirsch and D. Pearce, "The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions," in Proc. ISCA Tutorial Research Workshop ASR2000, Paris, France, Sep. 18-20, 2000, pp. 181-188.
- (2000) Proc. ISCA Tutorial Research Workshop ASR2000 , pp. 181-188
- Hirsch, H.¹ Pearce, D.²

34
- 33947622695
- Handling time-derivative features in a missing data framework for robust automatic speech recognition
- H. Van Hamme, "Handling time-derivative features in a missing data framework for robust automatic speech recognition," in Proc. Int. Conf. Audio, Speech, Signal Process., 2006, pp. 293-296.
- (2006) Proc. Int. Conf. Audio, Speech, Signal Process. , pp. 293-296
- Van Hamme, H.¹

35
- 79959834868
- Artificial and online acquired noise dictionaries for noise robust ASR
- J. F. Gemmeke and T. Virtanen, "Artificial and online acquired noise dictionaries for noise robust ASR," in Proc. Interspeech, 2010, pp. 2082-2085.
- (2010) Proc. Interspeech , pp. 2082-2085
- Gemmeke, J.F.¹ Virtanen, T.²

36
- 85009113852
- HMM adaptation using vector Taylor series for noise speech recognition
- Beijing, China
- A. Acero, L. Deng, T. Kristjansson, and J. Zhang, "HMM adaptation using vector Taylor series for noise speech recognition," in Proc. Int. Conf. Spoken Lang. Process., Beijing, China, 2000, pp. 869-872.
- (2000) Proc. Int. Conf. Spoken Lang. Process. , pp. 869-872
- Acero, A.¹ Deng, L.² Kristjansson, T.³ Zhang, J.⁴

37
- 70450179002
- Transforming features to compensate speech recognizer models for noise
- Brighton, U.K., Sep. 6-10
- R. C. Van Dalen, F. Flego, and M. J. F. Gales, "Transforming features to compensate speech recognizer models for noise," in Proc. Interspeech, Brighton, U.K., Sep. 6-10, 2009, pp. 2499-2502.
- (2009) Proc. Interspeech , pp. 2499-2502
- Van Dalen, R.C.¹ Flego, F.² Gales, M.J.F.³

38
- 79959825120
- Using a DBN to integrate sparse classification and GMM-based ASR
- DOI:10.1016/j.csl.2010.06.004
- Y. Sun, J. F. Gemmeke, B. Cranen, L. ten Bosch, and L. Boves, "Using a DBN to integrate sparse classification and GMM-based ASR," in Proc. Interspeech, 2010, pp. 2098-2101, DOI:10.1016/j.csl.2010.06.004.
- (2010) Proc. Interspeech , pp. 2098-2101
- Sun, Y.¹ Gemmeke, J.F.² Cranen, B.³ Ten Bosch, L.⁴ Boves, L.⁵

39
- 78049527664
- Sparse imputation for large vocabulary noise robust ASR
- J. F. Gemmeke, B. Cranen, and U. Remes, "Sparse imputation for large vocabulary noise robust ASR," Comput. Speech Lang., pp. 462-479, 2010.
- (2010) Comput. Speech Lang. , pp. 462-479
- Gemmeke, J.F.¹ Cranen, B.² Remes, U.³

40
- 78049409668
- Fast GPU implementation of large scale dictionary and sparse representation based vision problems
- P. Nagesh, R. Gowda, and B. Li, "Fast GPU implementation of large scale dictionary and sparse representation based vision problems," in Proc. Int. Conf. Audio, Speech, Signal Process., 2010, pp. 1570-1573.
- (2010) Proc. Int. Conf. Audio, Speech, Signal Process. , pp. 1570-1573
- Nagesh, P.¹ Gowda, R.² Li, B.³

41
- 67651030071
- Unsupervised learning of time-frequency patches as a noise-robust representation of speech
- M. Van Segbroeck and H. Van Hamme, "Unsupervised learning of time-frequency patches as a noise-robust representation of speech," Speech Commun., vol. 51, no. 11, 2009.
- (2009) Speech Commun. , vol.51 , Issue.11
- Van Segbroeck, M.¹ Van Hamme, H.²

42
- 70349192993
- Classification via group sparsity promoting regularization
- A. Majumdar and R. K. Ward, "Classification via group sparsity promoting regularization," in Proc. Int. Conf. Audio, Speech, Signal Process., 2009, pp. 861-864.
- (2009) Proc. Int. Conf. Audio, Speech, Signal Process. , pp. 861-864
- Majumdar, A.¹ Ward, R.K.²

43
- 0033692739
- Feature extraction using non-linear transformation for robust speech recognition on the Aurora database
- S. Sharma, D. Ellis, S. Kajarekar, P. Jain, and H. Hermansky, "Feature extraction using non-linear transformation for robust speech recognition on the Aurora database," in Proc. Int. Conf. Audio, Speech, Signal Process., 2000, pp. 1117-1120.
- (2000) Proc. Int. Conf. Audio, Speech, Signal Process. , pp. 1117-1120
- Sharma, S.¹ Ellis, D.² Kajarekar, S.³ Jain, P.⁴ Hermansky, H.⁵

44
- 33750383209
- K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation
- DOI 10.1109/TSP.2006.881199
- M. Aharon, M. Elad, and A. M. Bruckstein, "K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation," IEEE Trans. Signal Process., vol. 54, no. 11, pp. 4311-4322, Nov. 2006. (Pubitemid 44637761)
- (2006) IEEE Transactions on Signal Processing , vol.54 , Issue.11 , pp. 4311-4322
- Aharon, M.¹ Elad, M.² Bruckstein, A.³

45
- 84882496252
- Separation of sound sources by convolutive sparse coding
- T. Virtanen, "Separation of sound sources by convolutive sparse coding," in Proc. ISCA Tutorial and Research Workshop Statist. Perceptual Audio Process., 2004.
- (2004) Proc. ISCA Tutorial and Research Workshop Statist. Perceptual Audio Process.
- Virtanen, T.¹

46
- 38049021850
- Convolutive speech bases and their application to supervised speech separation
- Jan
- P. Smaragdis, "Convolutive speech bases and their application to supervised speech separation," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 1, pp. 1-12, Jan. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.1 , pp. 1-12
- Smaragdis, P.¹

47
- 36348966695
- On the convergence of multiplicative update algorithms for nonnegative matrix factorization
- DOI 10.1109/TNN.2007.895831
- C.-J. Lin, "On the convergence of multiplicative update algorithms for nonnegative matrix factorization," IEEE Trans. Neural Netw., vol. 18, no. 6, pp. 1589-1596, Nov. 2007. (Pubitemid 350148414)
- (2007) IEEE Transactions on Neural Networks , vol.18 , Issue.6 , pp. 1589-1596
- Lin, C.-J.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.