SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 21, Issue 3, 2013, Pages 624-635

MMSE-based missing-feature reconstruction with temporal modeling for robust speech recognition

(5) González, José A. (a); Peinado, Antonio M. (a); Ma, Ning (b); Gómez, Angel M. (a); Barker, Jon (c)

a UNIVERSITY OF GRANADA (Spain)

b MRC INSTITUTE OF HEARING RESEARCH (United Kingdom)

c UNIVERSITY OF SHEFFIELD (United Kingdom)

Author keywords

Minimum mean square error estimation; missing feature; robust speech recognition; spectral reconstruction

Indexed keywords

BASELINE SYSTEMS; FEATURE COMPENSATION; GAUSSIAN MIXTURE MODEL; LOG-SPECTRAL DOMAIN; MINIMUM MEAN SQUARE ERROR ESTIMATIONS; MISSING-FEATURE; NOISE ROBUST SPEECH RECOGNITION; NOVEL TECHNIQUES; RECOGNITION PERFORMANCE; RECONSTRUCTION TECHNIQUES; ROBUST SPEECH RECOGNITION; SEQUENTIAL STRUCTURE; SPECTRAL RECONSTRUCTION; TEMPORAL CONSTRAINTS; TEMPORAL MODELING; TIME CORRELATIONS; TIME FREQUENCY;

FREQUENCY BANDS; HIDDEN MARKOV MODELS;

SPEECH RECOGNITION;

EID: 84872188748 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2012.2229982 Document Type: Article

Times cited: (23)

References (49)

1
- 0029288202
- Speech recognition in noisy environments: A survey
- Apr
- Y. Gong, "Speech recognition in noisy environments: A survey," Speech Commun., vol. 16, no. 3, pp. 261-291, Apr. 1995.
- (1995) Speech Commun , vol.16 , Issue.3 , pp. 261-291
- Gong, Y.¹

2
- 85032751593
- Research developments and directions in speech recognition and understanding, Part 1
- May
- J. M. Baker, L. Deng, J. Glass, S. Khudanpur, C.-H. Lee, N. Morgan, and D. O'Shaughnessy, "Research developments and directions in speech recognition and understanding, Part 1," IEEE Signal Process. Mag., vol. 26, no. 3, pp. 75-80, May 2009.
- (2009) IEEE Signal Process. Mag , vol.26 , Issue.3 , pp. 75-80
- Baker, J.M.¹ Deng, L.² Glass, J.³ Khudanpur, S.⁴ Lee, C.-H.⁵ Morgan, N.⁶ O'Shaughnessy, D.⁷

3
- 85032759066
- Updated MINDS report on speech recognition and understanding, Part 2
- Jul
- J. M. Baker, L. Deng, S. Khudanpur, C.-H. Lee, J. Glass, N. Morgan, and D. O'Shaughnessy, "Updated MINDS report on speech recognition and understanding, Part 2," IEEE Signal Process. Mag., vol. 26, no. 4, pp. 78-85, Jul. 2009.
- (2009) IEEE Signal Process. Mag , vol.26 , Issue.4 , pp. 78-85
- Baker, J.M.¹ Deng, L.² Khudanpur, S.³ Lee, C.-H.⁴ Glass, J.⁵ Morgan, N.⁶ O'Shaughnessy, D.⁷

4
- 0035342414
- Robust automatic speech recognition with missing and unreliable acoustic data
- DOI 10.1016/S0167-6393(00)00034-0, PII S0167639300000340
- M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data," Speech Commun., vol. 34, no. 3, pp. 267-285, Jun. 2001. (Pubitemid 32284867)
- (2001) Speech Communication , vol.34 , Issue.3 , pp. 267-285
- Cooke, M.¹ Green, P.² Josifovski, L.³ Vizinho, A.⁴

5
- 4644336054
- Reconstruction of missing features for robust speech recognition
- Sep
- B. Raj, M. L. Seltzer, and R. M. Stern, "Reconstruction of missing features for robust speech recognition," Speech Commun., vol. 43, no. 4, pp. 275-296, Sep. 2004.
- (2004) Speech Commun , vol.43 , Issue.4 , pp. 275-296
- Raj, B.¹ Seltzer, M.L.² Stern, R.M.³

6
- 0003549684
- New York: Van Nostrand
- H. Fletcher, Speech and Hearing in Communication. New York: Van Nostrand, 1953.
- (1953) Speech and Hearing in Communication
- Fletcher, H.¹

7
- 0029249228
- Spectral redundancy: Intelligibility of sentences heard through narrow spectral slits
- Feb
- R. Warren, K. Riener, J. Bashford, and B. Brubaker, "Spectral redundancy: Intelligibility of sentences heard through narrow spectral slits," Percept. Psychophys., vol. 57, no. 2, pp. 175-182, Feb. 1995.
- (1995) Percept. Psychophys , vol.57 , Issue.2 , pp. 175-182
- Warren, R.¹ Riener, K.² Bashford, J.³ Brubaker, B.⁴

8
- 33644661135
- A glimpsing model of speech perception in noise
- DOI 10.1121/1.2166600
- M. Cooke, "A glimpsing model of speech perception in noise," J. Acoust. Soc. Amer., vol. 119, no. 3, pp. 1562-1573, Mar. 2006. (Pubitemid 43326025)
- (2006) Journal of the Acoustical Society of America , vol.119 , Issue.3 , pp. 1562-1573
- Cooke, M.¹

9
- 0025681008
- Hidden Markov model decomposition of speech and noise
- A. P. Varga and R. K. Moore, "Hidden Markov model decomposition of speech and noise," in Proc. ICASSP, 1990, pp. 845-848.
- (1990) Proc. ICASSP , pp. 845-848
- Varga, A.P.¹ Moore, R.K.²

10
- 85009063707
- Soft decisions in missing data techniques for robust automatic speech recognition
- J. Barker, L. Josifovski, M. P. Cooke, and P. D. Green, "Soft decisions in missing data techniques for robust automatic speech recognition," in Proc. ICSLP, 2000, pp. 373-376.
- (2000) Proc. ICSLP , pp. 373-376
- Barker, J.¹ Josifovski, L.² Cooke, M.P.³ Green, P.D.⁴

11
- 34748817500
- Exploiting correlogram structure for robust speech recognition with multiple speech sources
- DOI 10.1016/j.specom.2007.05.003, PII S016763930700088X
- N. Ma, P. Green, J. Barker, and A. Coy, "Exploiting correlogram structure for robust speech recognition with multiple speech sources," Speech Commun., vol. 49, no. 12, pp. 874-891, Dec. 2007. (Pubitemid 47488511)
- (2007) Speech Communication , vol.49 , Issue.12 , pp. 874-891
- Ma, N.¹ Green, P.² Barker, J.³ Coy, A.⁴

12
- 80051662480
- A pitch based noise estimation technique for robust speech recognition with missing data
- J. A. Morales-Cordovilla, N. Ma, V. E. Sanchez, J. L. Carmona, A. M. Peinado, and J. Barker, "A pitch based noise estimation technique for robust speech recognition with missing data," in Proc. ICASSP, 2011, pp. 4808-4811.
- (2011) Proc. ICASSP , pp. 4808-4811
- Morales-Cordovilla, J.A.¹ Ma, N.² Sanchez, V.E.³ Carmona, J.L.⁴ Peinado, A.M.⁵ Barker, J.⁶

13
- 4644317224
- A bayesian classifier for spectrographic mask estimation for missing feature speech recognition
- Sep
- M. L. Seltzer, B. Raj, and R. M. Stern, "A bayesian classifier for spectrographic mask estimation for missing feature speech recognition," Speech Commun., vol. 43, no. 4, pp. 379-393, Sep. 2004.
- (2004) Speech Commun , vol.43 , Issue.4 , pp. 379-393
- Seltzer, M.L.¹ Raj, B.² Stern, R.M.³

14
- 78149438339
- A statistical approach to Mel-domain mask estimation for missing-feature ASR
- Nov.
- B. J. Borgström and A. Alwan, "A statistical approach to Mel-domain mask estimation for missing-feature ASR," IEEE Signal Process. Lett., vol. 17, no. 11, pp. 941-944, Nov. 2010.
- (2010) IEEE Signal Process. Lett , vol.17 , Issue.11 , pp. 941-944
- Borgström, B.J.¹ Alwan, A.²

15
- 78649325568
- Mask classification for missing-feature reconstruction for robust speech recognition in unknown background noise
- Jan.
- W. Kim and R. M. Stern, "Mask classification for missing-feature reconstruction for robust speech recognition in unknown background noise," Speech Commun., vol. 53, no. 1, pp. 1-11, Jan. 2011.
- (2011) Speech Commun , vol.53 , Issue.1 , pp. 1-11
- Kim, W.¹ Stern, R.M.²

16
- 79959823862
- Mask estimation in non-stationary noise environments for missing feature based robust speech recognition
- S. Badiezadegan and R. C. Rose, "Mask estimation in non-stationary noise environments for missing feature based robust speech recognition," in Proc. Interspeech, 2010, pp. 2062-2065.
- (2010) Proc. Interspeech , pp. 2062-2065
- Badiezadegan, S.¹ Rose, R.C.²

17
- 33847629729
- On noise masking for automatic missing data speech recognition: A survey and discussion
- DOI 10.1016/j.csl.2006.08.001, PII S0885230806000301
- C. Cerisara, S. Demange, and J.-P. Haton, "On noise masking for automatic missing data speech recognition: A survey and discussion," Comput. Speech Lang., vol. 21, no. 3, pp. 443-457, July 2007. (Pubitemid 46367508)
- (2007) Computer Speech and Language , vol.21 , Issue.3 , pp. 443-457
- Cerisara, C.¹ Demange, S.² Haton, J.-P.³

18
- 11144316019
- Decoding speech in the presence of other sources
- DOI 10.1016/j.specom.2004.05.002, PII S0167639304000615
- J. Barker, M. Cooke, and D. Ellis, "Decoding speech in the presence of other sources," Speech Commun., vol. 45, no. 1, pp. 5-25, Jan. 2005. (Pubitemid 40034706)
- (2005) Speech Communication , vol.45 , Issue.1 , pp. 5-25
- Barker, J.P.¹ Cooke, M.P.² Ellis, D.P.W.³

19
- 84856140165
- Combining speech fragment decoding and adaptive noise floor modelling
- Mar.
- N. Ma, J. Barker, H. Christensen, and P. Green, "Combining speech fragment decoding and adaptive noise floor modelling," IEEE Trans. Audio Speech Lang. Process., vol. 20, no. 3, pp. 818-827, Mar. 2012.
- (2012) IEEE Trans. Audio Speech Lang. Process , vol.20 , Issue.3 , pp. 818-827
- Ma, N.¹ Barker, J.² Christensen, H.³ Green, P.⁴

20
- 85009128803
- PROSPECT features and their application to missing data techniques for robust speech recognition
- H. Van Hamme, "PROSPECT features and their application to missing data techniques for robust speech recognition," in Proc. Interspeech, 2004, pp. 101-104.
- (2004) Proc. Interspeech , pp. 101-104
- Van Hamme, H.¹

21
- 77957739976
- Advances in missing feature techniques for robust large-vocabulary continuous speech recognition
- Jan.
- M. Van Segbroeck and H. Van Hamme, "Advances in missing feature techniques for robust large-vocabulary continuous speech recognition," IEEE Trans. Audio Speech Lang. Process., vol. 19, no. 1, pp. 123-137, Jan. 2011.
- (2011) IEEE Trans. Audio Speech Lang. Process , vol.19 , Issue.1 , pp. 123-137
- Van Segbroeck, M.¹ Van Hamme, H.²

22
- 77956506956
- Missing-feature reconstruction by leveraging temporal spectral correlation for robust speech recognition in background noise conditions
- Nov.
- W. Kim and J. Hansen, "Missing-feature reconstruction by leveraging temporal spectral correlation for robust speech recognition in background noise conditions," IEEE Trans. Audio Speech Lang. Process., vol. 18, no. 8, pp. 2111-2120, Nov. 2010.
- (2010) IEEE Trans. Audio Speech Lang. Process , vol.18 , Issue.8 , pp. 2111-2120
- Kim, W.¹ Hansen, J.²

23
- 33846190246
- Reconstructing spectral vectors with uncertain spectrographic masks for robust speech recognition
- B. Raj and R. Singh, "Reconstructing spectral vectors with uncertain spectrographic masks for robust speech recognition," in Proc. ASRU, 2005, pp. 65-70.
- (2005) Proc. ASRU , pp. 65-70
- Raj, B.¹ Singh, R.²

24
- 77949695902
- Compressive sensing for missing data imputation in noise robust speech recognition
- Apr.
- J. F. Gemmeke, H. Van Hamme, B. Cranen, and L. Boves, "Compressive sensing for missing data imputation in noise robust speech recognition," IEEE J. Sel. Topics Signal Process., vol. 4, no. 2, pp. 272-287, Apr. 2010.
- (2010) IEEE J. Sel. Topics Signal Process , vol.4 , Issue.2 , pp. 272-287
- Gemmeke, J.F.¹ Van Hamme, H.² Cranen, B.³ Boves, L.⁴

25
- 70349198435
- Particle filter based soft-mask estimation for missing feature reconstruction
- F. Faubel, H. Raja, J. McDonough, and D. Klakow, "Particle filter based soft-mask estimation for missing feature reconstruction," in Proc. IWAENC, 2008.
- (2008) Proc. IWAENC
- Faubel, F.¹ Raja, H.² McDonough, J.³ Klakow, D.⁴

26
- 70349226857
- Bounded conditional mean imputation with Gaussian mixture models: A reconstruction approach to partly occluded features
- F. Faubel, J. McDonough, and D. Klakow, "Bounded conditional mean imputation with Gaussian mixture models: A reconstruction approach to partly occluded features," in Proc. ICASSP, 2009, pp. 3869-3872.
- (2009) Proc. ICASSP , pp. 3869-3872
- Faubel, F.¹ McDonough, J.² Klakow, D.³

27
- 84867612282
- Combining missing-data reconstruction and uncertainty decoding for robust speech recognition
- J. A. González, A. M. Peinado, A. M. Gómez, N. Ma, and J. Barker, "Combining missing-data reconstruction and uncertainty decoding for robust speech recognition," in Proc. ICASSP, 2012, pp. 4693-4696.
- (2012) Proc. ICASSP , pp. 4693-4696
- González, J.A.¹ Peinado, A.M.² Gómez, A.M.³ Ma, N.⁴ Barker, J.⁵

28
- 0242721421
- HMM-based channel error mitigation and its application to distributed speech recognition
- Nov
- A. M. Peinado, V. Sanchez, J. L. Perez-Cordoba, and A. de la Torre, "HMM-based channel error mitigation and its application to distributed speech recognition," Speech Commun., vol. 41, no. 4, pp. 549-561, Nov. 2003.
- (2003) Speech Commun , vol.41 , Issue.4 , pp. 549-561
- Peinado, A.M.¹ Sanchez, V.² Perez-Cordoba, J.L.³ De La Torre, A.⁴

29
- 85008009592
- Efficient MMSE estimation and uncertainty processing for multienvironment robust speech recognition
- Jul.
- J. A. González, A. M. Peinado, A. M. Gomez, and J. L. Carmona, "Efficient MMSE estimation and uncertainty processing for multienvironment robust speech recognition," IEEE Trans. Audio Speech Lang. Process., vol. 19, no. 5, pp. 1206-1220, Jul. 2011.
- (2011) IEEE Trans. Audio Speech Lang. Process , vol.19 , Issue.5 , pp. 1206-1220
- González, J.A.¹ Peinado, A.M.² Gomez, A.M.³ Carmona, J.L.⁴

30
- 77955829273
- MMSE-based packet loss concealment for CELP-coded speech recognition
- Aug.
- J. L. Carmona, A. M. Peinado, J. L. Perez-Cordoba, and A. M. Gomez, "MMSE-based packet loss concealment for CELP-coded speech recognition," IEEE Trans. Audio Speech Lang. Process., vol. 18, no. 6, pp. 1341-1353, Aug. 2010.
- (2010) IEEE Trans. Audio Speech Lang. Process , vol.18 , Issue.6 , pp. 1341-1353
- Carmona, J.L.¹ Peinado, A.M.² Perez-Cordoba, J.L.³ Gomez, A.M.⁴

31
- 77955777921
- HMM-based reconstruction of unreliable spectrographic data for noise robust speech recognition
- Aug.
- B. J. Borgström and A. Alwan, "HMM-based reconstruction of unreliable spectrographic data for noise robust speech recognition," IEEE Trans. Audio Speech Lang. Process., vol. 18, no. 6, pp. 1612-1623, Aug. 2010.
- (2010) IEEE Trans. Audio Speech Lang. Process , vol.18 , Issue.6 , pp. 1612-1623
- Borgström, B.J.¹ Alwan, A.²

32
- 2442551863
- Estimating cepstrum of speech under the presence of noise using a joint prior of static and dynamic features
- May
- L. Deng, J. Droppo, and A. Acero, "Estimating cepstrum of speech under the presence of noise using a joint prior of static and dynamic features," IEEE Trans. Speech Audio Process., vol. 12, no. 3, pp. 218-233, May 2004.
- (2004) IEEE Trans. Speech Audio Process , vol.12 , Issue.3 , pp. 218-233
- Deng, L.¹ Droppo, J.² Acero, A.³

33
- 65549153550
- Ph.D. dissertation Dept. of Elect. and Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA
- P. Moreno, "Speech recognition in noisy environments," Ph.D. dissertation, Dept. of Elect. and Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, 1996.
- (1996) Speech Recognition in Noisy Environments
- Moreno, P.¹

34
- 0004319970
- Norwell, MA: Kluwer
- A. Acero, Acoustical and Environmental Robustness in Automatic Speech Recognition. Norwell, MA: Kluwer, 1993.
- (1993) Acoustical and Environmental Robustness in Automatic Speech Recognition
- Acero, A.¹

35
- 33750376174
- Model-based feature enhancement with uncertainty decoding for noise robust ASR
- DOI 10.1016/j.specom.2005.12.006, PII S0167639306000057
- V. Stouten, H. V. Hamme, and P. Wambacq, "Model-based feature enhancement with uncertainty decoding for noise robust ASR," Speech Commun., vol. 48, no. 11, pp. 1502-1514, Nov. 2006. (Pubitemid 44634766)
- (2006) Speech Communication , vol.48 , Issue.11 , pp. 1502-1514
- Stouten, V.¹ Van Hamme, H.² Wambacq, P.³

36
- 84898946024
- One microphone source separation
- S. T. Roweis, "One microphone source separation," in Proc. Neural Inf. Process. Syst., 2001, pp. 793-799.
- (2001) Proc. Neural Inf. Process. Syst , pp. 793-799
- Roweis, S.T.¹

37
- 85009230793
- Factorial models and refiltering for speech separation and denoising
- S. T. Roweis, "Factorial models and refiltering for speech separation and denoising," in Proc. EUROSPEECH, 2003, pp. 1009-1012.
- (2003) Proc. EUROSPEECH , pp. 1009-1012
- Roweis, S.T.¹

38
- 85032751986
- Single-channel multitalker speech recognition
- Nov.
- S. J. Rennie, J. R. Hershey, and P. A. Olsen, "Single-channel multitalker speech recognition," IEEE Signal Process. Mag., vol. 27, no. 6, pp. 66-80, Nov. 2010.
- (2010) IEEE Signal Process. Mag , vol.27 , Issue.6 , pp. 66-80
- Rennie, S.J.¹ Hershey, J.R.² Olsen, P.A.³

39
- 0001341675
- Numerical computation of multivariate normal probabilities
- Jun
- A. Genz, "Numerical computation of multivariate normal probabilities," J. Comput. Graph. Stat., vol. 1, no. 2, pp. 141-149, Jun. 1992.
- (1992) J. Comput. Graph. Stat , vol.1 , Issue.2 , pp. 141-149
- Genz, A.¹

40
- 79959843445
- On using missing-feature theory with cepstral features-Approximations to the multivariate integral
- F. Seide and P. Zhao, "On using missing-feature theory with cepstral features-Approximations to the multivariate integral," in Proc. Interspeech, 2010, pp. 2094-2097.
- (2010) Proc. Interspeech , pp. 2094-2097
- Seide, F.¹ Zhao, P.²

41
- 84878524161
- Log-spectral feature reconstruction based on an occlusion model for noise robust speech recognition
- J. A. González, A. M. Peinado, A. M. Gómez, and N. Ma, "Log-spectral feature reconstruction based on an occlusion model for noise robust speech recognition," in Proc. Interspeech, 2012.
- (2012) Proc. Interspeech
- González, J.A.¹ Peinado, A.M.² Gómez, A.M.³ Ma, N.⁴

42
- 84987702417
- The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
- H. G. Hirsch and D. Pearce, "The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions," in Proc. ICSLP, 2000, pp. 29-32.
- (2000) Proc. ICSLP , pp. 29-32
- Hirsch, H.G.¹ Pearce, D.²

43
- 0003946510
- New York: Springer-Verlag
- I. T. Jolliffe, Principal Component Analysis. New York: Springer-Verlag, 2002.
- (2002) Principal Component Analysis
- Jolliffe, I.T.¹

44
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- Feb
- L. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, Feb. 1989.
- (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-286
- Rabiner, L.¹

45
- 80051625382
- Experimental framework for the performance evaluation of speech recognition front-ends of large vocabulary task
- H. G. Hirsch, "Experimental framework for the performance evaluation of speech recognition front-ends of large vocabulary task," AURORA DSR Working Group, Tech. Rep., STQ, 2002.
- (2002) AURORA DSR Working Group, Tech. Rep., STQ
- Hirsch, H.G.¹

46
- 0013251345
- Std
- ETSI ES 201 108-Distributed Speech Recognition; Front-End Feature Extraction Algorithm; Compression Algorithms, Std., 2000.
- (2000) ETSI ES 201 108-Distributed Speech Recognition; Front-End Feature Extraction Algorithm; Compression Algorithms

47
- 0023921973
- Segmental durations in connected-speech signals: Current results
- Apr
- T. H. Crystal and A. S. House, "Segmental durations in connected-speech signals: Current results," J. Acoust. Soc. Amer., vol. 83, no. 4, pp. 1553-1573, Apr. 1988.
- (1988) J. Acoust. Soc. Amer , vol.83 , Issue.4 , pp. 1553-1573
- Crystal, T.H.¹ House, A.S.²

48
- 84872173300
- Speech Processing, Transmission and Quality Aspects (STQ); ETSI 202 050 v1.1.4 Std., Nov.
- Speech Processing, Transmission and Quality Aspects (STQ); Distributed speech recognition; Advanced front-end feature extraction algorithm; Compression algorithms, ETSI 202 050 v1.1.4 Std., Nov. 2005.
- (2005) Distributed Speech Recognition; Advanced Front-end Feature Extraction Algorithm; Compression Algorithms

49
- 0003446320
- New York: Wiley
- N. L. Johnson, S. Kotz, and N. Balakrishnan, Continuous Univariate Distributions. New York: Wiley, 1994, vol. 1.
- (1994) Continuous Univariate Distributions , vol.1
- Johnson, N.L.¹ Kotz, S.² Balakrishnan, N.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.