Volume 21, Issue 3, 2013, Pages 624-635

MMSE-based missing-feature reconstruction with temporal modeling for robust speech recognition

Author keywords

Minimum mean square error estimation; missing feature; robust speech recognition; spectral reconstruction

Indexed keywords

BASELINE SYSTEMS; FEATURE COMPENSATION; GAUSSIAN MIXTURE MODEL; LOG-SPECTRAL DOMAIN; MINIMUM MEAN SQUARE ERROR ESTIMATIONS; MISSING-FEATURE; NOISE ROBUST SPEECH RECOGNITION; NOVEL TECHNIQUES; RECOGNITION PERFORMANCE; RECONSTRUCTION TECHNIQUES; ROBUST SPEECH RECOGNITION; SEQUENTIAL STRUCTURE; SPECTRAL RECONSTRUCTION; TEMPORAL CONSTRAINTS; TEMPORAL MODELING; TIME CORRELATIONS; TIME FREQUENCY;

EID: 84872188748     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2012.2229982     Document Type: Article
Times cited: 23

References (49)
  • 1
    • 0029288202
    • Speech recognition in noisy environments: A survey
    • Apr
    • Y. Gong, "Speech recognition in noisy environments: A survey," Speech Commun., vol. 16, no. 3, pp. 261-291, Apr. 1995.
    • (1995) Speech Commun , vol.16 , Issue.3 , pp. 261-291
    • Gong, Y.1
  • 4
    • 0035342414
    • Robust automatic speech recognition with missing and unreliable acoustic data
    • DOI 10.1016/S0167-6393(00)00034-0, PII S0167639300000340
    • M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data," Speech Commun., vol. 34, no. 3, pp. 267-285, Jun. 2001. (Pubitemid 32284867)
    • (2001) Speech Communication , vol.34 , Issue.3 , pp. 267-285
    • Cooke, M.1    Green, P.2    Josifovski, L.3    Vizinho, A.4
  • 5
    • 4644336054
    • Reconstruction of missing features for robust speech recognition
    • Sep
    • B. Raj, M. L. Seltzer, and R. M. Stern, "Reconstruction of missing features for robust speech recognition," Speech Commun., vol. 43, no. 4, pp. 275-296, Sep. 2004.
    • (2004) Speech Commun , vol.43 , Issue.4 , pp. 275-296
    • Raj, B.1    Seltzer, M.L.2    Stern, R.M.3
  • 7
    • 0029249228
    • Spectral redundancy: Intelligibility of sentences heard through narrow spectral slits
    • Feb
    • R. Warren, K. Riener, J. Bashford, and B. Brubaker, "Spectral redundancy: Intelligibility of sentences heard through narrow spectral slits," Percept. Psychophys., vol. 57, no. 2, pp. 175-182, Feb. 1995.
    • (1995) Percept. Psychophys , vol.57 , Issue.2 , pp. 175-182
    • Warren, R.1    Riener, K.2    Bashford, J.3    Brubaker, B.4
  • 8
    • 33644661135
    • A glimpsing model of speech perception in noise
    • DOI 10.1121/1.2166600
    • M. Cooke, "A glimpsing model of speech perception in noise," J. Acoust. Soc. Amer., vol. 119, no. 3, pp. 1562-1573, Mar. 2006. (Pubitemid 43326025)
    • (2006) Journal of the Acoustical Society of America , vol.119 , Issue.3 , pp. 1562-1573
    • Cooke, M.1
  • 9
    • 0025681008
    • Hidden Markov model decomposition of speech and noise
    • A. P. Varga and R. K. Moore, "Hidden Markov model decomposition of speech and noise," in Proc. ICASSP, 1990, pp. 845-848.
    • (1990) Proc. ICASSP , pp. 845-848
    • Varga, A.P.1    Moore, R.K.2
  • 10
    • 85009063707
    • Soft decisions in missing data techniques for robust automatic speech recognition
    • J. Barker, L. Josifovski, M. P. Cooke, and P. D. Green, "Soft decisions in missing data techniques for robust automatic speech recognition," in Proc. ICSLP, 2000, pp. 373-376.
    • (2000) Proc. ICSLP , pp. 373-376
    • Barker, J.1    Josifovski, L.2    Cooke, M.P.3    Green, P.D.4
  • 11
    • 34748817500
    • Exploiting correlogram structure for robust speech recognition with multiple speech sources
    • DOI 10.1016/j.specom.2007.05.003, PII S016763930700088X
    • N. Ma, P. Green, J. Barker, and A. Coy, "Exploiting correlogram structure for robust speech recognition with multiple speech sources," Speech Commun., vol. 49, no. 12, pp. 874-891, Dec. 2007. (Pubitemid 47488511)
    • (2007) Speech Communication , vol.49 , Issue.12 , pp. 874-891
    • Ma, N.1    Green, P.2    Barker, J.3    Coy, A.4
  • 13
    • 4644317224
    • A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition
    • Sep
    • M. L. Seltzer, B. Raj, and R. M. Stern, "A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition," Speech Commun., vol. 43, no. 4, pp. 379-393, Sep. 2004.
    • (2004) Speech Commun , vol.43 , Issue.4 , pp. 379-393
    • Seltzer, M.L.1    Raj, B.2    Stern, R.M.3
  • 14
    • 78149438339
    • A statistical approach to Mel-domain mask estimation for missing-feature ASR
    • Nov.
    • B. J. Borgström and A. Alwan, "A statistical approach to Mel-domain mask estimation for missing-feature ASR," IEEE Signal Process. Lett., vol. 17, no. 11, pp. 941-944, Nov. 2010.
    • (2010) IEEE Signal Process. Lett , vol.17 , Issue.11 , pp. 941-944
    • Borgström, B.J.1    Alwan, A.2
  • 15
    • 78649325568
    • Mask classification for missing-feature reconstruction for robust speech recognition in unknown background noise
    • Jan.
    • W. Kim and R. M. Stern, "Mask classification for missing-feature reconstruction for robust speech recognition in unknown background noise," Speech Commun., vol. 53, no. 1, pp. 1-11, Jan. 2011.
    • (2011) Speech Commun , vol.53 , Issue.1 , pp. 1-11
    • Kim, W.1    Stern, R.M.2
  • 16
    • 79959823862
    • Mask estimation in non-stationary noise environments for missing feature based robust speech recognition
    • S. Badiezadegan and R. C. Rose, "Mask estimation in non-stationary noise environments for missing feature based robust speech recognition," in Proc. Interspeech, 2010, pp. 2062-2065.
    • (2010) Proc. Interspeech , pp. 2062-2065
    • Badiezadegan, S.1    Rose, R.C.2
  • 17
    • 33847629729
    • On noise masking for automatic missing data speech recognition: A survey and discussion
    • DOI 10.1016/j.csl.2006.08.001, PII S0885230806000301
    • C. Cerisara, S. Demange, and J.-P. Haton, "On noise masking for automatic missing data speech recognition: A survey and discussion," Comput. Speech Lang., vol. 21, no. 3, pp. 443-457, July 2007. (Pubitemid 46367508)
    • (2007) Computer Speech and Language , vol.21 , Issue.3 , pp. 443-457
    • Cerisara, C.1    Demange, S.2    Haton, J.-P.3
  • 18
    • 11144316019
    • Decoding speech in the presence of other sources
    • DOI 10.1016/j.specom.2004.05.002, PII S0167639304000615
    • J. Barker, M. Cooke, and D. Ellis, "Decoding speech in the presence of other sources," Speech Commun., vol. 45, no. 1, pp. 5-25, Jan. 2005. (Pubitemid 40034706)
    • (2005) Speech Communication , vol.45 , Issue.1 , pp. 5-25
    • Barker, J.P.1    Cooke, M.P.2    Ellis, D.P.W.3
  • 19
    • 84856140165
    • Combining speech fragment decoding and adaptive noise floor modelling
    • Mar.
    • N. Ma, J. Barker, H. Christensen, and P. Green, "Combining speech fragment decoding and adaptive noise floor modelling," IEEE Trans. Audio Speech Lang. Process., vol. 20, no. 3, pp. 818-827, Mar. 2012.
    • (2012) IEEE Trans. Audio Speech Lang. Process , vol.20 , Issue.3 , pp. 818-827
    • Ma, N.1    Barker, J.2    Christensen, H.3    Green, P.4
  • 20
    • 85009128803
    • PROSPECT features and their application to missing data techniques for robust speech recognition
    • H. Van Hamme, "PROSPECT features and their application to missing data techniques for robust speech recognition," in Proc. Interspeech, 2004, pp. 101-104.
    • (2004) Proc. Interspeech , pp. 101-104
    • Van Hamme, H.1
  • 21
    • 77957739976
    • Advances in missing feature techniques for robust large-vocabulary continuous speech recognition
    • Jan.
    • M. Van Segbroeck and H. Van Hamme, "Advances in missing feature techniques for robust large-vocabulary continuous speech recognition," IEEE Trans. Audio Speech Lang. Process., vol. 19, no. 1, pp. 123-137, Jan. 2011.
    • (2011) IEEE Trans. Audio Speech Lang. Process , vol.19 , Issue.1 , pp. 123-137
    • Van Segbroeck, M.1    Van Hamme, H.2
  • 22
    • 77956506956
    • Missing-feature reconstruction by leveraging temporal spectral correlation for robust speech recognition in background noise conditions
    • Nov.
    • W. Kim and J. Hansen, "Missing-feature reconstruction by leveraging temporal spectral correlation for robust speech recognition in background noise conditions," IEEE Trans. Audio Speech Lang. Process., vol. 18, no. 8, pp. 2111-2120, Nov. 2010.
    • (2010) IEEE Trans. Audio Speech Lang. Process , vol.18 , Issue.8 , pp. 2111-2120
    • Kim, W.1    Hansen, J.2
  • 23
    • 33846190246
    • Reconstructing spectral vectors with uncertain spectrographic masks for robust speech recognition
    • B. Raj and R. Singh, "Reconstructing spectral vectors with uncertain spectrographic masks for robust speech recognition," in Proc. ASRU, 2005, pp. 65-70.
    • (2005) Proc. ASRU , pp. 65-70
    • Raj, B.1    Singh, R.2
  • 24
    • 77949695902
    • Compressive sensing for missing data imputation in noise robust speech recognition
    • Apr.
    • J. F. Gemmeke, H. Van Hamme, B. Cranen, and L. Boves, "Compressive sensing for missing data imputation in noise robust speech recognition," IEEE J. Sel. Topics Signal Process., vol. 4, no. 2, pp. 272-287, Apr. 2010.
    • (2010) IEEE J. Sel. Topics Signal Process , vol.4 , Issue.2 , pp. 272-287
    • Gemmeke, J.F.1    Van Hamme, H.2    Cranen, B.3    Boves, L.4
  • 25
    • 70349198435
    • Particle filter based soft-mask estimation for missing feature reconstruction
    • F. Faubel, H. Raja, J. McDonough, and D. Klakow, "Particle filter based soft-mask estimation for missing feature reconstruction," in Proc. IWAENC, 2008.
    • (2008) Proc. IWAENC
    • Faubel, F.1    Raja, H.2    McDonough, J.3    Klakow, D.4
  • 26
    • 70349226857
    • Bounded conditional mean imputation with Gaussian mixture models: A reconstruction approach to partly occluded features
    • F. Faubel, J. McDonough, and D. Klakow, "Bounded conditional mean imputation with Gaussian mixture models: A reconstruction approach to partly occluded features," in Proc. ICASSP, 2009, pp. 3869-3872.
    • (2009) Proc. ICASSP , pp. 3869-3872
    • Faubel, F.1    McDonough, J.2    Klakow, D.3
  • 27
    • 84867612282
    • Combining missing-data reconstruction and uncertainty decoding for robust speech recognition
    • J. A. González, A. M. Peinado, A. M. Gómez, N. Ma, and J. Barker, "Combining missing-data reconstruction and uncertainty decoding for robust speech recognition," in Proc. ICASSP, 2012, pp. 4693-4696.
    • (2012) Proc. ICASSP , pp. 4693-4696
    • González, J.A.1    Peinado, A.M.2    Gómez, A.M.3    Ma, N.4    Barker, J.5
  • 28
    • 0242721421
    • HMM-based channel error mitigation and its application to distributed speech recognition
    • Nov
    • A. M. Peinado, V. Sanchez, J. L. Perez-Cordoba, and A. de la Torre, "HMM-based channel error mitigation and its application to distributed speech recognition," Speech Commun., vol. 41, no. 4, pp. 549-561, Nov. 2003.
    • (2003) Speech Commun , vol.41 , Issue.4 , pp. 549-561
    • Peinado, A.M.1    Sanchez, V.2    Perez-Cordoba, J.L.3    De La Torre, A.4
  • 29
    • 85008009592
    • Efficient MMSE estimation and uncertainty processing for multienvironment robust speech recognition
    • Jul.
    • J. A. González, A. M. Peinado, A. M. Gomez, and J. L. Carmona, "Efficient MMSE estimation and uncertainty processing for multienvironment robust speech recognition," IEEE Trans. Audio Speech Lang. Process., vol. 19, no. 5, pp. 1206-1220, Jul. 2011.
    • (2011) IEEE Trans. Audio Speech Lang. Process , vol.19 , Issue.5 , pp. 1206-1220
    • González, J.A.1    Peinado, A.M.2    Gomez, A.M.3    Carmona, J.L.4
  • 31
    • 77955777921
    • HMM-based reconstruction of unreliable spectrographic data for noise robust speech recognition
    • Aug.
    • B. J. Borgström and A. Alwan, "HMM-based reconstruction of unreliable spectrographic data for noise robust speech recognition," IEEE Trans. Audio Speech Lang. Process., vol. 18, no. 6, pp. 1612-1623, Aug. 2010.
    • (2010) IEEE Trans. Audio Speech Lang. Process , vol.18 , Issue.6 , pp. 1612-1623
    • Borgström, B.J.1    Alwan, A.2
  • 32
    • 2442551863
    • Estimating cepstrum of speech under the presence of noise using a joint prior of static and dynamic features
    • May
    • L. Deng, J. Droppo, and A. Acero, "Estimating cepstrum of speech under the presence of noise using a joint prior of static and dynamic features," IEEE Trans. Speech Audio Process., vol. 12, no. 3, pp. 218-233, May 2004.
    • (2004) IEEE Trans. Speech Audio Process , vol.12 , Issue.3 , pp. 218-233
    • Deng, L.1    Droppo, J.2    Acero, A.3
  • 33
    • 65549153550
    • Ph.D. dissertation Dept. of Elect. and Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA
    • P. Moreno, "Speech recognition in noisy environments," Ph.D. dissertation, Dept. of Elect. and Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, 1996.
    • (1996) Speech Recognition in Noisy Environments
    • Moreno, P.1
  • 35
    • 33750376174
    • Model-based feature enhancement with uncertainty decoding for noise robust ASR
    • DOI 10.1016/j.specom.2005.12.006, PII S0167639306000057
    • V. Stouten, H. Van Hamme, and P. Wambacq, "Model-based feature enhancement with uncertainty decoding for noise robust ASR," Speech Commun., vol. 48, no. 11, pp. 1502-1514, Nov. 2006. (Pubitemid 44634766)
    • (2006) Speech Communication , vol.48 , Issue.11 , pp. 1502-1514
    • Stouten, V.1    Van Hamme, H.2    Wambacq, P.3
  • 37
    • 85009230793
    • Factorial models and refiltering for speech separation and denoising
    • S. T. Roweis, "Factorial models and refiltering for speech separation and denoising," in Proc. EUROSPEECH, 2003, pp. 1009-1012.
    • (2003) Proc. EUROSPEECH , pp. 1009-1012
    • Roweis, S.T.1
  • 38
    • 85032751986
    • Single-channel multitalker speech recognition
    • Nov.
    • S. J. Rennie, J. R. Hershey, and P. A. Olsen, "Single-channel multitalker speech recognition," IEEE Signal Process. Mag., vol. 27, no. 6, pp. 66-80, Nov. 2010.
    • (2010) IEEE Signal Process. Mag , vol.27 , Issue.6 , pp. 66-80
    • Rennie, S.J.1    Hershey, J.R.2    Olsen, P.A.3
  • 39
    • 0001341675
    • Numerical computation of multivariate normal probabilities
    • Jun
    • A. Genz, "Numerical computation of multivariate normal probabilities," J. Comput. Graph. Stat., vol. 1, no. 2, pp. 141-149, Jun. 1992.
    • (1992) J. Comput. Graph. Stat , vol.1 , Issue.2 , pp. 141-149
    • Genz, A.1
  • 40
    • 79959843445
    • On using missing-feature theory with cepstral features-Approximations to the multivariate integral
    • F. Seide and P. Zhao, "On using missing-feature theory with cepstral features-Approximations to the multivariate integral," in Proc. Interspeech, 2010, pp. 2094-2097.
    • (2010) Proc. Interspeech , pp. 2094-2097
    • Seide, F.1    Zhao, P.2
  • 41
    • 84878524161
    • Log-spectral feature reconstruction based on an occlusion model for noise robust speech recognition
    • J. A. González, A. M. Peinado, A. M. Gómez, and N. Ma, "Log-spectral feature reconstruction based on an occlusion model for noise robust speech recognition," in Proc. Interspeech, 2012.
    • (2012) Proc. Interspeech
    • González, J.A.1    Peinado, A.M.2    Gómez, A.M.3    Ma, N.4
  • 42
    • 84987702417
    • The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
    • H. G. Hirsch and D. Pearce, "The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions," in Proc. ICSLP, 2000, pp. 29-32.
    • (2000) Proc. ICSLP , pp. 29-32
    • Hirsch, H.G.1    Pearce, D.2
  • 44
    • 0024610919
    • A tutorial on hidden Markov models and selected applications in speech recognition
    • Feb
    • L. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, Feb. 1989.
    • (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-286
    • Rabiner, L.1
  • 45
    • 80051625382
    • Experimental framework for the performance evaluation of speech recognition front-ends of large vocabulary task
    • H. G. Hirsch, "Experimental framework for the performance evaluation of speech recognition front-ends of large vocabulary task," AURORA DSR Working Group, Tech. Rep., STQ, 2002.
    • (2002) AURORA DSR Working Group, Tech. Rep., STQ
    • Hirsch, H.G.1
  • 47
    • 0023921973
    • Segmental durations in connected-speech signals: Current results
    • Apr
    • T. H. Crystal and A. S. House, "Segmental durations in connected-speech signals: Current results," J. Acoust. Soc. Amer., vol. 83, no. 4, pp. 1553-1573, Apr. 1988.
    • (1988) J. Acoust. Soc. Amer , vol.83 , Issue.4 , pp. 1553-1573
    • Crystal, T.H.1    House, A.S.2


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.