Volume 18, Issue 6, 2010, Pages 1612-1623

HMM-based reconstruction of unreliable spectrographic data for noise robust speech recognition

Author keywords

Hidden Markov models (HMMs); mask estimation; missing features (MFs); noise robust speech recognition; spectral reconstruction

Indexed keywords

BASELINE SYSTEMS; DECODING METHODS; DOWNSAMPLING; ESTIMATION METHODS; LOWER RESOLUTION; MEL-FREQUENCY CEPSTRAL COEFFICIENTS; MEMORY REQUIREMENTS; MINIMUM MEAN-SQUARE ERROR; MISSING FEATURES (MFS); MODEL PARAMETERS; NOISE ROBUST SPEECH RECOGNITION; PERFORMANCE BOUNDS; QUANTIZERS; REALISTIC SCENARIO; RECONSTRUCTION ALGORITHMS; RECONSTRUCTION METHOD; ROBUST RECOGNITION; SPECTRAL MAGNITUDES; SPECTRAL RECONSTRUCTION; SPECTROGRAPHIC DATA; SPEECH DATA; STRUCTURED MAPPINGS;

EID: 77955777921     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2009.2038811     Document Type: Article
Times cited: 19

References (41)
  • 3
    • 0032141206
    • Cepstral domain segmental feature vector normalization for noise robust speech recognition
    • O. Viikki and K. Laurila, "Cepstral domain segmental feature vector normalization for noise robust speech recognition," Speech Commun., vol.25, pp. 133-147, 1998.
    • (1998) Speech Commun. , vol.25 , pp. 133-147
    • Viikki, O.1    Laurila, K.2
  • 4
    • 0142009990
    • Non-linear feature extraction for robust recognition in stationary and non-stationary noise
    • Q. Zhu and A. Alwan, "Non-linear feature extraction for robust recognition in stationary and non-stationary noise," Computer, Speech, Lang., vol.17, no.4, pp. 381-402, 2003.
    • (2003) Computer, Speech, Lang. , vol.17 , Issue.4 , pp. 381-402
    • Zhu, Q.1    Alwan, A.2
  • 5
    • 85032752225
    • Missing feature approaches in speech recognition
    • B. Raj and R. Stern, "Missing feature approaches in speech recognition," IEEE Signal Process. Mag., vol.22, no.5, pp. 101-116, Sep. 2005.
    • (2005) IEEE Signal Process. Mag. , vol.22 , Issue.5 , pp. 101-116
    • Raj, B.1    Stern, R.2
  • 6
    • 0030671924
    • Missing data techniques for robust speech recognition
    • M. P. Cooke, A. Morris, and P. D. Green, "Missing data techniques for robust speech recognition," in Proc. ICASSP, 1997, vol.2, pp. 863-866.
    • (1997) Proc. ICASSP , vol.2 , pp. 863-866
    • Cooke, M.P.1    Morris, A.2    Green, P.D.3
  • 7
    • 85009063707
    • Soft decisions in missing feature data techniques for robust automatic speech recognition
    • J. Barker, L. Josifovski, M. Cooke, and P. Green, "Soft decisions in missing feature data techniques for robust automatic speech recognition," in Proc. ICSLP, 2000, pp. 373-376.
    • (2000) Proc. ICSLP , pp. 373-376
    • Barker, J.1    Josifovski, L.2    Cooke, M.3    Green, P.4
  • 8
    • 0035342414
    • Robust automatic speech recognition with missing and unreliable acoustic data
    • M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data," Speech Commun., vol.34, pp. 267-285, 2001.
    • (2001) Speech Commun. , vol.34 , pp. 267-285
    • Cooke, M.1    Green, P.2    Josifovski, L.3    Vizinho, A.4
  • 9
    • 33750291256
    • Uncertainty decoding for distributed speech recognition over error-prone networks
    • V. Ion and R. Haeb-Umbach, "Uncertainty decoding for distributed speech recognition over error-prone networks," Speech Commun., vol.48, pp. 1435-1446, 2006.
    • (2006) Speech Commun. , vol.48 , pp. 1435-1446
    • Ion, V.1    Haeb-Umbach, R.2
  • 11
    • 4644336054
    • Reconstruction of missing features for robust speech recognition
    • B. Raj, M. L. Seltzer, and R. M. Stern, "Reconstruction of missing features for robust speech recognition," Speech Commun., vol.43, pp. 275-296, 2004.
    • (2004) Speech Commun. , vol.43 , pp. 275-296
    • Raj, B.1    Seltzer, M.L.2    Stern, R.M.3
  • 12
    • 0024610919
    • A tutorial on hidden Markov models and selected applications in speech recognition
    • L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol.77, no.2, pp. 257-286, Feb. 1989.
    • (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-286
    • Rabiner, L.R.1
  • 13
    • 0015143526
    • Convolutional codes and their performance in communication systems
    • A. Viterbi, "Convolutional codes and their performance in communication systems," IEEE Trans. Commun., vol.COM-19, no.5, pt. 1, pp. 751-772, Oct. 1971.
    • (1971) IEEE Trans. Commun. , vol.COM-19 , Issue.5 PART 1 , pp. 751-772
    • Viterbi, A.1
  • 17
    • 0242721421
    • HMM-based channel error mitigation and its application to distributed speech recognition
    • A. M. Peinado, V. Sanchez, J. L. Perez-Cordoba, and A. de la Torre, "HMM-based channel error mitigation and its application to distributed speech recognition," Speech Commun., vol.41/4, pp. 549-561, Nov. 2003.
    • (2003) Speech Commun. , vol.41 , Issue.4 , pp. 549-561
    • Peinado, A.M.1    Sanchez, V.2    Perez-Cordoba, J.L.3    De La Torre, A.4
  • 19
    • 51449120334
    • An efficient approximation of the forward-backward algorithm to deal with packet loss, with applications to remote speech recognition
    • B. J. Borgstrom and A. Alwan, "An efficient approximation of the forward-backward algorithm to deal with packet loss, with applications to remote speech recognition," in Proc. ICASSP, 2008, pp. 4425-4428.
    • (2008) Proc. ICASSP , pp. 4425-4428
    • Borgstrom, B.J.1    Alwan, A.2
  • 20
    • 84867196386
    • HMM-based estimation of unreliable spectral components for noise robust speech recognition
    • B. J. Borgstrom and A. Alwan, "HMM-based estimation of unreliable spectral components for noise robust speech recognition," in Proc. Interspeech, 2008, pp. 1769-1772.
    • (2008) Proc. Interspeech , pp. 1769-1772
    • Borgstrom, B.J.1    Alwan, A.2
  • 21
    • 19944382585
    • Enabling new speech driven services for mobile devices: An overview of the ETSI standards activities for distributed speech recognition front-ends
    • D. Pearce, "Enabling new speech driven services for mobile devices: An overview of the ETSI standards activities for distributed speech recognition front-ends," in Proc. AVIOS 2000: Speech Appl. Conf., May 2000, vol.5, pp. 1-6.
    • (2000) Proc. AVIOS 2000: Speech Appl. Conf. , vol.5 , pp. 1-6
    • Pearce, D.1
  • 22
    • 0035396555
    • Noise power spectral density estimation based on optimal smoothing and minimum statistics
    • R. Martin, "Noise power spectral density estimation based on optimal smoothing and minimum statistics," IEEE Trans. Speech Audio Process., vol.9, no.5, pp. 504-512, Jul. 2001.
    • (2001) IEEE Trans. Speech Audio Process. , vol.9 , Issue.5 , pp. 504-512
    • Martin, R.1
  • 23
    • 0041360463
    • Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging
    • I. Cohen, "Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging," IEEE Trans. Speech Audio Process., vol.11, no.5, pp. 466-475, Sep. 2003.
    • (2003) IEEE Trans. Speech Audio Process. , vol.11 , Issue.5 , pp. 466-475
    • Cohen, I.1
  • 24
    • 4644317224
    • A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition
    • M. L. Seltzer, B. Raj, and R. M. Stern, "A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition," Speech Commun., vol.43, pp. 379-393, 2004.
    • (2004) Speech Commun. , vol.43 , pp. 379-393
    • Seltzer, M.L.1    Raj, B.2    Stern, R.M.3
  • 25
    • 66149130450
    • Multi-resolution soft-features for channel-robust distributed speech recognition
    • V. Ion and R. Haeb-Umbach, "Multi-resolution soft-features for channel-robust distributed speech recognition," in Proc. Interspeech, 2007, pp. 594-597.
    • (2007) Proc. Interspeech , pp. 594-597
    • Ion, V.1    Haeb-Umbach, R.2
  • 26
    • 33947703708
    • Band-independent mask estimation for missing feature reconstruction in the presence of unknown background noise
    • W. Kim and R. Stern, "Band-independent mask estimation for missing feature reconstruction in the presence of unknown background noise," in Proc. ICASSP, 2006, pp. 305-308.
    • (2006) Proc. ICASSP , pp. 305-308
    • Kim, W.1    Stern, R.2
  • 27
    • 66149120230
    • Improved a posteriori speech presence probability estimation based on a likelihood ratio with fixed priors
    • T. Gerkmann, C. Breithaupt, and R. Martin, "Improved a posteriori speech presence probability estimation based on a likelihood ratio with fixed priors," IEEE Trans. Audio, Speech, Lang. Process., vol.16, no.5, pp. 910-919, Jul. 2008.
    • (2008) IEEE Trans. Audio, Speech, Lang. Process. , vol.16 , Issue.5 , pp. 910-919
    • Gerkmann, T.1    Breithaupt, C.2    Martin, R.3
  • 28
    • 69249139982
    • Conditionally linear Gaussian models for estimating vocal tract resonances
    • D. Rudoy, D. Spendley, and P. Wolfe, "Conditionally linear Gaussian models for estimating vocal tract resonances," in Proc. Interspeech, 2007, pp. 526-529.
    • (2007) Proc. Interspeech , pp. 526-529
    • Rudoy, D.1    Spendley, D.2    Wolfe, P.3
  • 29
    • 0002603206
    • Missing data theory, spectral subtraction, and signal-to-noise estimation for robust ASR: An integrated study
    • A. Vizinho, P. Green, M. Cooke, and L. Josifovski, "Missing data theory, spectral subtraction, and signal-to-noise estimation for robust ASR: An integrated study," in Proc. Eurospeech, 1999.
    • (1999) Proc. Eurospeech
    • Vizinho, A.1    Green, P.2    Cooke, M.3    Josifovski, L.4
  • 33
    • 0018455310
    • Suppression of acoustic noise in speech using spectral subtraction
    • S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoustics, Speech, Signal Process., vol.ASSP-27, no.2, pp. 113-120, Apr. 1979.
    • (1979) IEEE Trans. Acoustics, Speech, Signal Process. , vol.ASSP-27 , Issue.2 , pp. 113-120
    • Boll, S.F.1
  • 35
    • 85009070292
    • Large-vocabulary speech recognition under adverse acoustic environments
    • L. Deng, A. Acero, M. Plumpe, and X. Huang, "Large-vocabulary speech recognition under adverse acoustic environments," in Proc. ICSLP, 2000.
    • (2000) Proc. ICSLP
    • Deng, L.1    Acero, A.2    Plumpe, M.3    Huang, X.4
  • 36
    • 33746753361
    • Adaptation of children's speech with limited data based on formant-like peak alignment
    • X. Cui and A. Alwan, "Adaptation of children's speech with limited data based on formant-like peak alignment," Comput. Speech, Lang., vol.20, no.4, pp. 400-419, 2006.
    • (2006) Comput. Speech, Lang. , vol.20 , Issue.4 , pp. 400-419
    • Cui, X.1    Alwan, A.2
  • 37
    • 0142009990
    • Non-linear feature extraction for robust recognition in stationary and non-stationary noise
    • Q. Zhu and A. Alwan, "Non-linear feature extraction for robust recognition in stationary and non-stationary noise," Comput. Speech, Lang., vol.17, no.4, pp. 381-402, 2003.
    • (2003) Comput. Speech, Lang. , vol.17 , Issue.4 , pp. 381-402
    • Zhu, Q.1    Alwan, A.2
  • 38
    • 0021892216
    • Speech enhancement using a minimum mean-square log-spectral amplitude estimator
    • Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean-square log-spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal Process., vol.33, no.2, pp. 443-445, Apr. 1985.
    • (1985) IEEE Trans. Acoust., Speech, Signal Process. , vol.33 , Issue.2 , pp. 443-445
    • Ephraim, Y.1    Malah, D.2
  • 39
    • 0031238095
    • A model of dynamic auditory perception and its application to robust word recognition
    • B. Strope and A. Alwan, "A model of dynamic auditory perception and its application to robust word recognition," IEEE Trans. Speech Audio Process., vol.5, no.5, pp. 451-464, Sep. 1997.
    • (1997) IEEE Trans. Speech Audio Process. , vol.5 , Issue.5 , pp. 451-464
    • Strope, B.1    Alwan, A.2
  • 40
    • 0025041264
    • Perceptual Linear Predictive (PLP) analysis of speech
    • H. Hermansky, "Perceptual Linear Predictive (PLP) analysis of speech," JASA, vol.87, no.4, pp. 1738-1752, 1990.
    • (1990) JASA , vol.87 , Issue.4 , pp. 1738-1752
    • Hermansky, H.1
  • 41
    • 38849170676
    • Distributed Speech Recognition; Front-End Feature Extraction Algorithms; Compression Algorithms, ETSI ES 202 050 v1.1.1, 2007-2010, ETSI Standard Doc.
    • Speech Processing, Transmission, and Quality Aspects (STQ); Distributed Speech Recognition; Front-End Feature Extraction Algorithms; Compression Algorithms, ETSI ES 202 050 v1.1.1, 2007-2010, ETSI Standard Doc.
    • Speech Processing, Transmission, and Quality Aspects (STQ)


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.