SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 18, Issue 6, 2010, Pages 1612-1623

HMM-based reconstruction of unreliable spectrographic data for noise robust speech recognition

a UNIVERSITY OF CALIFORNIA (United States)

Author keywords

Hidden Markov models (HMMs); mask estimation; missing features (MFs); noise robust speech recognition; spectral reconstruction

Indexed keywords

BASELINE SYSTEMS; DECODING METHODS; DOWNSAMPLING; ESTIMATION METHODS; LOWER RESOLUTION; MEL-FREQUENCY CEPSTRAL COEFFICIENTS; MEMORY REQUIREMENTS; MINIMUM MEAN-SQUARE ERROR; MISSING FEATURES (MFS); MODEL PARAMETERS; NOISE ROBUST SPEECH RECOGNITION; PERFORMANCE BOUNDS; QUANTIZERS; REALISTIC SCENARIO; RECONSTRUCTION ALGORITHMS; RECONSTRUCTION METHOD; ROBUST RECOGNITION; SPECTRAL MAGNITUDES; SPECTRAL RECONSTRUCTION; SPECTROGRAPHIC DATA; SPEECH DATA; STRUCTURED MAPPINGS;

COMPUTATIONAL EFFICIENCY; ESTIMATION; HIDDEN MARKOV MODELS; NATURAL LANGUAGE PROCESSING SYSTEMS; REPAIR; STOCHASTIC MODELS;

SPEECH RECOGNITION;

EID: 77955777921 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2009.2038811 Document Type: Article

Times cited : (19)

References (41)

1
- 0004319970
- Norwell, MA: Kluwer
- A. Acero, Acoustic and Environmental Robustness in Automatic Speech Recognition. Norwell, MA: Kluwer, 1993.
- (1993) Acoustic and Environmental Robustness in Automatic Speech Recognition
- Acero, A.¹

2
- 0028517164
- RASTA processing of speech
- Jul.
- H. Hermansky and N. Morgan, "RASTA processing of speech," IEEE Trans. Speech Audio Process., vol.2, no.4, pp. 578-589, Jul. 1994.
- (1994) IEEE Trans. Speech Audio Process , vol.2 , Issue.4 , pp. 578-589
- Hermansky, H.¹ Morgan, N.²

3
- 0032141206
- Cepstral domain segmental feature vector normalization for noise robust speech recognition
- O. Viikki and K. Laurila, "Cepstral domain segmental feature vector normalization for noise robust speech recognition," Speech Commun., vol.25, pp. 133-147, 1998.
- (1998) Speech Commun. , vol.25 , pp. 133-147
- Viikki, O.¹ Laurila, K.²

4
- 0142009990
- Non-linear feature extraction for robust recognition in stationary and non-stationary noise
- Q. Zhu and A. Alwan, "Non-linear feature extraction for robust recognition in stationary and non-stationary noise," Comput. Speech, Lang., vol.17, no.4, pp. 381-402, 2003.
- (2003) Comput. Speech, Lang. , vol.17 , Issue.4 , pp. 381-402
- Zhu, Q.¹ Alwan, A.²

5
- 85032752225
- Missing feature approaches in speech recognition
- Sep.
- B. Raj and R. Stern, "Missing feature approaches in speech recognition," IEEE Signal Process. Mag., vol.22, no.5, pp. 101-116, Sep. 2005.
- (2005) IEEE Signal Process. Mag. , vol.22 , Issue.5 , pp. 101-116
- Raj, B.¹ Stern, R.²

6
- 0030671924
- Missing data techniques for robust speech recognition
- M. P. Cooke, A. Morris, and P. D. Green, "Missing data techniques for robust speech recognition," in Proc. ICASSP, 1997, vol.2, pp. 863-866.
- (1997) Proc. ICASSP , vol.2 , pp. 863-866
- Cooke, M.P.¹ Morris, A.² Green, P.D.³

7
- 85009063707
- Soft decisions in missing feature data techniques for robust automatic speech recognition
- J. Barker, L. Josifovski, M. Cooke, and P. Green, "Soft decisions in missing feature data techniques for robust automatic speech recognition," in Proc. ICSLP, 2000, pp. 373-376.
- (2000) Proc. ICSLP , pp. 373-376
- Barker, J.¹ Josifovski, L.² Cooke, M.³ Green, P.⁴

8
- 0035342414
- Robust automatic speech recognition with missing and unreliable acoustic data
- M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data," Speech Commun., vol.34, pp. 267-285, 2001.
- (2001) Speech Commun. , vol.34 , pp. 267-285
- Cooke, M.¹ Green, P.² Josifovski, L.³ Vizinho, A.⁴

9
- 33750291256
- Uncertainty decoding for distributed speech recognition over error-prone networks
- V. Ion and R. Haeb-Umbach, "Uncertainty decoding for distributed speech recognition over error-prone networks," Speech Commun., vol.48, pp. 1435-1446, 2006.
- (2006) Speech Commun. , vol.48 , pp. 1435-1446
- Ion, V.¹ Haeb-Umbach, R.²

10
- 4544371139
- Ph.D. dissertation, Carnegie Mellon Univ., Pittsburgh, PA
- B. R. Ramakrishnan, "Reconstruction of incomplete spectrograms for robust speech recognition," Ph.D. dissertation, Carnegie Mellon Univ., Pittsburgh, PA, 2000.
- (2000) Reconstruction of Incomplete Spectrograms for Robust Speech Recognition
- Ramakrishnan, B.R.¹

11
- 4644336054
- Reconstruction of missing features for robust speech recognition
- B. Raj, M. L. Seltzer, and R. M. Stern, "Reconstruction of missing features for robust speech recognition," Speech Commun., vol.43, pp. 275-296, 2004.
- (2004) Speech Commun. , vol.43 , pp. 275-296
- Raj, B.¹ Seltzer, M.L.² Stern, R.M.³

12
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- Feb.
- L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol.77, no.2, pp. 257-286, Feb. 1989.
- (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-286
- Rabiner, L.R.¹

13
- 0015143526
- Convolutional codes and their performance in communication systems
- Oct.
- A. Viterbi, "Convolutional codes and their performance in communication systems," IEEE Trans. Commun., vol.COM-19, no.5, pt. 1, pp. 751-772, Oct. 1971.
- (1971) IEEE Trans. Commun. , vol.COM-19 , Issue.5 PART 1 , pp. 751-772
- Viterbi, A.¹

14
- 0004244302
- Englewood Cliffs: Prentice-Hall
- L. Rabiner and B. H. Juang, Fundamentals of Speech Recognition. Englewood Cliffs: Prentice-Hall, 1993.
- (1993) Fundamentals of Speech Recognition
- Rabiner, L.¹ Juang, B.H.²

15
- 85009089779
- MMSE-based channel mitigation for distributed speech recognition
- A. M. Peinado, V. Sanchez, J. C. Segura, and J. L. Perez-Cordoba, "MMSE-based channel mitigation for distributed speech recognition," in Proc. Eurospeech, 2001.
- (2001) Proc. Eurospeech
- Peinado, A.M.¹ Sanchez, V.² Segura, J.C.³ Perez-Cordoba, J.L.⁴

16
- 85009285110
- HMM-based methods for channel error mitigation in distributed speech recognition
- A. M. Peinado, V. Sanchez, J. L. Perez-Cordoba, J. C. Segura, and J. Rubio, "HMM-based methods for channel error mitigation in distributed speech recognition," in Proc. ICSLP, 2002.
- (2002) Proc. ICSLP
- Peinado, A.M.¹ Sanchez, V.² Perez-Cordoba, J.L.³ Segura, J.C.⁴ Rubio, J.⁵

17
- 0242721421
- HMM-based channel error mitigation and its application to distributed speech recognition
- Nov.
- A. M. Peinado, V. Sanchez, J. L. Perez-Cordoba, and A. de la Torre, "HMM-based channel error mitigation and its application to distributed speech recognition," Speech Commun., vol.41/4, pp. 549-561, Nov. 2003.
- (2003) Speech Commun. , vol.41 , Issue.4 , pp. 549-561
- Peinado, A.M.¹ Sanchez, V.² Perez-Cordoba, J.L.³ De La Torre, A.⁴

18
- 33744969796
- Hidden Markov model based loss concealment for voice over IP
- Sep.
- C. A. Rodbro, M. N. Murthi, S. V. Andersen, and S. H. Jensen, "Hidden Markov model based loss concealment for voice over IP," IEEE Trans. Audio, Speech, Lang. Process., vol.14, no.5, pp. 1609-1623, Sep. 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.5 , pp. 1609-1623
- Rodbro, C.A.¹ Murthi, M.N.² Andersen, S.V.³ Jensen, S.H.⁴

19
- 51449120334
- An efficient approximation of the forward-backward algorithm to deal with packet loss, with applications to remote speech recognition
- B. J. Borgstrom and A. Alwan, "An efficient approximation of the forward-backward algorithm to deal with packet loss, with applications to remote speech recognition," in Proc. ICASSP, 2008, pp. 4425-4428.
- (2008) Proc. ICASSP , pp. 4425-4428
- Borgstrom, B.J.¹ Alwan, A.²

20
- 84867196386
- HMM-based estimation of unreliable spectral components for noise robust speech recognition
- B. J. Borgstrom and A. Alwan, "HMM-based estimation of unreliable spectral components for noise robust speech recognition," in Proc. Interspeech, 2008, pp. 1769-1772.
- (2008) Proc. Interspeech , pp. 1769-1772
- Borgstrom, B.J.¹ Alwan, A.²

21
- 19944382585
- Enabling new speech driven services for mobile devices: An overview of the ETSI standards activities for distributed speech recognition front-ends
- May
- D. Pearce, "Enabling new speech driven services for mobile devices: An overview of the ETSI standards activities for distributed speech recognition front-ends," in Proc. AVIOS 2000: Speech Appl. Conf., May 2000, vol.5, pp. 1-6.
- (2000) Proc. AVIOS 2000: Speech Appl. Conf. , vol.5 , pp. 1-6
- Pearce, D.¹

22
- 0035396555
- Noise power spectral density estimation based on optimal smoothing and minimum statistics
- Jul.
- R. Martin, "Noise power spectral density estimation based on optimal smoothing and minimum statistics," IEEE Trans. Speech Audio Process., vol.9, no.5, pp. 504-512, Jul. 2001.
- (2001) IEEE Trans. Speech Audio Process. , vol.9 , Issue.5 , pp. 504-512
- Martin, R.¹

23
- 0041360463
- Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging
- Sep.
- I. Cohen, "Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging," IEEE Trans. Speech Audio Process., vol.11, no.5, pp. 466-475, Sep. 2003.
- (2003) IEEE Trans. Speech Audio Process. , vol.11 , Issue.5 , pp. 466-475
- Cohen, I.¹

24
- 4644317224
- A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition
- M. L. Seltzer, B. Raj, and R. M. Stern, "A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition," Speech Commun., vol.43, pp. 379-393, 2004.
- (2004) Speech Commun. , vol.43 , pp. 379-393
- Seltzer, M.L.¹ Raj, B.² Stern, R.M.³

25
- 66149130450
- Multi-resolution soft-features for channel-robust distributed speech recognition
- V. Ion and R. Haeb-Umbach, "Multi-resolution soft-features for channel-robust distributed speech recognition," in Proc. Interspeech, 2007, pp. 594-597.
- (2007) Proc. Interspeech , pp. 594-597
- Ion, V.¹ Haeb-Umbach, R.²

26
- 33947703708
- Band-independent mask estimation for missing feature reconstruction in the presence of unknown background noise
- W. Kim and R. Stern, "Band-independent mask estimation for missing feature reconstruction in the presence of unknown background noise," in Proc. ICASSP, 2006, pp. 305-308.
- (2006) Proc. ICASSP , pp. 305-308
- Kim, W.¹ Stern, R.²

27
- 66149120230
- Improved a posteriori speech presence probability estimation based on a likelihood ratio with fixed priors
- Jul.
- T. Gerkmann, C. Breithaupt, and R. Martin, "Improved a posteriori speech presence probability estimation based on a likelihood ratio with fixed priors," IEEE Trans. Audio, Speech, Lang. Process., vol.16, no.5, pp. 910-919, Jul. 2008.
- (2008) IEEE Trans. Audio, Speech, Lang. Process. , vol.16 , Issue.5 , pp. 910-919
- Gerkmann, T.¹ Breithaupt, C.² Martin, R.³

28
- 69249139982
- Conditionally linear Gaussian models for estimating vocal tract resonances
- D. Rudoy, D. Spendley, and P. Wolfe, "Conditionally linear Gaussian models for estimating vocal tract resonances," in Proc. Interspeech, 2007, pp. 526-529.
- (2007) Proc. Interspeech , pp. 526-529
- Rudoy, D.¹ Spendley, D.² Wolfe, P.³

29
- 0002603206
- Missing data theory, spectral subtraction, and signal-to-noise estimation for robust ASR: An integrated study
- A. Vizinho, P. Green, M. Cooke, and L. Josifovski, "Missing data theory, spectral subtraction, and signal-to-noise estimation for robust ASR: An integrated study," in Proc. Eurospeech, 1999.
- (1999) Proc. Eurospeech
- Vizinho, A.¹ Green, P.² Cooke, M.³ Josifovski, L.⁴

30
- 84889302357
- New York: Wiley
- P. Vary and R. Martin, Digital Speech Transmission: Enhancement, Coding and Error Concealment. New York: Wiley, 2006.
- (2006) Digital Speech Transmission: Enhancement Coding and Error Concealment
- Vary, P.¹ Martin, R.²

31
- 66149149969
- Optimal recursive smoothing of non-stationary periodograms
- R. Martin and T. Lotter, "Optimal recursive smoothing of non-stationary periodograms," in Proc. Int. Workshop Acoust. Echo, Noise Control (IWAENC), 2001, pp. 167-170.
- (2001) Proc. Int. Workshop Acoust. Echo, Noise Control (IWAENC) , pp. 167-170
- Martin, R.¹ Lotter, T.²

32
- 0004185151
- Wiley
- J. A. Hartigan, Clustering Algorithms. Wiley, 1975.
- (1975) Clustering Algorithms
- Hartigan, J.A.¹

33
- 0018455310
- Suppression of acoustic noise in speech using spectral subtraction
- Apr.
- S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoustics, Speech, Signal Process., vol.ASSP-27, no.2, pp. 113-120, Apr. 1979.
- (1979) IEEE Trans. Acoustics, Speech, Signal Process. , vol.ASSP-27 , Issue.2 , pp. 113-120
- Boll, S.F.¹

34
- 0003409571
- Englewood Cliffs: Prentice-Hall
- A. K. Jain, Fundamentals of Digital Image Processing. Englewood Cliffs: Prentice-Hall, 1988.
- (1988) Fundamentals of Digital Image Processing
- Jain, A.K.¹

35
- 85009070292
- Large-vocabulary speech recognition under adverse acoustic environments
- L. Deng, A. Acero, M. Plumpe, and X. Huang, "Large-vocabulary speech recognition under adverse acoustic environments," in Proc. ICSLP, 2000.
- (2000) Proc. ICSLP
- Deng, L.¹ Acero, A.² Plumpe, M.³ Huang, X.⁴

36
- 33746753361
- Adaptation of children's speech with limited data based on formant-like peak alignment
- X. Cui and A. Alwan, "Adaptation of children's speech with limited data based on formant-like peak alignment," Comput. Speech, Lang., vol.20, no.4, pp. 400-419, 2006.
- (2006) Comput. Speech, Lang. , vol.20 , Issue.4 , pp. 400-419
- Cui, X.¹ Alwan, A.²

37
- 0142009990
- Non-linear feature extraction for robust recognition in stationary and non-stationary noise
- Q. Zhu and A. Alwan, "Non-linear feature extraction for robust recognition in stationary and non-stationary noise," Comput. Speech, Lang., vol.17, no.4, pp. 381-402, 2003.
- (2003) Comput. Speech, Lang. , vol.17 , Issue.4 , pp. 381-402
- Zhu, Q.¹ Alwan, A.²

38
- 0021892216
- Speech enhancement using a minimum mean-square log-spectral amplitude estimator
- Apr.
- Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean-square log-spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal Process., vol.33, no.2, pp. 443-445, Apr. 1985.
- (1985) IEEE Trans. Acoust., Speech, Signal Process. , vol.33 , Issue.2 , pp. 443-445
- Ephraim, Y.¹ Malah, D.²

39
- 0031238095
- A model of dynamic auditory perception and its application to robust word recognition
- Sep.
- B. Strope and A. Alwan, "A model of dynamic auditory perception and its application to robust word recognition," IEEE Trans. Speech Audio Process., vol.5, no.5, pp. 451-464, Sep. 1997.
- (1997) IEEE Trans. Speech Audio Process. , vol.5 , Issue.5 , pp. 451-464
- Strope, B.¹ Alwan, A.²

40
- 0025041264
- Perceptual Linear Predictive (PLP) analysis of speech
- H. Hermansky, "Perceptual Linear Predictive (PLP) analysis of speech," JASA, vol.87, no.4, pp. 1738-1752, 1990.
- (1990) JASA , vol.87 , Issue.4 , pp. 1738-1752
- Hermansky, H.¹

41
- 38849170676
- Distributed Speech Recognition; Front-End Feature Extraction Algorithms; Compression Algorithms, ETSI ES 202 050 v1.1.1, 2007-2010, ETSI Standard Doc.
- Speech Processing, Transmission, and Quality Aspects (STQ); Distributed Speech Recognition; Front-End Feature Extraction Algorithms; Compression Algorithms, ETSI ES 202 050 v1.1.1, 2007-2010, ETSI Standard Doc.
- Speech Processing Transmission, and Quality Aspects (STQ)

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.