SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 19, Issue 5, 2011, Pages 1434-1443

A novel mask estimation method employing posterior-based representative mean estimate for missing-feature speech recognition

(2) Kim, Wooil a; Hansen, John H. L. a

a The University of Texas at Dallas (United States)

Author keywords

Background noise; mask estimation; missing feature; posterior based representative mean (PRM) estimate; robust speech recognition

Indexed keywords

BACKGROUND NOISE; COMPONENT CLASSIFIERS; ESTIMATION METHODS; FEATURE COMPENSATION; IN-VEHICLE; MISSING-FEATURE; MODEL COMBINATION; NOISE CONDITIONS; NOISE SIGNALS; PERFORMANCE EVALUATION; POSTERIOR PROBABILITY; POSTERIOR-BASED REPRESENTATIVE MEAN (PRM) ESTIMATE; RECOGNITION PERFORMANCE; ROBUST SPEECH RECOGNITION; SPECTRAL COMPONENTS; SPECTRAL SUBTRACTIONS; SPEECH MODELS; SPEECH RECOGNITION PERFORMANCE; SPEECH UTTERANCE; WEIGHTED SUM; WORD ERROR RATE;

ESTIMATION; FEATURE EXTRACTION;

SPEECH RECOGNITION;

EID: 79956289561 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2010.2091633 Document Type: Article

Times cited: (12)

References (43)

1
- 77949404049
- Mask estimation employing posterior-based representative mean for missing-feature speech recognition with time-varying background noise
- Merano, Italy, Dec.
- W. Kim and J. H. L. Hansen, "Mask estimation employing posterior-based representative mean for missing-feature speech recognition with time-varying background noise," in Proc. IEEE ASRU'09, Merano, Italy, Dec. 2009, pp. 194-198.
- (2009) Proc. IEEE ASRU'09 , pp. 194-198
- Kim, W.¹ Hansen, J.H.L.²

2
- 56949089751
- Feature compensation in the cepstral domain employing model combination
- W. Kim and J. H. L. Hansen, "Feature compensation in the cepstral domain employing model combination," Speech Commun., vol. 51, no. 2, pp. 83-96, 2009.
- (2009) Speech Commun. , vol.51 , Issue.2 , pp. 83-96
- Kim, W.¹ Hansen, J.H.L.²

3
- 0004319968
- The NOISEX-92 study on the effect of additive noise on automatic speech recognition
- Malvern, U.K., (Available from NOISEX-92 CD-ROMS)
- A. P. Varga, H. J. M. Steeneken, M. Tomlinson, and D. Jones, "The NOISEX-92 study on the effect of additive noise on automatic speech recognition," in Tech. Rep., Speech Res. Unit, Defense Res. Agency, Malvern, U.K., 1992, (Available from NOISEX-92 CD-ROMS).
- (1992) Tech. Rep., Speech Res. Unit, Defense Res. Agency
- Varga, A.P.¹ Steeneken, H.J.M.² Tomlinson, M.³ Jones, D.⁴

4
- 85135275880
- The Speechdat-Car multilingual speech databases for in-car applications: Some first validation results
- Sep.
- H. Heuvel, J. Boudy, R. Comeyne, S. Euler, A. Moreno, and G. Richard, "The Speechdat-Car multilingual speech databases for in-car applications: Some first validation results," in Proc. Eurospeech'99, Sep. 1999.
- (1999) Proc. Eurospeech'99
- Heuvel, H.¹ Boudy, J.² Comeyne, R.³ Euler, S.⁴ Moreno, A.⁵ Richard, G.⁶

5
- 0141477988
- Speech in noisy environments (SPINE) adds new dimension to speech recognition R&D
- San Diego, CA, Mar.
- T. Crystal, A. Schmidt-Nelson, and E. Marsh, "Speech in noisy environments (SPINE) adds new dimension to speech recognition R&D," in Proc. HLT Conf., San Diego, CA, Mar. 2002.
- (2002) Proc. HLT Conf.
- Crystal, T.¹ Schmidt-Nelson, A.² Marsh, E.³

6
- 47849129173
- UTDrive: Driver behavior and speech interactive systems for in-vehicle environments
- P. Angkititrakul, M. Petracca, A. Sathyanarayana, and J. H. L. Hansen, "UTDrive: Driver behavior and speech interactive systems for in-vehicle environments," in Proc. IEEE Intell. Veh. Conf., 2007, pp. 566-569.
- (2007) Proc. IEEE Intell. Veh. Conf. , pp. 566-569
- Angkititrakul, P.¹ Petracca, M.² Sathyanarayana, A.³ Hansen, J.H.L.⁴

7
- 34047274021
- CU-move: Advances for in-vehicle speech systems for route navigation
- Abut, J. H. L. Hansen, and Takeda, Eds. New York: Springer, ch. 2
- J. H. L. Hansen, X. Zhang, M. Akbacak, U. Yapanel, B. Pellom, W. Ward, and P. Angkititrakul, "CU-move: Advances for in-vehicle speech systems for route navigation," in DSP for In-Vehicle and Mobile Systems, Abut, J. H. L. Hansen, and Takeda, Eds. New York: Springer, 2004, ch. 2.
- (2004) DSP for In-Vehicle and Mobile Systems
- Hansen, J.H.L.¹ Zhang, X.² Akbacak, M.³ Yapanel, U.⁴ Pellom, B.⁵ Ward, W.⁶ Angkititrakul, P.⁷

8
- 85008020310
- Speechfind: Advances in spoken document retrieval for a National Gallery of the Spoken Word
- Sep.
- J. H. L. Hansen, R. Huang, B. Zhou, M. Seadle, J. R. Deller Jr., A. R. Gurijala, M. Kurimo, and P. Angkititrakul, "Speechfind: Advances in spoken document retrieval for a National Gallery of the Spoken Word," IEEE Trans. Speech Audio Process., vol. 13, no. 5, pp. 712-730, Sep. 2005.
- (2005) IEEE Trans. Speech Audio Process. , vol.13 , Issue.5 , pp. 712-730
- Hansen, J.H.L.¹ Huang, R.² Zhou, B.³ Seadle, M.⁴ Deller Jr., J.R.⁵ Gurijala, A.R.⁶ Kurimo, M.⁷ Angkititrakul, P.⁸

9
- 44849128293
- Speechfind for CDP: Advances in spoken document retrieval for the U.S. collaborative digitization program
- W. Kim and J. H. L. Hansen, "Speechfind for CDP: Advances in spoken document retrieval for the U.S. collaborative digitization program," in Proc. IEEE ASRU2007, 2007, pp. 687-692.
- (2007) Proc. IEEE ASRU2007 , pp. 687-692
- Kim, W.¹ Hansen, J.H.L.²

10
- 0030283741
- Analysis and compensation of speech under stress and noise for environmental robustness in speech recognition
- J. H. L. Hansen, "Analysis and compensation of speech under stress and noise for environmental robustness in speech recognition," Speech Commun., vol. 20, no. 2, pp. 151-170, 1996.
- (1996) Speech Commun. , vol.20 , Issue.2 , pp. 151-170
- Hansen, J.H.L.¹

11
- 0018455310
- Suppression of acoustic noise in speech using spectral subtraction
- Apr.
- S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, no. 2, pp. 113-120, Apr. 1979.
- (1979) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-27 , Issue.2 , pp. 113-120
- Boll, S.F.¹

12
- 0021645331
- Speech enhancement using minimum mean square error short time spectral amplitude estimator
- Dec.
- Y. Ephraim and D. Malah, "Speech enhancement using minimum mean square error short time spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-32, no. 6, pp. 1109-1121, Dec. 1984.
- (1984) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-32 , Issue.6 , pp. 1109-1121
- Ephraim, Y.¹ Malah, D.²

13
- 0026135903
- Constrained iterative speech enhancement with application to speech recognition
- Apr.
- J. H. L. Hansen and M. Clements, "Constrained iterative speech enhancement with application to speech recognition," IEEE Trans. Signal Process., vol. 39, no. 4, pp. 795-805, Apr. 1991.
- (1991) IEEE Trans. Signal Process. , vol.39 , Issue.4 , pp. 795-805
- Hansen, J.H.L.¹ Clements, M.²

14
- 0028516405
- Morphological constrained enhancement with adaptive cepstral compensation (MCE-ACC) for speech recognition in noise and Lombard effect
- Oct.
- J. H. L. Hansen, "Morphological constrained enhancement with adaptive cepstral compensation (MCE-ACC) for speech recognition in noise and Lombard effect," IEEE Trans. Speech Audio Process., vol. 2, no. 4, pp. 598-614, Oct. 1994.
- (1994) IEEE Trans. Speech Audio Process. , vol.2 , Issue.4 , pp. 598-614
- Hansen, J.H.L.¹

15
- 0032116601
- Data-driven environmental compensation for speech recognition: A unified approach
- PII S0167639398000259
- P. J. Moreno, B. Raj, and R. M. Stern, "Data-driven environmental compensation for speech recognition: A unified approach," Speech Commun., vol. 24, no. 4, pp. 267-285, 1998. (Pubitemid 128424259)
- (1998) Speech Communication , vol.24 , Issue.4 , pp. 267-285
- Moreno, P.J.¹ Raj, B.² Stern, R.M.³

16
- 0036642712
- Feature domain compensation of nonstationary noise for robust speech recognition
- DOI 10.1016/S0167-6393(01)00013-9, PII S0167639301000139
- N. S. Kim, "Feature domain compensation of nonstationary noise for robust speech recognition," Speech Commun., vol. 37, pp. 231-248, 2002. (Pubitemid 34524841)
- (2002) Speech Communication , vol.37 , Issue.3-4 , pp. 231-248
- Kim, N.S.¹

17
- 4544288024
- Joint removal of additive and convolutional noise with model-based feature enhancement
- V. Stouten, H. Van hamme, and P. Wambacq, "Joint removal of additive and convolutional noise with model-based feature enhancement," in Proc. ICASSP'04, 2004, pp. 949-952.
- (2004) Proc. ICASSP'04 , pp. 949-952
- Stouten, V.¹ Van hamme, H.² Wambacq, P.³

18
- 38749150838
- HMM-based feature compensation methods: An evaluation using the Aurora2
- A. Sasou, T. Tanaka, S. Nakamura, and F. Asano, "HMM-based feature compensation methods: An evaluation using the Aurora2," in Proc. ICSLP'04, 2004, pp. 121-124.
- (2004) Proc. ICSLP'04 , pp. 121-124
- Sasou, A.¹ Tanaka, T.² Nakamura, S.³ Asano, F.⁴

19
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- Apr.
- J. L. Gauvain and C. H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Process., vol. 2, no. 2, pp. 291-298, Apr. 1994.
- (1994) IEEE Trans. Speech Audio Process. , vol.2 , Issue.2 , pp. 291-298
- Gauvain, J.L.¹ Lee, C.H.²

20
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density HMMs
- C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density HMMs," Comput. Speech Lang., vol. 9, pp. 171-185, 1995.
- (1995) Comput. Speech Lang. , vol.9 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

21
- 0030245128
- Robust continuous speech recognition using parallel model combination
- PII S1063667696067120
- M. J. F. Gales and S. J. Young, "Robust continuous speech recognition using parallel model combination," IEEE Trans. Speech Audio Process., vol. 4, no. 5, pp. 352-359, Sep. 1996. (Pubitemid 126753023)
- (1996) IEEE Transactions on Speech and Audio Processing , vol.4 , Issue.5 , pp. 352-359
- Gales, M.J.F.¹ Young, S.J.²

22
- 85009106519
- Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise
- J. Barker, M. Cooke, and P. Green, "Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise," in Proc. Eurospeech'01, 2001, pp. 213-216.
- (2001) Proc. Eurospeech'01 , pp. 213-216
- Barker, J.¹ Cooke, M.² Green, P.³

23
- 0035342414
- Robust automatic speech recognition with missing and unreliable acoustic data
- DOI 10.1016/S0167-6393(00)00034-0, PII S0167639300000340
- M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data," Speech Commun., vol. 34, no. 3, pp. 267-285, 2001. (Pubitemid 32284867)
- (2001) Speech Communication , vol.34 , Issue.3 , pp. 267-285
- Cooke, M.¹ Green, P.² Josifovski, L.³ Vizinho, A.⁴

24
- 2942539074
- Techniques for handling convolutional distortion with missing data automatic speech recognition
- K. J. Palomaki, G. J. Brown, and J. P. Barker, "Techniques for handling convolutional distortion with missing data automatic speech recognition," Speech Commun., vol. 43, pp. 123-142, 2004.
- (2004) Speech Commun. , vol.43 , pp. 123-142
- Palomaki, K.J.¹ Brown, G.J.² Barker, J.P.³

25
- 4644336054
- Reconstruction of missing features for robust speech recognition
- B. Raj, M. L. Seltzer, and R. M. Stern, "Reconstruction of missing features for robust speech recognition," Speech Commun., vol. 43, no. 4, pp. 275-296, 2004.
- (2004) Speech Commun. , vol.43 , Issue.4 , pp. 275-296
- Raj, B.¹ Seltzer, M.L.² Stern, R.M.³

26
- 4544315110
- Robust speech recognition using cepstral domain missing data techniques and noisy masks
- May
- H. Van Hamme, "Robust speech recognition using cepstral domain missing data techniques and noisy masks," in Proc. ICASSP'04, May 2004, pp. 213-216.
- (2004) Proc. ICASSP'04 , pp. 213-216
- Van Hamme, H.¹

27
- 85032752225
- Missing-feature approaches in speech recognition
- DOI 10.1109/MSP.2005.1511828
- B. Raj and R. M. Stern, "Missing-feature approaches in speech recognition," IEEE Signal Process. Mag., vol. 22, no. 5, pp. 101-116, Sep. 2005. (Pubitemid 41488524)
- (2005) IEEE Signal Processing Magazine , vol.22 , Issue.5 , pp. 101-116
- Raj, B.¹ Stern, R.M.²

28
- 44849096627
- Missing-feature reconstruction for bandlimited speech recognition in spoken document retrieval
- Sep.
- W. Kim and J. H. L. Hansen, "Missing-feature reconstruction for bandlimited speech recognition in spoken document retrieval," in Proc. Interspeech'06, Sep. 2006, pp. 2306-2309.
- (2006) Proc. Interspeech'06 , pp. 2306-2309
- Kim, W.¹ Hansen, J.H.L.²

29
- 68549126848
- Time-frequency correlation based missing-feature reconstruction for robust speech recognition in band-restricted conditions
- Sep.
- W. Kim and J. H. L. Hansen, "Time-frequency correlation based missing-feature reconstruction for robust speech recognition in band-restricted conditions," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 7, pp. 1292-1304, Sep. 2009.
- (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.7 , pp. 1292-1304
- Kim, W.¹ Hansen, J.H.L.²

30
- 0002603206
- Missing data theory, spectral subtraction and signal-to-noise estimation for robust ASR: An integrated study
- Sep.
- A. Vizinho, P. Green, M. M. Cooke, and L. Josifovski, "Missing data theory, spectral subtraction and signal-to-noise estimation for robust ASR: An integrated study," in Proc. Eurospeech'99, Sep. 1999, pp. 2407-2410.
- (1999) Proc. Eurospeech'99 , pp. 2407-2410
- Vizinho, A.¹ Green, P.² Cooke, M.M.³ Josifovski, L.⁴

31
- 4644317224
- A Bayesian classifier for spectrographic mask estimation for missing-feature speech recognition
- M. L. Seltzer, B. Raj, and R. M. Stern, "A Bayesian classifier for spectrographic mask estimation for missing-feature speech recognition," Speech Commun., vol. 43, no. 4, pp. 379-393, 2004.
- (2004) Speech Commun. , vol.43 , Issue.4 , pp. 379-393
- Seltzer, M.L.¹ Raj, B.² Stern, R.M.³

32
- 33745200501
- Environment-independent mask estimation for missing-feature reconstruction
- 9th European Conference on Speech Communication and Technology, Eurospeech Interspeech
- W. Kim, R. M. Stern, and H. Ko, "Environment-independent mask estimation for missing-feature reconstruction," in Proc. Interspeech'05, Sep. 2005, pp. 2637-2640. (Pubitemid 43908637)
- (2005) 9th European Conference on Speech Communication and Technology , pp. 2637-2640
- Kim, W.¹ Stern, R.M.² Ko, H.³

33
- 33947703708
- Band-independent mask estimation for missing-feature reconstruction in the presence of unknown background noise
- May
- W. Kim and R. M. Stern, "Band-independent mask estimation for missing-feature reconstruction in the presence of unknown background noise," in Proc. ICASSP'06, May 2006, pp. 305-308.
- (2006) Proc. ICASSP'06 , pp. 305-308
- Kim, W.¹ Stern, R.M.²

34
- 85009165807
- High-likelihood model based on reliability statistics for robust combination of features: Application to noisy speech recognition
- P. Jancovic, M. Kokuer, and F. Murtagh, "High-likelihood model based on reliability statistics for robust combination of features: Application to noisy speech recognition," in Proc. Eurospeech'03, 2003, pp. 2161-2164.
- (2003) Proc. Eurospeech'03 , pp. 2161-2164
- Jancovic, P.¹ Kokuer, M.² Murtagh, F.³

35
- 33646780537
- Mask estimation based on sound localisation for missing data speech recognition
- S. Harding, J. Barker, and G. J. Brown, "Mask estimation based on sound localisation for missing data speech recognition," in Proc. ICASSP'05, 2005, pp. 537-540.
- (2005) Proc. ICASSP'05 , pp. 537-540
- Harding, S.¹ Barker, J.² Brown, G.J.³

36
- 33750311718
- Binary and ratio time-frequency masks for robust speech recognition
- DOI 10.1016/j.specom.2006.09.003, PII S0167639306001129
- S. Srinivasan, N. Roman, and D. Wang, "Binary and ratio time-frequency masks for robust speech recognition," Speech Commun., vol. 48, no. 11, pp. 1486-1501, 2006. (Pubitemid 44634774)
- (2006) Speech Communication , vol.48 , Issue.11 , pp. 1486-1501
- Srinivasan, S.¹ Roman, N.² Wang, D.³

37
- 55049096120
- Spatial separation of speech signals using amplitude estimation based on interaural comparisons of zero crossings
- H. Park and R. M. Stern, "Spatial separation of speech signals using amplitude estimation based on interaural comparisons of zero crossings," Speech Commun., vol. 51, no. 1, pp. 15-25, 2009.
- (2009) Speech Commun. , vol.51 , Issue.1 , pp. 15-25
- Park, H.¹ Stern, R.M.²

38
- 0038669544
- The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions
- H. G. Hirsch and D. Pearce, "The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions," in Proc. ISCA ITRW ASR2000, 2000.
- (2000) Proc. ISCA ITRW ASR2000
- Hirsch, H.G.¹ Pearce, D.²

39
- 77953863507
- ETSI ES 201 108 v1.1.2 (2000-04)
- ETSI Standard Document, ETSI ES 201 108 v1.1.2 (2000-04), 2000.
- (2000) ETSI Standard Document

40
- 79956265286
- [Online], Available
- [Online]. Available: http://spib.rice.edu/spib/select-noise.html

41
- 0003089362
- Spectral subtraction based on minimum statistics
- R. Martin, "Spectral subtraction based on minimum statistics," in Proc. EUSIPCO-94, 1994, pp. 1182-1185.
- (1994) Proc. EUSIPCO-94 , pp. 1182-1185
- Martin, R.¹

42
- 79956264228
- ETSI ES 202 050 v1.1.1 (2002-10)
- ETSI Standard Document, ETSI ES 202 050 v1.1.1 (2002-10), 2002.
- (2002) ETSI Standard Document

43
- 79956267259
- NIST SPeech Quality Assurance (SPQA) Package Version 2.3 [Online], Available
- NIST SPeech Quality Assurance (SPQA) Package Version 2.3 [Online]. Available: http://www.nist.gov/speech

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.