SCOPUS Information Retrieval Platform

Volume 48, Issue 11, 2006, Pages 1486-1501

Binary and ratio time-frequency masks for robust speech recognition

Authors (3): Srinivasan, Soundararajan (a); Roman, Nicoleta (b); Wang, DeLiang (b)

Author keywords

Binaural processing; Ideal binary mask; Missing data recognizer; Ratio mask; Robust speech recognition; Speech segregation

Indexed keywords

DATA REDUCTION; ROBUSTNESS (CONTROL SYSTEMS); SPEECH COMMUNICATION; TIME DOMAIN ANALYSIS;

BINAURAL PROCESSING; IDEAL BINARY MASK; MISSING-DATA RECOGNIZER; RATIO MASK; ROBUST SPEECH RECOGNITION; SPEECH SEGREGATION; SPEECH SIGNALS;

SPEECH RECOGNITION;

EID: 33750311718 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2006.09.003 Document Type: Article

Times cited: (222)

References (43)

1
- 85009063707
- Barker, J., Josifovski, L., Cooke, M., Green, P., 2000. Soft decisions in missing data techniques for robust automatic speech recognition. In: Proc. International Conference on Spoken Language Processing '00, pp. 373-376.

2
- 0003742220
- MIT Press, Cambridge, MA
- Blauert J. Spatial Hearing - The Psychophysics of Human Sound Localization (1997), MIT Press, Cambridge, MA
- (1997) Spatial Hearing - The Psychophysics of Human Sound Localization
- Blauert, J.¹

3
- 0018455310
- Suppression of acoustic noise in speech using spectral subtraction
- Boll S.F. Suppression of acoustic noise in speech using spectral subtraction. IEEE Trans. Acoust. Speech Signal Processing ASSP-27 2 (1979) 113-120
- (1979) IEEE Trans. Acoust. Speech Signal Processing , vol.ASSP-27 , Issue.2 , pp. 113-120
- Boll, S.F.¹

4
- 0003980102
- Brandstein M., and Ward D. (Eds), Springer, Berlin, Germany
- In: Brandstein M., and Ward D. (Eds). Microphone Arrays: Signal Processing Techniques and Applications (2001), Springer, Berlin, Germany
- (2001) Microphone Arrays: Signal Processing Techniques and Applications

5
- 33644639591
- Separation of speech by computational auditory scene analysis
- Benesty J., Makino S., and Chen J. (Eds), Springer, New York
- Brown G.J., and Wang D.L. Separation of speech by computational auditory scene analysis. In: Benesty J., Makino S., and Chen J. (Eds). Speech Enhancement (2005), Springer, New York 371-402
- (2005) Speech Enhancement , pp. 371-402
- Brown, G.J.¹ Wang, D.L.²

6
- 0032187518
- Blind signal separation: statistical principles
- Cardoso J.F. Blind signal separation: statistical principles. Proc. IEEE 86 10 (1998) 2009-2025
- (1998) Proc. IEEE , vol.86 , Issue.10 , pp. 2009-2025
- Cardoso, J.F.¹

7
- 33646780873
- Chen, C.-P., Bilmes, J., Ellis, D.P.W., 2005. Speech feature smoothing for robust ASR. In: Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing '05, vol. 1, pp. 525-528.

8
- 33750306752
- Cole, R., Noel, M., Lander, T., Durham, T., 1995. New telephone speech corpora at CSLU. In: Proc. European Conference on Speech Communication and Technology '95, pp. 821-824.

9
- 0035342414
- Robust automatic speech recognition with missing and unreliable acoustic data
- Cooke M., Green P., Josifovski L., and Vizinho A. Robust automatic speech recognition with missing and unreliable acoustic data. Speech Commun. 34 (2001) 267-285
- (2001) Speech Commun. , vol.34 , pp. 267-285
- Cooke, M.¹ Green, P.² Josifovski, L.³ Vizinho, A.⁴

10
- 33750295942
- Cunningham, S., Cooke, M., 1999. The role of evidence and counter-evidence in speech perception. In: Proc. International Congress of Phonetic Sciences '99, pp. 215-218.

11
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- Davis S.B., and Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. Speech Signal Processing ASSP-28 4 (1980) 357-366
- (1980) IEEE Trans. Acoust. Speech Signal Processing , vol.ASSP-28 , Issue.4 , pp. 357-366
- Davis, S.B.¹ Mermelstein, P.²

12
- 33750302993
- de Veth, J., de Wet, F., Cranen, B., Boves, L., 1999. Missing feature theory in ASR: make sure you miss the right type of features. In: Proc. Workshop on Robust Methods for Speech Recognition in Adverse Conditions '99, pp. 231-234.

13
- 85009211607
- Droppo, J., Acero, A., Deng, L., 2002. A nonlinear observation model for removing noise from corrupted speech log mel-spectral energies. In: Proc. International Conference on Spoken Language Processing '02, pp. 1569-1572.

14
- 0031258231
- Blind separation of convolutive mixtures and an application in automatic speech recognition in a noisy environment
- Ehlers F., and Schuster H.G. Blind separation of convolutive mixtures and an application in automatic speech recognition in a noisy environment. IEEE Trans. Signal Processing 45 10 (1997) 2608-2612
- (1997) IEEE Trans. Signal Processing , vol.45 , Issue.10 , pp. 2608-2612
- Ehlers, F.¹ Schuster, H.G.²

15
- 0026843273
- A Bayesian estimation approach for speech enhancement using hidden Markov models
- Ephraim Y. A Bayesian estimation approach for speech enhancement using hidden Markov models. IEEE Trans. Signal Processing 40 4 (1992) 725-735
- (1992) IEEE Trans. Signal Processing , vol.40 , Issue.4 , pp. 725-735
- Ephraim, Y.¹

16
- 0021645331
- Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator
- Ephraim Y., and Malah D. Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator. IEEE Trans. Acoust. Speech Signal Processing ASSP-32 6 (1984) 1109-1121
- (1984) IEEE Trans. Acoust. Speech Signal Processing , vol.ASSP-32 , Issue.6 , pp. 1109-1121
- Ephraim, Y.¹ Malah, D.²

17
- 0030245128
- Robust continuous speech recognition using parallel model combination
- Gales M.J.F., and Young S.J. Robust continuous speech recognition using parallel model combination. IEEE Trans. Speech Audio Processing 4 (1996) 352-359
- (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 352-359
- Gales, M.J.F.¹ Young, S.J.²

18
- 33750283214
- Gardner, W.G., Martin, K.D., 1994. HRTF measurements of a KEMAR dummy-head microphone. Technical Report #280, MIT Media Lab Perceptual Computing Group.

19
- 33750341537
- Garofolo, J., Lamel, L., Fisher, W., Fiscus, J., Pallett, D., Dahlgren, N., 1993. DARPA TIMIT acoustic-phonetic continuous speech corpus. Technical Report NISTIR 4930, National Institute of Standards and Technology, Gaithersburg, MD.

20
- 33750364293
- Glotin, H., Berthommier, F., Tessier, E., 1999. A CASA-labelling model using the localisation cue for robust cocktail-party speech recognition. In: Proc. European Conference on Speech Communication and Technology '99, pp. 2351-2354.

21
- 0029288202
- Speech recognition in noisy environments: a survey
- Gong Y. Speech recognition in noisy environments: a survey. Speech Commun. 16 (1995) 261-291
- (1995) Speech Commun. , vol.16 , pp. 261-291
- Gong, Y.¹

22
- 0032677010
- Performance of an HMM speech recognizer using a real-time tracking microphone array as input
- Hughes T.B., Kim H.S., DiBiase J.H., and Silverman H.F. Performance of an HMM speech recognizer using a real-time tracking microphone array as input. IEEE Trans. Speech Audio Processing 7 3 (1999) 346-349
- (1999) IEEE Trans. Speech Audio Processing , vol.7 , Issue.3 , pp. 346-349
- Hughes, T.B.¹ Kim, H.S.² DiBiase, J.H.³ Silverman, H.F.⁴

23
- 0021226391
- Leonard, R.G., 1984. A database for speaker-independent digit recognition. In: Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing '84, pp. 111-114.

24
- 0031187171
- Speech recognition by machines and humans
- Lippmann R.P. Speech recognition by machines and humans. Speech Commun. 22 (1997) 1-15
- (1997) Speech Commun. , vol.22 , pp. 1-15
- Lippmann, R.P.¹

25
- 85101444608
- Wiley, New York, NY
- Little R.J.A., and Rubin D.B. Statistical Analysis with Missing Data (1987), Wiley, New York, NY
- (1987) Statistical Analysis with Missing Data
- Little, R.J.A.¹ Rubin, D.B.²

26
- 85009242725
- Macho, D., Mauuary, L., Noe, B., Cheng, Y.M., Ealey, D., Jouvet, D., Kelleher, H., Pearce, D., Saadoun, F., 2002. Evaluation of a noise-robust DSR front-end on Aurora databases. In: Proc. International Conference on Spoken Language Processing '02, pp. 17-20.

27
- 0019009880
- Speech enhancement using a soft-decision noise suppression filter
- McAulay R., and Malpass M.L. Speech enhancement using a soft-decision noise suppression filter. IEEE Trans. Acoust. Speech Signal Processing ASSP-28 2 (1980) 137-145
- (1980) IEEE Trans. Acoust. Speech Signal Processing , vol.ASSP-28 , Issue.2 , pp. 137-145
- McAulay, R.¹ Malpass, M.L.²

28
- 0003513556
- Discrete-time Signal Processing
- Prentice-Hall, Upper Saddle River, NJ
- Oppenheim A.V., Schafer R.W., and Buck J.R. Discrete-time Signal Processing. second ed. (1999), Prentice-Hall, Upper Saddle River, NJ
- (1999) second ed.
- Oppenheim, A.V.¹ Schafer, R.W.² Buck, J.R.³

29
- 4644304197
- A binaural processor for missing data speech recognition in the presence of noise and small-room reverberation
- Palomaki K.J., Brown G.J., and Wang D.L. A binaural processor for missing data speech recognition in the presence of noise and small-room reverberation. Speech Commun. 43 (2004) 361-378
- (2004) Speech Commun. , vol.43 , pp. 361-378
- Palomaki, K.J.¹ Brown, G.J.² Wang, D.L.³

30
- 0023776398
- Price, P., Fisher, W.M., Bernstein, J., Pallett, D.S., 1988. The DARPA 1000 word Resource Management database for continuous speech recognition. In: Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing '88, pp. 651-654.

31
- 0004244302
- Fundamentals of Speech Recognition
- Prentice-Hall, Englewood Cliffs, NJ
- Rabiner L.R., and Juang B.H. Fundamentals of Speech Recognition. second ed. (1993), Prentice-Hall, Englewood Cliffs, NJ
- (1993) second ed.
- Rabiner, L.R.¹ Juang, B.H.²

32
- 4644336054
- Reconstruction of missing features for robust speech recognition
- Raj B., Seltzer M.L., and Stern R.M. Reconstruction of missing features for robust speech recognition. Speech Commun. 43 (2004) 275-296
- (2004) Speech Commun. , vol.43 , pp. 275-296
- Raj, B.¹ Seltzer, M.L.² Stern, R.M.³

33
- 0142026377
- Speech segregation based on sound localization
- Roman N., Wang D.L., and Brown G.J. Speech segregation based on sound localization. J. Acoust. Soc. Am. 114 (2003) 2236-2252
- (2003) J. Acoust. Soc. Am. , vol.114 , pp. 2236-2252
- Roman, N.¹ Wang, D.L.² Brown, G.J.³

34
- 0003444613
- Rosenthal D.F., and Okuno H.G. (Eds), Lawrence Erlbaum Associates, Mahwah, NJ
- In: Rosenthal D.F., and Okuno H.G. (Eds). Computational Auditory Scene Analysis (1998), Lawrence Erlbaum Associates, Mahwah, NJ
- (1998) Computational Auditory Scene Analysis

35
- 33750353915
- Shire, M.L., 2000. Discriminant training of front-end and acoustic modeling stages to heterogeneous acoustic environments for multi-stream automatic speech recognition. Ph.D. thesis, University of California, Berkeley.

36
- 85009151411
- Srinivasan, S., Roman, N., Wang, D.L., 2004. On binary and ratio time-frequency masks for robust speech recognition. In: Proc. International Conference on Spoken Language Processing '04, pp. 2541-2544.

37
- 33750365068
- STQ-AURORA, 2005-11. Speech processing, transmission and quality aspects (STQ); distributed speech recognition; advanced front-end feature extraction algorithm; compression algorithms. In: ETSI ES 202 050 V1.1.4. European Telecommunications Standards Institute.

38
- 33750285581
- Tessier, E., Berthommier, F., Glotin, H., Choi, S., 1999. A CASA front-end using the localisation cue for segregation and then cocktail-party speech recognition. In: Proc. IEEE International Conference on Speech Processing, pp. 97-102.

39
- 85009212472
- van Hamme, H., 2003. Robust speech recognition using missing feature theory in the cepstral or LDA domain. In: Proceedings of the European Conference on Speech Communication and Technology '03, pp. 3089-3092.

40
- 0003462953
- Wiley, New York, NY
- van Trees H.L. Detection, Estimation, and Modulation Theory, Part I (1968), Wiley, New York, NY
- (1968) Detection, Estimation, and Modulation Theory, Part I
- van Trees, H.L.¹

41
- 0025681008
- Varga, A.P., Moore, R.K., 1990. Hidden Markov model decomposition of speech and noise. In: Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing '90, pp. 845-848.

42
- 33750297707
- Varga, A.P., Steeneken, H.J.M., Tomlinson, M., Jones, D., 1992. The NOISEX-92 study on the effect of additive noise on automatic speech recognition. Technical Report, Speech Research Unit, Defence Research Agency, Malvern, UK.

43
- 0003571976
- Microsoft Corporation
- Young S., Kershaw D., Odell J., Valtchev V., and Woodland P. The HTK Book (for HTK Version 3.0) (2000), Microsoft Corporation
- (2000) The HTK Book (for HTK Version 3.0)
- Young, S.¹ Kershaw, D.² Odell, J.³ Valtchev, V.⁴ Woodland, P.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.