Volume 128, Issue 6, 2010, Pages 3769-3780

Temporal envelope compensation for robust phoneme recognition using modulation spectrum

Author keywords

[No Author keywords available]

Indexed keywords

ADAPTIVE LOOPS; ANALYSIS TECHNIQUES; CHANNEL NOISE; CONVERSATIONAL TELEPHONE SPEECH; FEATURE EXTRACTION TECHNIQUES; FREQUENCY DOMAINS; LINEAR PREDICTION; MODULATION FREQUENCIES; MODULATION SPECTRUM; NOISE COMPENSATION; NOISY SPEECH; PHONEME RECOGNITION; PROCESSING STAGE; ROBUST SPEECH; SPEECH SIGNALS; SUB-BANDS; TEMPORAL ENVELOPES; TEST DATA;

EID: 79952171347     PISSN: 00014966     EISSN: None     Source Type: Journal    
DOI: 10.1121/1.3504658     Document Type: Article
Times cited : (29)

References (40)
  • 1
    • 36248966385
    • Autoregressive modelling of temporal envelopes
    • Athineos, M., and Ellis, D. P. W. (2007). "Autoregressive modelling of temporal envelopes," IEEE Trans. Signal Process. 55(11), 5237-5245.
    • (2007) IEEE Trans. Signal Process. , vol.55 , Issue.11 , pp. 5237-5245
    • Athineos, M.1    Ellis, D.P.W.2
  • 3
    • 0031192532
    • On the effects of short-term spectrum smoothing in channel normalization
    • PII S1063667697048591
    • Avendano, C., and Hermansky, H. (1997). "On the effects of short-term spectrum smoothing in channel normalization," IEEE Trans. Speech Audio Process. 5(4), 372-374. (Pubitemid 127746010)
    • (1997) IEEE Transactions on Speech and Audio Processing , vol.5 , Issue.4 , pp. 372-374
    • Avendano, C.1    Hermansky, H.2
  • 6
    • 0029952425
    • A quantitative model of the 'effective' signal processing in the auditory system: I. Model structure
    • Dau, T., Püschel, D., and Kohlrausch, A. (1996). "A quantitative model of the 'effective' signal processing in the auditory system: I. Model structure," J. Acoust. Soc. Am. 99(6), 3615-3622.
    • (1996) J. Acoust. Soc. Am. , vol.99 , Issue.6 , pp. 3615-3622
    • Dau, T.1    Püschel, D.2    Kohlrausch, A.3
  • 10
    • 70450185608
    • Noise suppression based on extending a speech-dominated modulation band
    • Falk, T. H., Stadler, S., Kleijn, W. B., and Chan, W. Y. (2007). "Noise suppression based on extending a speech-dominated modulation band," in Proceedings of Interspeech, pp. 970-973.
    • (2007) Proceedings of Interspeech , pp. 970-973
    • Falk, T.H.1    Stadler, S.2    Kleijn, W.B.3    Chan, W.Y.4
  • 11
    • 58649102246
    • Modulation frequency features for phoneme recognition in noisy speech
    • Ganapathy, S., Thomas, S., and Hermansky, H. (2009). "Modulation frequency features for phoneme recognition in noisy speech," J. Acoust. Soc. Am., Express Lett. 125(1), EL8-EL12.
    • (2009) J. Acoust. Soc. Am., Express Lett. , vol.125 , Issue.1 , pp. EL8-EL12
    • Ganapathy, S.1    Thomas, S.2    Hermansky, H.3
  • 14
    • 85009252959
    • Double the trouble: Handling noise and reverberation in far-field automatic speech recognition
    • Gelbart, D., and Morgan, N. (2002). "Double the trouble: Handling noise and reverberation in far-field automatic speech recognition," in Proceedings of Interspeech, pp. 2185-2188.
    • (2002) Proceedings of Interspeech , pp. 2185-2188
    • Gelbart, D.1    Morgan, N.2
  • 16
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • Hermansky, H. (1990). "Perceptual linear predictive (PLP) analysis of speech," J. Acoust. Soc. Am. 87(4), 1738-1752.
    • (1990) J. Acoust. Soc. Am. , vol.87 , Issue.4 , pp. 1738-1752
    • Hermansky, H.1
  • 17
    • 33745213373
    • Multi-resolution RASTA filtering for TANDEM-based ASR
    • Hermansky, H., and Fousek, P. (2005). "Multi-resolution RASTA filtering for TANDEM-based ASR," in Proceedings of Interspeech, pp. 361-364.
    • (2005) Proceedings of Interspeech , pp. 361-364
    • Hermansky, H.1    Fousek, P.2
  • 19
    • 0003235731
    • TRAPS - Classifiers of Temporal Patterns
    • Hermansky, H., and Sharma, S. (1998). "TRAPS - Classifiers of Temporal Patterns," in Proceedings of Interspeech, pp. 1817-1820.
    • (1998) Proceedings of Interspeech , pp. 1817-1820
    • Hermansky, H.1    Sharma, S.2
  • 21
    • 33745206705
    • The simulation of realistic acoustic input scenarios for speech recognition systems
    • Hirsch, H. G., and Finster, H. (2005). "The simulation of realistic acoustic input scenarios for speech recognition systems," in Proceedings of Interspeech, pp. 2697-2700.
    • (2005) Proceedings of Interspeech , pp. 2697-2700
    • Hirsch, H.G.1    Finster, H.2
  • 22
    • 0019060580
    • Predicting speech intelligibility in rooms from the modulation transfer function, I. General room acoustics
    • Houtgast, T., Steeneken, H. J. M., and Plomp, R. (1980). "Predicting speech intelligibility in rooms from the modulation transfer function, I. General room acoustics," Acustica 46, 60-72. (Pubitemid 11477041)
    • (1980) Acustica , vol.46 , Issue.1 , pp. 60-72
    • Houtgast, T.1    Steeneken, H.J.M.2    Plomp, R.3
  • 23
    • 0032136330
    • Robust speech recognition using the modulation spectrogram
    • PII S0167639398000326
    • Kingsbury, B. E. D., Morgan, N., and Greenberg, S. (1998). "Robust speech recognition using the modulation spectrogram," Speech Commun. 25(1-3), 117-132. (Pubitemid 128413637)
    • (1998) Speech Communication , vol.25 , Issue.1-3 , pp. 117-132
    • Kingsbury, B.E.D.1    Morgan, N.2    Greenberg, S.3
  • 24
    • 0033004349
    • Model-based approach to envelope and positive instantaneous frequency estimation of signals with speech applications
    • Kumaresan, R., and Rao, A. (1999). "Model-based approach to envelope and positive instantaneous frequency estimation of signals with speech applications," J. Acoust. Soc. Am. 105(3), 1912-1924.
    • (1999) J. Acoust. Soc. Am. , vol.105 , Issue.3 , pp. 1912-1924
    • Kumaresan, R.1    Rao, A.2
  • 25
    • 0024768209
    • Speaker independent phone recognition using hidden Markov models
    • Lee, K. F. (1989). "Speaker independent phone recognition using hidden Markov models," IEEE Trans. Acoust., Speech, Signal Process. 37(11), 1641-1648.
    • (1989) IEEE Trans. Acoust., Speech, Signal Process. , vol.37 , Issue.11 , pp. 1641-1648
    • Lee, K.F.1
  • 26
    • 0016495091
    • Linear prediction: A tutorial review
    • Makhoul, J. (1975). "Linear prediction: A tutorial review," Proc. IEEE 63(4), 561-580.
    • (1975) Proc. IEEE , vol.63 , Issue.4 , pp. 561-580
    • Makhoul, J.1
  • 27
    • 0032634932
    • Computing the discrete-time analytic signal via FFT
    • Marple, L. S. (1999). "Computing the discrete-time analytic signal via FFT," IEEE Trans. Signal Process. 47(9), 2600-2603.
    • (1999) IEEE Trans. Signal Process. , vol.47 , Issue.9 , pp. 2600-2603
    • Marple, L.S.1
  • 30
    • 0003200767
    • The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions
    • Pearce, D., and Hirsch, H. G. (2000). "The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions," in ISCA Tutorial and Research Workshop ASR2000, pp. 29-32.
    • (2000) ISCA Tutorial and Research Workshop ASR2000 , pp. 29-32
    • Pearce, D.1    Hirsch, H.G.2
  • 33
    • 0001613977
    • Differential sensitivity of the ear for pure tones
    • Riesz, R. R. (1928). "Differential sensitivity of the ear for pure tones," Phys. Rev. 31, 867-875.
    • (1928) Phys. Rev. , vol.31 , pp. 867-875
    • Riesz, R.R.1
  • 35
    • 0028823541
    • Speech recognition with primarily temporal cues
    • Shannon, R. V., Zeng, F. G., Kamath, V., Wygonski, J., and Ekelid, M. (1995). "Speech recognition with primarily temporal cues," Science 270(5234), 303-304.
    • (1995) Science , vol.270 , Issue.5234 , pp. 303-304
    • Shannon, R.V.1    Zeng, F.G.2    Kamath, V.3    Wygonski, J.4    Ekelid, M.5
  • 37
    • 0032828464
    • A model of auditory perception as front end for automatic speech recognition
    • Tchorz, J., and Kollmeier, B. (1999). "A model of auditory perception as front end for automatic speech recognition," J. Acoust. Soc. Am. 106(4), 2040-2050.
    • (1999) J. Acoust. Soc. Am. , vol.106 , Issue.4 , pp. 2040-2050
    • Tchorz, J.1    Kollmeier, B.2
  • 38
    • 67650107416
    • Recognition of reverberant speech using frequency domain linear prediction
    • Thomas, S., Ganapathy, S., and Hermansky, H. (2008). "Recognition of reverberant speech using frequency domain linear prediction," IEEE Signal Process. Lett. 15, 681-684.
    • (2008) IEEE Signal Process. Lett. , vol.15 , pp. 681-684
    • Thomas, S.1    Ganapathy, S.2    Hermansky, H.3
  • 40
    • 35248862134
    • Spectral and temporal cues for phoneme recognition in noise
    • DOI 10.1121/1.2767000
    • Xu, L., and Zheng, Y. (2007). "Spectral and temporal cues for phoneme recognition in noise," J. Acoust. Soc. Am. 122(3), 1758-1764. (Pubitemid 47560537)
    • (2007) Journal of the Acoustical Society of America , vol.122 , Issue.3 , pp. 1758-1764
    • Xu, L.1    Zheng, Y.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.