SCOPUS Information Retrieval Platform

Volume 25, Issue 3, 2011, Pages 571-584

Sub-band temporal modulation envelopes and their normalization for automatic speech recognition in reverberant environments

Lu, Xugang (a); Unoki, Masashi (b); Nakamura, Satoshi (a)

a NATIONAL INSTITUTE OF INFORMATION AND COMMUNICATIONS TECHNOLOGY (Japan)

b JAPAN ADVANCED INSTITUTE OF SCIENCE AND TECHNOLOGY (Japan)

Author keywords

Automatic speech recognition; Speech reverberation; Sub band temporal modulation envelope; Temporal modulation

Indexed keywords

AUTOMATIC SPEECH RECOGNITION; BAND PASS; CONSTANT BANDWIDTH; EXTRACTION METHOD; HILBERT TRANSFORM; INVERSE FOURIER TRANSFORMS; LOW-PASS FILTERING; MEAN AND VARIANCE NORMALIZATIONS; MEL-FREQUENCY CEPSTRAL COEFFICIENTS; MODULATION SPECTRUM; MODULATION TRANSFER; MODULATION TRANSFER FUNCTION; NORMALIZATION METHODS; PHASE INFORMATION; RECOGNITION PERFORMANCE; RECORDING ENVIRONMENT; REVERBERANT ENVIRONMENT; REVERBERANT ROOM; SUB-BAND TEMPORAL MODULATION ENVELOPE; SUB-BANDS; TEMPORAL FILTERING; TEMPORAL MODULATION; TEMPORAL MODULATIONS;

BANDPASS FILTERS; FOURIER TRANSFORMS; LOW PASS FILTERS; MODULATION; REVERBERATION;

SPEECH RECOGNITION;

EID: 79952625580 PISSN: 08852308 EISSN: 10958363 Source Type: Journal
DOI: 10.1016/j.csl.2010.10.002 Document Type: Article

Times cited: 10

References (39)

1
- 0017659025
- Multimicrophone signal-processing technique to remove room reverberation from speech signals
- DOI 10.1121/1.381621
- J.B. Allen, D.A. Berkley, and J. Blauert Multi-microphone signal-processing technique to remove room reverberation from speech signals Journal of the Acoustical Society of America 62 4 1977 912 915 (Pubitemid 8199278)
- (1977) Journal of the Acoustical Society of America , vol.62 , Issue.4 , pp. 912-915
- Allen, J.B.¹ Berkley, D.A.² Blauert, J.³

2
- 79952626638
- http://sp.shinshu-u.ac.jp/CENSREC/. AURORA-2J database.

3
- 0031347666
- On the properties of temporal processing for speech in adverse environments
- Mohonk Mountain House New Paltz, New York
- C. Avendano, and H. Hermansky On the properties of temporal processing for speech in adverse environments Workshop on Applications of Signal Processing to Audio and Acoustics 1997 Mohonk Mountain House New Paltz, New York
- (1997) Workshop on Applications of Signal Processing to Audio and Acoustics
- Avendano, C.¹ Hermansky, H.²

4
- 0018455310
- Suppression of acoustic noise in speech using spectral subtraction
- S.F. Boll Suppression of acoustic noise in speech using spectral subtraction IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP-27 1979 113 120
- (1979) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.27 , pp. 113-120
- Boll, S.F.¹

5
- 0003980102
- 1st ed. Springer-Verlag Berlin
- M.S. Brandstein, D.B. Ward, Microphone Arrays: Signal Processing Techniques and Applications 1st ed. 2000 Springer-Verlag Berlin
- (2000) Microphone Arrays: Signal Processing Techniques and Applications
- Brandstein, M.S.¹ Ward, D.B.²

6
- 42549139762
- MVA processing of speech features
- C.P. Chen, and J. Bilmes MVA processing of speech features IEEE Transactions on Audio, Speech, and Language Processing 15 1 2007 257 270
- (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , Issue.1 , pp. 257-270
- Chen, C.P.¹ Bilmes, J.²

7
- 0028287770
- Effect of reducing slow temporal modulations on speech reception
- DOI 10.1121/1.409836
- R. Drullman, J.M. Festen, and R. Plomp Effect of reducing slow temporal modulations on speech reception Journal of the Acoustical Society of America 95 5 1994 2670 2680 (Pubitemid 24152861)
- (1994) Journal of the Acoustical Society of America , vol.95 , Issue.5 , pp. 2670-2680
- Drullman, R.¹ Festen, J.M.² Plomp, R.³

8
- 0021892216
- Speech enhancement using a minimum mean square error log-spectral amplitude estimator
- Y. Ephraim, and D. Malah Speech enhancement using a minimum mean square error log-spectral amplitude estimator IEEE Transactions on Acoustics, Speech and Signal Processing 33 2 1985 443 445
- (1985) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.33 , Issue.2 , pp. 443-445
- Ephraim, Y.¹ Malah, D.²

9
- 0003515694
- Marcel Dekker, Inc. New York
- S. Furui, and M.M. Sondhi Advances in Speech Signal Processing 1991 Marcel Dekker, Inc. New York
- (1991) Advances in Speech Signal Processing
- Furui, S.¹ Sondhi, M.M.²

10
- 0028517164
- RASTA processing of speech
- H. Hermansky, and N. Morgan RASTA processing of speech IEEE Transactions on Speech and Audio Processing 2 4 1994 578 589
- (1994) IEEE Transactions on Speech and Audio Processing , vol.2 , Issue.4 , pp. 578-589
- Hermansky, H.¹ Morgan, N.²

11
- 0027166410
- Recognition of speech in additive and convolutional noise based on RASTA spectral processing
- H. Hermansky, N. Morgan, and H.G. Hirsch Recognition of speech in additive and convolutional noise based on RASTA spectral processing Proc. ICASSP'93 1993 83 86
- (1993) Proc. ICASSP'93 , pp. 83-86
- Hermansky, H.¹ Morgan, N.² Hirsch, H.G.³

12
- 0141587024
- Speech waveform recovery from a reverberant speech signal using inverse filtering of the power envelope transfer function
- S. Hirobayashi, H. Nomura, T. Koike, and M. Tohyama Speech waveform recovery from a reverberant speech signal using inverse filtering of the power envelope transfer function. IEICE Transactions A J81-A 10 1998 1323 1330
- (1998) IEICE Transactions A , vol.J81-A , Issue.10 , pp. 1323-1330
- Hirobayashi, S.¹ Nomura, H.² Koike, T.³ Tohyama, M.⁴

13
- 4344705227
- Validation of blind dereverberation using power envelope inverse filtering and filter banks
- S. Hirobayashi, and T. Yamabuchi Validation of blind dereverberation using power envelope inverse filtering and filter banks IEICE Transactions A J83-A 8 2000 1029 1033
- (2000) IEICE Transactions A , vol.J83-A , Issue.8 , pp. 1029-1033
- Hirobayashi, S.¹ Yamabuchi, T.²

14
- 0015553712
- The modulation transfer function in room acoustics as a predictor of speech intelligibility
- T. Houtgast, and H.J.M. Steeneken The modulation transfer function in room acoustics as a predictor of speech intelligibility Acustica 28 1973 66 73
- (1973) Acustica , vol.28 , pp. 66-73
- Houtgast, T.¹ Steeneken, H.J.M.²

15
- 84873312246
- A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria
- T. Houtgast, and H.J.M. Steeneken A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria Journal of the Acoustical Society of America 77 3 1985 1069 1077
- (1985) Journal of the Acoustical Society of America , vol.77 , Issue.3 , pp. 1069-1077
- Houtgast, T.¹ Steeneken, H.J.M.²

16
- 0032676337
- On the relative importance of various components of the modulation spectrum for automatic speech recognition
- N. Kanedera, T. Arai, H. Hermansky, and M. Pavel On the relative importance of various components of the modulation spectrum for automatic speech recognition Speech Communication 28 1 1999 43 55
- (1999) Speech Communication , vol.28 , Issue.1 , pp. 43-55
- Kanedera, N.¹ Arai, T.² Hermansky, H.³ Pavel, M.⁴

17
- 33947694356
- Spectral subtraction steered by multi-step forward linear prediction for single channel speech dereverberation
- K. Kinoshita, T. Nakatani, and M. Miyoshi Spectral subtraction steered by multi-step forward linear prediction for single channel speech dereverberation Proc. ICASSP'06, I 2006 817 820
- (2006) Proc. ICASSP'06, I , pp. 817-820
- Kinoshita, K.¹ Nakatani, T.² Miyoshi, M.³

18
- 0005093211
- Efficient cepstral normalization for robust speech recognition
- Liu, F.; Stern, R.; Huang, X.; Acero, A.; 1993. Efficient cepstral normalization for robust speech recognition. In: Proceedings of ARPA Human Language Technology Workshop.
- (1993) Proceedings of ARPA Human Language Technology Workshop
- Liu, F.¹ Stern, R.² Huang, X.³ Acero, A.⁴

19
- 44949119574
- A robust feature extraction based on the MTF concept for speech recognition in reverberant environment
- X. Lu, M. Unoki, and M. Akagi A robust feature extraction based on the MTF concept for speech recognition in reverberant environment Proc. ICSLP'06 2006 2546 2549
- (2006) Proc. ICSLP'06 , pp. 2546-2549
- Lu, X.¹ Unoki, M.² Akagi, M.³

20
- 56549098616
- Comparative evaluation of modulation-transfer-function-based blind restoration of sub-band power envelopes of speech as a front-end processor for automatic speech recognition systems
- X. Lu, M. Unoki, and M. Akagi Comparative evaluation of modulation-transfer-function-based blind restoration of sub-band power envelopes of speech as a front-end processor for automatic speech recognition systems Acoustical Science and Technology 29 6 2008 351 361
- (2008) Acoustical Science and Technology , vol.29 , Issue.6 , pp. 351-361
- Lu, X.¹ Unoki, M.² Akagi, M.³

21
- 84867218794
- Effect of compressing the dynamic range of the power spectrum in modulation filtering based speech enhancement
- J.G. Lyons, and K.K. Paliwal Effect of compressing the dynamic range of the power spectrum in modulation filtering based speech enhancement Proc. INTERSPEECH'08 2008 387 390
- (2008) Proc. INTERSPEECH'08 , pp. 387-390
- Lyons, J.G.¹ Paliwal, K.K.²

22
- 0023961145
- Inverse filtering of room acoustics
- M. Miyoshi, and Y. Kaneda Inverse filtering of room acoustics IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP-36 1988 145 152
- (1988) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.36 , pp. 145-152
- Miyoshi, M.¹ Kaneda, Y.²

23
- 0038238630
- A survey on automatic speech recognition
- S. Nakagawa A survey on automatic speech recognition IEICE Transactions D-II J83-D-II 2 2000 433 457
- (2000) IEICE Transactions D-II , vol.J83-D-II , Issue.2 , pp. 433-457
- Nakagawa, S.¹

24
- 34548571735
- Harmonicity-based blind dereverberation for single-channel speech signals
- T. Nakatani, K. Kinoshita, and M. Miyoshi Harmonicity-based blind dereverberation for single-channel speech signals IEEE Transactions on Audio, Speech, and Language Processing 15 1 2007 80 95
- (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , Issue.1 , pp. 80-95
- Nakatani, T.¹ Kinoshita, K.² Miyoshi, M.³

25
- 0141830958
- Blind dereverberation of single channel speech signal based on harmonic structure
- T. Nakatani, and M. Miyoshi Blind dereverberation of single channel speech signal based on harmonic structure Proc. ICASSP'03, 1 2003 92 95
- (2003) Proc. ICASSP'03, 1 , pp. 92-95
- Nakatani, T.¹ Miyoshi, M.²

26
- 70450139534
- Blind dereverberation of monaural speech signals based on harmonic structure
- Nakatani, T.; Miyoshi, M.; Kinoshita, K.; 2005. Blind dereverberation of monaural speech signals based on harmonic structure. IEICE D-II, J88-D-II (3), 509-520.
- (2005) IEICE D-II, J88-D-II , Issue.3 , pp. 509-520
- Nakatani, T.¹ Miyoshi, M.² Kinoshita, K.³

27
- 79751529584
- Integration of audio-visual sensors and technologies in a smart room
- Neumann, J.; Gasas, J.R.; Macho, D.; Hidalgo, J.R.; 2007. Integration of audio-visual sensors and technologies in a smart room. Personal and Ubiquitous Computing. Springer London, ISSN: 1617-4909 (print).
- (2007) Personal and Ubiquitous Computing , Springer London , ISSN 1617-4909 (print)
- Neumann, J.¹ Gasas, J.R.² Macho, D.³ Hidalgo, J.R.⁴

28
- 0019635319
- Modulation transfer function: definition and measurement
- M.R. Schroeder Modulation transfer function: definition and measurement Acustica 49 1981 179 182 (Pubitemid 12508801)
- (1981) Acustica , vol.49 , Issue.3 , pp. 179-182
- Schroeder, M.R.¹

29
- 33744731363
- A comparative study of filter bank spacing for speech recognition
- B.J. Shannon, and K.K. Paliwal A comparative study of filter bank spacing for speech recognition Microelectronic Engineering Research Conference 2003 1 3
- (2003) Microelectronic Engineering Research Conference , pp. 1-3
- Shannon, B.J.¹ Paliwal, K.K.²

30
- 0028823541
- Speech recognition with primarily temporal cues
- R.V. Shannon, F. Zeng, V. Kamath, J. Wygonski, and M. Ekelid Speech recognition with primarily temporal cues Science 270 1995 303 304
- (1995) Science , vol.270 , pp. 303-304
- Shannon, R.V.¹ Zeng, F.² Kamath, V.³ Wygonski, J.⁴ Ekelid, M.⁵

31
- 0010516808
- Hands-free speech recognition by HMM composition in noisy reverberant environments
- T. Takiguchi, S. Nakamura, and K. Shikano Hands-free speech recognition by HMM composition in noisy reverberant environments IEICE Transactions D-II J79-D-II 12 1996 2047 2053
- (1996) IEICE Transactions D-II , vol.J79-D-II , Issue.12 , pp. 2047-2053
- Takiguchi, T.¹ Nakamura, S.² Shikano, K.³

32
- 0003822743
- (version 3.2), 2002. Cambridge University Engineering Department
- The HTK Book (version 3.2), 2002. Cambridge University Engineering Department.
- The HTK Book

33
- 4344685385
- An improved method based on the MTF concept for restoring the power envelope from a reverberant signal
- M. Unoki, M. Furukawa, K. Sakata, and M. Akagi An improved method based on the MTF concept for restoring the power envelope from a reverberant signal Acoustical Science and Technology 25 4 2004 232 242
- (2004) Acoustical Science and Technology , vol.25 , Issue.4 , pp. 232-242
- Unoki, M.¹ Furukawa, M.² Sakata, K.³ Akagi, M.⁴

34
- 51449100217
- Comparative evaluations of robust and accurate F0 estimates in reverberant environments
- M. Unoki, T. Hosorogiya, and Y. Ishimoto Comparative evaluations of robust and accurate F0 estimates in reverberant environments Proc. ICASSP'08 2008 4569 4572
- (2008) Proc. ICASSP'08 , pp. 4569-4572
- Unoki, M.¹ Hosorogiya, T.² Ishimoto, Y.³

35
- 4344573437
- A speech dereverberation method based on the MTF concept in power envelope restoration
- M. Unoki, K. Sakata, M. Furukawa, and M. Akagi A speech dereverberation method based on the MTF concept in power envelope restoration Acoustical Science and Technology 25 4 2004 243 254
- (2004) Acoustical Science and Technology , vol.25 , Issue.4 , pp. 243-254
- Unoki, M.¹ Sakata, K.² Furukawa, M.³ Akagi, M.⁴

36
- 84889779628
- 2nd ed. John Wiley & Sons, Ltd.
- S.V. Vaseghi Advanced Digital Signal Processing and Noise Reduction 2nd ed. 2000 John Wiley & Sons, Ltd.
- (2000) Advanced Digital Signal Processing and Noise Reduction
- Vaseghi, S.V.¹

37
- 0141957802
- Efficient alternatives to the Ephraim and Malah suppression rule for audio signal enhancement
- P.J. Wolfe, and S.J. Godsill Efficient alternatives to the Ephraim and Malah suppression rule for audio signal enhancement EURASIP Journal on Applied Signal Processing 10 2003 1043 1051
- (2003) EURASIP Journal on Applied Signal Processing , vol.10 , pp. 1043-1051
- Wolfe, P.J.¹ Godsill, S.J.²

38
- 34347376319
- Temporal structure normalization of speech feature for robust speech recognition
- DOI 10.1109/LSP.2006.891341
- X. Xiao, E.S. Chng, and H. Li Temporal structure normalization of speech feature for robust speech recognition IEEE Signal Processing Letters 14 7 2007 500 503 (Pubitemid 47018924)
- (2007) IEEE Signal Processing Letters , vol.14 , Issue.7 , pp. 500-503
- Xiao, X.¹ Chng, E.S.² Li, H.³

39
- 70350016998
- Normalization of speech modulation spectra for robust speech recognition
- X. Xiao, E.S. Chng, and H. Li Normalization of speech modulation spectra for robust speech recognition IEEE Transactions on Audio, Speech, and Language Processing 16 8 2008 1662 1674
- (2008) IEEE Transactions on Audio, Speech, and Language Processing , vol.16 , Issue.8 , pp. 1662-1674
- Xiao, X.¹ Chng, E.S.² Li, H.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.