SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

2013, Pages 872-875

Spectro-temporal directional derivative features for automatic speech recognition

Authors (5): Gibson, James (a); Van Segbroeck, Maarten (a); Ortega, Antonio (a); Georgiou, Panayiotis (a); Narayanan, Shrikanth (a)

(a) University of Southern California (United States)

Author keywords

Automatic speech recognition; Directional wavelet transforms; Spectro temporal features

Indexed keywords

IMAGE PROCESSING; WAVELET TRANSFORMS;

AUTOMATIC SPEECH RECOGNITION; DIRECTIONAL DERIVATIVE; DIRECTIONAL WAVELET TRANSFORM; FEATURE REPRESENTATION; ROBUST AUTOMATIC SPEECH RECOGNITION; SPECTRO-TEMPORAL FEATURES; VOICE ACTIVITY DETECTION; WAVELET TRANSFORMATIONS;

SPEECH RECOGNITION;

EID: 84906282217 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (5)

References (25)

1
- 85009233038
- Improving word accuracy with Gabor feature extraction
- M. Kleinschmidt and D. Gelbart, "Improving word accuracy with Gabor feature extraction," in Proc. ICSLP, vol. 5, 2002, pp. 16-38.
- (2002) Proc. ICSLP , vol.5 , pp. 16-38
- Kleinschmidt, M.¹ Gelbart, D.²

2
- 85009227802
- Localized spectro-temporal features for automatic speech recognition
- Citeseer
- M. Kleinschmidt, "Localized spectro-temporal features for automatic speech recognition," in Proc. Eurospeech, vol. 87. Citeseer, 2003.
- (2003) Proc. Eurospeech , vol.87
- Kleinschmidt, M.¹

3
- 34547509128
- Representation of phonemes in primary auditory cortex: How the brain analyzes speech
- IV-765
- N. Mesgarani, S. David, and S. Shamma, "Representation of phonemes in primary auditory cortex: how the brain analyzes speech," in Proc. ICASSP, vol. 4, 2007, pp. IV-765.
- (2007) Proc. ICASSP , vol.4
- Mesgarani, N.¹ David, S.² Shamma, S.³

4
- 0038711696
- A spectro-temporal modulation index (STMI) for assessment of speech intelligibility
- M. Elhilali, T. Chi, and S. A. Shamma, "A spectro-temporal modulation index (STMI) for assessment of speech intelligibility," Speech Communication, vol. 41, no. 2, pp. 331-348, 2003.
- (2003) Speech Communication , vol.41 , Issue.2 , pp. 331-348
- Elhilali, M.¹ Chi, T.² Shamma, S.A.³

5
- 34047272330
- Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations
- N. Mesgarani, M. Slaney, and S. Shamma, "Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 14, no. 3, pp. 920-930, 2006.
- (2006) Audio, Speech, and Language Processing, IEEE Transactions on , vol.14 , Issue.3 , pp. 920-930
- Mesgarani, N.¹ Slaney, M.² Shamma, S.³

6
- 84865769808
- Comparing different flavors of spectro-temporal features for ASR
- B. Meyer, S. Ravuri, M. Schädler, and N. Morgan, "Comparing different flavors of spectro-temporal features for ASR," in Proc. of Interspeech, 2011, pp. 1269-1272.
- (2011) Proc. of Interspeech , pp. 1269-1272
- Meyer, B.¹ Ravuri, S.² Schädler, M.³ Morgan, N.⁴

7
- 84890497049
- Hooking up spectro-temporal filters with auditory-inspired representations for robust automatic speech recognition
- B. Meyer, C. Spille, B. Kollmeier, and N. Morgan, "Hooking up spectro-temporal filters with auditory-inspired representations for robust automatic speech recognition," in Proc. Interspeech, vol. 15, 2012, p. 20.
- (2012) Proc. Interspeech , vol.15 , pp. 20
- Meyer, B.¹ Spille, C.² Kollmeier, B.³ Morgan, N.⁴

8
- 84878611488
- Normalization of spectrotemporal Gabor filter bank features for improved robust automatic speech recognition systems
- M. R. Schädler and B. Kollmeier, "Normalization of spectrotemporal Gabor filter bank features for improved robust automatic speech recognition systems," in Proc. Interspeech, 2012.
- (2012) Proc. Interspeech
- Schädler, M.R.¹ Kollmeier, B.²

9
- 84878395103
- Longer features: They do a speech detector good
- T. Tsai and N. Morgan, "Longer features: They do a speech detector good," in Proc. Interspeech, 2012.
- (2012) Proc. Interspeech
- Tsai, T.¹ Morgan, N.²

10
- 84863799482
- Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition
- M. R. Schädler, B. T. Meyer, and B. Kollmeier, "Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition," The Journal of the Acoustical Society of America, vol. 131, p. 4134, 2012.
- (2012) The Journal of the Acoustical Society of America , vol.131 , pp. 4134
- Schädler, M.R.¹ Meyer, B.T.² Kollmeier, B.³

11
- 0141624530
- An efficient auditory filterbank based on the gammatone function
- R. Patterson, I. Nimmo-Smith, J. Holdsworth, and P. Rice, "An efficient auditory filterbank based on the gammatone function," APU report, vol. 2341, 1988.
- (1988) APU Report , vol.2341
- Patterson, R.¹ Nimmo-Smith, I.² Holdsworth, J.³ Rice, P.⁴

12
- 0032136330
- Robust speech recognition using the modulation spectrogram
- B. E. Kingsbury, N. Morgan, and S. Greenberg, "Robust speech recognition using the modulation spectrogram," Speech Communication, vol. 25, no. 1, pp. 117-132, 1998.
- (1998) Speech Communication , vol.25 , Issue.1 , pp. 117-132
- Kingsbury, B.E.¹ Morgan, N.² Greenberg, S.³

13
- 0026626445
- Auditory representations of acoustic signals
- X. Yang, K. Wang, and S. A. Shamma, "Auditory representations of acoustic signals," Information Theory, IEEE Transactions on, vol. 38, no. 2, pp. 824-839, 1992.
- (1992) Information Theory, IEEE Transactions on , vol.38 , Issue.2 , pp. 824-839
- Yang, X.¹ Wang, K.² Shamma, S.A.³

14
- 0026735337
- Shiftable multiscale transforms
- E. P. Simoncelli, W. T. Freeman, E. H. Adelson, and D. J. Heeger, "Shiftable multiscale transforms," Information Theory, IEEE Transactions on, vol. 38, no. 2, pp. 587-607, 1992.
- (1992) Information Theory, IEEE Transactions on , vol.38 , Issue.2 , pp. 587-607
- Simoncelli, E.P.¹ Freeman, W.T.² Adelson, E.H.³ Heeger, D.J.⁴

15
- 0029487233
- The steerable pyramid: A flexible architecture for multi-scale derivative computation
- E. Simoncelli and W. Freeman, "The steerable pyramid: A flexible architecture for multi-scale derivative computation," in Proc. ICIP, vol. 3, 1995, pp. 444-447.
- (1995) Proc. ICIP , vol.3 , pp. 444-447
- Simoncelli, E.¹ Freeman, W.²

16
- 84906256966
- Steerable pyramid toolbox
- E. Simoncelli. (2003) Steerable pyramid toolbox. [Online]. Available: http://www.cis.upenn.edu/eero/steerpyr.html.
- (2003) Steerable Pyramid Toolbox
- Simoncelli, E.¹

17
- 84906262880
- Gabor filter bank (GBFB) feature extraction reference implementation in MATLAB
- M. R. Schädler. (2011) Gabor filter bank (GBFB) feature extraction reference implementation in MATLAB. [Online]. Available: http://medi.uni-oldenburg.de/56428.html.
- (2011) Gabor Filter Bank (GBFB) Feature Extraction Reference Implementation in MATLAB
- Schädler, M.R.¹

18
- 4544279104
- The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
- H. Hirsch and D. Pearce, "The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions," in Automatic Speech Recognition: Challenges for the new Millenium ISCA Tutorial and Research Workshop (ITRW), 2000.
- (2000) Automatic Speech Recognition: Challenges for the New Millenium ISCA Tutorial and Research Workshop (ITRW)
- Hirsch, H.¹ Pearce, D.²

19
- 0003822743
- The HTK book
- S. Young, G. Evermann, D. Kershaw, G. Moore, J. Odell, D. Ollason, V. Valtchev, and P. Woodland, "The HTK book," Cambridge University Engineering Department, vol. 3, 2002.
- (2002) The HTK Book , vol.3
- Young, S.¹ Evermann, G.² Kershaw, D.³ Moore, G.⁴ Odell, J.⁵ Ollason, D.⁶ Valtchev, V.⁷ Woodland, P.⁸

20
- 84883097102
- On the importance of various modulation frequencies for speech recognition
- N. Kanedera, T. Arai, H. Hermansky, and M. Pavel, "On the importance of various modulation frequencies for speech recognition," in Proc. Eurospeech, vol. 97, 1997, pp. 1079-1082.
- (1997) Proc. Eurospeech , vol.97 , pp. 1079-1082
- Kanedera, N.¹ Arai, T.² Hermansky, H.³ Pavel, M.⁴

21
- 33646064275
- Multi-resolution RASTA filtering for tandem-based ASR
- H. Hermansky and P. Fousek, "Multi-resolution RASTA filtering for tandem-based ASR," in Proc. Interspeech, 2005.
- (2005) Proc. Interspeech
- Hermansky, H.¹ Fousek, P.²

22
- 70450182191
- Tandem representations of spectral envelope and modulation frequency features for ASR
- S. Thomas, S. Ganapathy, and H. Hermansky, "Tandem representations of spectral envelope and modulation frequency features for ASR," in Proc. Interspeech, 2009.
- (2009) Proc. Interspeech
- Thomas, S.¹ Ganapathy, S.² Hermansky, H.³

23
- 0034427366
- Curvelets, multiresolution representation, and scaling laws
- E. Candes and D. Donoho, "Curvelets, multiresolution representation, and scaling laws," in Proc. SPIE, vol. 4119, no. 1, 2000.
- (2000) Proc. SPIE , vol.4119 , Issue.1
- Candes, E.¹ Donoho, D.²

24
- 28944432472
- The contourlet transform: An efficient directional multiresolution image representation
- M. Do and M. Vetterli, "The contourlet transform: An efficient directional multiresolution image representation," Image Processing, IEEE Transactions on, vol. 14, no. 12, pp. 2091-2106, 2005.
- (2005) Image Processing, IEEE Transactions on , vol.14 , Issue.12 , pp. 2091-2106
- Do, M.¹ Vetterli, M.²

25
- 0030369274
- Inclusion of temporal information into features for speech recognition
- B. Milner, "Inclusion of temporal information into features for speech recognition," in Proc. ICSLP, vol. 1, 1996, pp. 256-259.
- (1996) Proc. ICSLP , vol.1 , pp. 256-259
- Milner, B.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.