



Volume , Issue , 2014, Pages 2435-2439

Should deep neural nets have ears? The role of auditory features in deep learning approaches

Author keywords

Amplitude modulation filter bank; Deep learning; Deep neural network; Gabor features; Speech recognition

Indexed keywords

AMPLITUDE MODULATION; ARTIFICIAL INTELLIGENCE; FILTER BANKS; GABOR FILTERS; LEARNING SYSTEMS; SIGNAL PROCESSING; SPEECH COMMUNICATION;

EID: 84910029373     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (17)

References (30)
  • 1
    • 0031187171
    • Speech recognition by machines and humans
    • R. Lippmann, "Speech recognition by machines and humans," Speech Commun., vol. 22, no. 1, pp. 1-15, 1997.
    • (1997) Speech Commun , vol.22 , Issue.1 , pp. 1-15
    • Lippmann, R.1
  • 2
    • 34247580087
    • Reaching over the gap: A review of efforts to link human and automatic speech recognition research
    • O. Scharenborg, "Reaching over the gap: A review of efforts to link human and automatic speech recognition research," Speech Commun., pp. 336-347, 2007.
    • (2007) Speech Commun , pp. 336-347
    • Scharenborg, O.1
  • 3
    • 79953659090
    • Robustness of spectro-temporal features against intrinsic and extrinsic variations in automatic speech recognition
    • B. Meyer and B. Kollmeier, "Robustness of spectro-temporal features against intrinsic and extrinsic variations in automatic speech recognition," Speech Commun., vol. 53, pp. 753-767, 2011.
    • (2011) Speech Commun , vol.53 , pp. 753-767
    • Meyer, B.1    Kollmeier, B.2
  • 4
    • 84867585919
    • Understanding how deep belief networks perform acoustic modelling
    • A. R. Mohamed, G. Hinton and G. Penn, "Understanding how deep belief networks perform acoustic modelling," in Proc. ICASSP, 2012.
    • (2012) Proc. ICASSP
    • Mohamed, A.R.1    Hinton, G.2    Penn, G.3
  • 5
    • 84055222005
    • Context-dependent pretrained deep neural networks for large-vocabulary speech recognition
    • G. Dahl, D. Yu, L. Deng and A. Acero, "Context-dependent pretrained deep neural networks for large-vocabulary speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 30-42, 2012.
    • (2012) IEEE Transactions on Audio, Speech, and Language Processing , vol.20 , Issue.1 , pp. 30-42
    • Dahl, G.1    Yu, D.2    Deng, L.3    Acero, A.4
  • 8
    • 0033709098
    • Tandem connectionist feature extraction for conventional HMM systems
    • H. Hermansky, D. Ellis and S. Sharma, "Tandem connectionist feature extraction for conventional HMM systems," in Proc. Interspeech, 2000.
    • (2000) Proc. Interspeech
    • Hermansky, H.1    Ellis, D.2    Sharma, S.3
  • 10
    • 70450205161
    • Feature extraction for robust speech recognition using a power-law nonlinearity and power-bias subtraction
    • C. Kim and R. M. Stern, "Feature extraction for robust speech recognition using a power-law nonlinearity and power-bias subtraction," in Proc. Interspeech, 2009.
    • (2009) Proc. Interspeech
    • Kim, C.1    Stern, R.M.2
  • 12
    • 0037824480
    • Gabor analysis of auditory midbrain receptive fields: Spectro-temporal and binaural composition
    • A. Qiu, C. Schreiner and M. Escabi, "Gabor analysis of auditory midbrain receptive fields: Spectro-temporal and binaural composition," Journal of Neurophysiology, vol. 90, pp. 456-476, 2003.
    • (2003) Journal of Neurophysiology , vol.90 , pp. 456-476
    • Qiu, A.1    Schreiner, C.2    Escabi, M.3
  • 13
    • 34547509128
    • Representation of phonemes in primary auditory cortex: How the brain analyzes speech
    • N. Mesgarani, D. Stephen and S. Shamma, "Representation of phonemes in primary auditory cortex: How the brain analyzes speech," in Proc. ICASSP, 2007.
    • (2007) Proc. ICASSP
    • Mesgarani, N.1    Stephen, D.2    Shamma, S.3
  • 14
    • 27144544136
    • Improving word accuracy with Gabor feature extraction
    • M. Kleinschmidt and D. Gelbart, "Improving word accuracy with Gabor feature extraction," in Proc. Interspeech, 2002.
    • (2002) Proc. Interspeech
    • Kleinschmidt, M.1    Gelbart, D.2
  • 15
    • 84863799482
    • Spectrotemporal modulation subspace-spanning filter bank features for robust automatic speech recognition
    • M. R. Schädler, B. Kollmeier and B. T. Meyer, "Spectrotemporal modulation subspace-spanning filter bank features for robust automatic speech recognition," J. Acoust. Soc. Am., pp. 4134-4151, 2011.
    • (2011) J. Acoust. Soc. Am. , pp. 4134-4151
    • Schädler, M.R.1    Kollmeier, B.2    Meyer, B.T.3
  • 16
    • 84878415523
    • Hooking up spectro-temporal filters with auditory-inspired representations for robust automatic speech recognition
    • B. Meyer, C. Spille, B. Kollmeier and N. Morgan, "Hooking up spectro-temporal filters with auditory-inspired representations for robust automatic speech recognition," in Proc. Interspeech, 2012.
    • (2012) Proc. Interspeech
    • Meyer, B.1    Spille, C.2    Kollmeier, B.3    Morgan, N.4
  • 17
    • 84867619222
    • Spectro-temporal Gabor features for speaker recognition
    • H. Lei, B. Meyer and N. Mirghafori, "Spectro-temporal Gabor features for speaker recognition," in Proc. ICASSP, 2012.
    • (2012) Proc. ICASSP
    • Lei, H.1    Meyer, B.2    Mirghafori, N.3
  • 18
    • 0024241221
    • Periodicity coding in the inferior colliculus of the cat. I. Neuronal mechanisms
    • G. Langner and C. Schreiner, "Periodicity coding in the inferior colliculus of the cat. I. Neuronal mechanisms," J. of Neurophysiology, vol. 60, pp. 1799-1822, 1988.
    • (1988) J. of Neurophysiology , vol.60 , pp. 1799-1822
    • Langner, G.1    Schreiner, C.2
  • 19
    • 0028297185
    • Speech enhancement based on physiological and psychoacoustical models of modulation perception and binaural interaction
    • B. Kollmeier and R. Koch, "Speech enhancement based on physiological and psychoacoustical models of modulation perception and binaural interaction," J. Acoust. Soc. Am., vol. 95, no. 3, pp. 1593-1602, 1994.
    • (1994) J. Acoust. Soc. Am , vol.95 , Issue.3 , pp. 1593-1602
    • Kollmeier, B.1    Koch, R.2
  • 20
    • 0030691985
    • Modeling auditory processing of amplitude modulation. I. Detection and masking with narrow-band carriers
    • T. Dau, B. Kollmeier and A. Kohlrausch, "Modeling Auditory Processing of Amplitude Modulation. I. Detection and Masking with Narrow-Band Carriers," J. Acoust. Soc. Am., vol. 102, no. 5, pp. 2892-2905, 1997.
    • (1997) J. Acoust. Soc. Am. , vol.102 , Issue.5 , pp. 2892-2905
    • Dau, T.1    Kollmeier, B.2    Kohlrausch, A.3
  • 21
    • 84955462883
    • Robust ASR in reverberant environments using temporal cepstrum smoothing for speech enhancement and an amplitude modulation filterbank for feature extraction
    • F. Xiong, N. Moritz, R. Rehr, J. Anemüller, B. Meyer, T. Gerkmann, S. Doclo and S. Goetze, "Robust ASR in Reverberant Environments Using Temporal Cepstrum Smoothing for Speech Enhancement and an Amplitude Modulation Filterbank for Feature Extraction," in Proc. REVERB Workshop, 2014.
    • (2014) Proc. REVERB Workshop
    • Xiong, F.1    Moritz, N.2    Rehr, R.3    Anemüller, J.4    Meyer, B.5    Gerkmann, T.6    Doclo, S.7    Goetze, S.8
  • 22
    • 79551679242
    • Effect of speech-intrinsic variations on human and automatic recognition of spoken phonemes
    • B. Meyer, T. Brand and B. Kollmeier, "Effect of speech-intrinsic variations on human and automatic recognition of spoken phonemes," J. Acoust. Soc. Am., vol. 129, pp. 388-403, 2011.
    • (2011) J. Acoust. Soc. Am , vol.129 , pp. 388-403
    • Meyer, B.1    Brand, T.2    Kollmeier, B.3
  • 23
    • 80051627812
    • Amplitude Modulation Spectrogram based Features for Robust Speech Recognition in Noisy and Reverberant Environments
    • N. Moritz, J. Anemüller and B. Kollmeier, "Amplitude Modulation Spectrogram based Features for Robust Speech Recognition in Noisy and Reverberant Environments," in Proc. ICASSP, 2011.
    • (2011) Proc. ICASSP
    • Moritz, N.1    Anemüller, J.2    Kollmeier, B.3
  • 24
    • 84878415009
    • Amplitude modulation filters as feature sets for robust ASR: Constant absolute or relative bandwidth?
    • Portland, USA
    • N. Moritz, J. Anemüller and B. Kollmeier, "Amplitude Modulation Filters as Feature Sets for Robust ASR: Constant Absolute or Relative Bandwidth?," in Proc. Interspeech, Portland, USA, 2012.
    • (2012) Proc. Interspeech
    • Moritz, N.1    Anemüller, J.2    Kollmeier, B.3
  • 25
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • S. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 28, no. 4, pp. 357-366, 1980.
    • (1980) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.28 , Issue.4 , pp. 357-366
    • Davis, S.1    Mermelstein, P.2
  • 28
    • 84861125212
    • A practical guide to training restricted Boltzmann machines
    • G. Hinton, "A practical guide to training restricted Boltzmann machines," Momentum, vol. 9, no. 1, p. 926, 2010.
    • (2010) Momentum , vol.9 , Issue.1
    • Hinton, G.1
  • 30
    • 56149125973
    • Aurora working group: DSR front end LVCSR evaluation AU/384/02, Inst. for Signal and Information Process
    • N. Parihar and J. Picone, Aurora working group: DSR front end LVCSR evaluation AU/384/02, Inst. for Signal and Information Process, Mississippi State University, Technical Report, 2002.
    • (2002) Mississippi State University, Technical Report
    • Parihar, N.1    Picone, J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.