Volume 53, Issue 5, 2011, Pages 690-706

Robust speech detection in real acoustic backgrounds with perceptually motivated features

Author keywords

Amplitude modulations; Fluctuating noise; Pattern classification; Real world scenario; Speech detection

Indexed keywords

BACKGROUND NOISE; BASE-LINE CONDITIONS; DETECTION PERFORMANCE; EXTERNAL CONDITIONS; FEATURE TYPES; FLUCTUATING NOISE; FOURIER DECOMPOSITION; GENERALISATION; HIERARCHICAL APPROACH; LINEAR KERNEL; MEL-FREQUENCY CEPSTRAL COEFFICIENTS; NONLINEAR KERNELS; PATTERN CLASSIFICATION; PERCEPTUAL LINEAR PREDICTIONS; REAL-WORLD SCENARIO; ROBUST SPEECH; SIGNAL TO NOISE; SPECTRAL FEATURE; SPECTROGRAMS; SPEECH DETECTION; SPEECH/NONSPEECH CLASSIFICATION; TRAINING AND TESTING;

EID: 79953670471     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2010.07.003     Document Type: Article
Times cited: 28

References (39)
  • 1
    • 79953661964
    • Biologically motivated audio-visual cue integration for object categorization
    • (CogSys) (Karlsruhe, Germany)
    • Anemüller, J., Bach, J.H., Caputo, B., Jie, L., Ohl, F., Orabona, F., Vogels, R., Weinshall, D., Zweig, A., 2008a. Biologically motivated audio-visual cue integration for object categorization. In: First Internat. Conf. on Cognitive Systems (CogSys) (Karlsruhe, Germany).
    • (2008) First Internat. Conf. on Cognitive Systems
    • Anemüller, J.1
  • 2
    • 84867220368
    • Detection of speech embedded in real acoustic background based on amplitude modulation spectrogram features
    • (Brisbane, Australia)
    • Anemüller, J., Schmidt, D., Bach, J.-H., 2008b. Detection of speech embedded in real acoustic background based on amplitude modulation spectrogram features. In: Proc. Interspeech (Brisbane, Australia).
    • (2008) Proc. Interspeech
    • Anemüller, J.1
  • 3
    • 3142699853
    • Primitive auditory stream segregation: A neurophysiological study in the songbird forebrain
    • DOI 10.1152/jn.00884.2003
    • M.A. Bee, and G.M. Klump Primitive auditory stream segregation: a neurophysiological study in the songbird forebrain J. Neurophysiol. 92 2004 1088 1104 (Pubitemid 38931970)
    • (2004) Journal of Neurophysiology , vol.92 , Issue.2 , pp. 1088-1104
    • Bee, M.A.1    Klump, G.M.2
  • 5
    • 33751303502
    • Sound classification in hearing aids inspired by auditory scene analysis
    • DOI 10.1155/ASP.2005.2991
    • M. Büchler, S. Allegro, S. Launer, and N. Dillier Sound classification in hearing aids inspired by auditory scene analysis EURASIP J. Appl. Signal Process. 18 2005 2991 3002 (Pubitemid 44796363)
    • (2005) Eurasip Journal on Applied Signal Processing , vol.2005 , Issue.18 , pp. 2991-3002
    • Büchler, M.1    Allegro, S.2    Launer, S.3    Dillier, N.4
  • 7
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • S. Davis, and P. Mermelstein Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences IEEE Trans. Acoust. Speech Signal Process. 28 1980 357 366 (Pubitemid 11464930)
    • (1980) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.ASSP-28 , Issue.4 , pp. 357-366
    • Davis Steven, B.1    Mermelstein Paul2
  • 8
    • 79953646083
    • Artificial noise signals with speechlike spectral and temporal properties for hearing instrument assessment
    • W.A. Dreschler, C. Ludvigson, and S. Westermann Artificial noise signals with speechlike spectral and temporal properties for hearing instrument assessment J. Acoust. Soc. Amer. 105 1999 1296
    • (1999) J. Acoust. Soc. Amer. , vol.105 , pp. 1296
    • Dreschler, W.A.1    Ludvigson, C.2    Westermann, S.3
  • 11
    • 0025383284
    • Recognition of isolated words based on psychoacoustics and neurobiology
    • DOI 10.1016/0167-6393(90)90043-9
    • T. Gramss, and H.W. Strube Recognition of isolated words based on psychoacoustics and neurobiology Speech Commun. 9 1990 35 40 (Pubitemid 20717751)
    • (1990) Speech Communication , vol.9 , Issue.1 , pp. 35-40
    • Gramss Tino1    Strube Hans Werner2
  • 12
    • 0030711174
    • The modulation spectrogram: In pursuit of an invariant representation of speech
    • Munich, Germany
    • Greenberg, S., Kingsbury, B., 1997. The modulation spectrogram: in pursuit of an invariant representation of speech. In: Proc. ICASSP (Munich, Germany).
    • (1997) Proc. ICASSP
    • Greenberg, S.1    Kingsbury, B.2
  • 13
    • 84867196898
    • Predictability of STRFs in auditory cortex neurons depends on stimulus class
    • Brisbane, Australia
    • Happel, M.F.K., Müller, S., Anemüller, J., W.Ohl, F. (2008). Predictability of STRFs in auditory cortex neurons depends on stimulus class. In Proc. InterSpeech (Brisbane, Australia), p. 670.
    • (2008) Proc. InterSpeech , pp. 670
    • Happel, M.F.K.1
  • 14
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • DOI 10.1121/1.399423
    • H. Hermansky Perceptual linear predictive (PLP) analysis of speech J. Acoust. Soc. Amer. 87 1990 1738 1752 (Pubitemid 20256470)
    • (1990) Journal of the Acoustical Society of America , vol.87 , Issue.4 , pp. 1738-1752
    • Hermansky, H.1
  • 16
    • 79953663141
    • G.729: A silence compression scheme for G.729 optimized for terminals conforming to Recommendation V.70
    • ITU I.T.U.
    • ITU, I.T.U., 1996. G.729: a silence compression scheme for G.729 optimized for terminals conforming to Recommendation V.70. In: Recommendation G.x729 Annex B.
    • (1996) Recommendation G.x729 Annex B
  • 17
    • 0030682292
    • Recognizing reverberant speech with RASTA-PLP
    • Munich, Germany
    • Kingsbury, B.E.D., Morgan, N., 1997. Recognizing reverberant speech with RASTA-PLP. In: Proc. ICASSP (Munich, Germany), pp. 1259-1262.
    • (1997) Proc. ICASSP , pp. 1259-1262
    • Kingsbury, B.E.D.1    Morgan, N.2
  • 18
    • 0032136330
    • Robust speech recognition using the modulation spectrogram
    • PII S0167639398000326
    • B.E.D. Kingsbury, N. Morgan, and S. Greenberg Robust speech recognition using the modulation spectrogram Speech Commun. 25 1998 117 132 (Pubitemid 128413637)
    • (1998) Speech Communication , vol.25 , Issue.1-3 , pp. 117-132
    • Kingsbury, B.E.D.1    Morgan, N.2    Greenberg, S.3
  • 20
    • 0028297185
    • Speech enhancement based on physiological and psychoacoustical models of modulation perception and binaural interaction
    • B. Kollmeier, and R. Koch Speech enhancement based on physiological and psychoacoustical models of modulation perception and binaural interaction J. Acoust. Soc. Amer. 95 1994 1593 1602 (Pubitemid 24085760)
    • (1994) Journal of the Acoustical Society of America , vol.95 , Issue.3 , pp. 1593-1602
    • Kollmeier, B.1    Koch, R.2
  • 22
    • 34548160247
    • A note on Platt's probabilistic outputs for support vector machines
    • DOI 10.1007/s10994-007-5018-6
    • H.-T. Lin, C.-J. Lin, and R. Weng A note on Platt's probabilistic outputs for support vector machines Machine Learning 68 2007 267 276 (Pubitemid 47312490)
    • (2007) Machine Learning , vol.68 , Issue.3 , pp. 267-276
    • Lin, H.-T.1    Lin, C.-J.2    Weng, R.C.3
  • 23
    • 79953656775
    • Object category detection using audio-visual cues
    • Santorini, Greece
    • Luo, J., Caputo, B., Zweig, A., Bach, J.-H., Anemüller, J., 2008. Object category detection using audio-visual cues. In: Proc. ICVS (Santorini, Greece).
    • (2008) Proc. ICVS
    • Luo, J.1
  • 24
    • 34547541453
    • Unsupervised speech/non-speech detection for automatic speech recognition in meeting rooms
    • Honolulu
    • Maganti, H.K., Motlicek, P., Perez, D.G., 2007. Unsupervised speech/non-speech detection for automatic speech recognition in meeting rooms. In: Proc. ICASSP (Honolulu).
    • (2007) Proc. ICASSP
    • Maganti, H.K.1    Motlicek, P.2    Perez, D.G.3
  • 27
    • 0036476655
    • Speech pause detection for noise spectrum estimation by tracking power envelope dynamics
    • DOI 10.1109/89.985548, PII S1063667602015237
    • M. Marzinzik, and B. Kollmeier Speech pause detection for noise spectrum estimation by tracking power envelope dynamics IEEE Trans. Speech Audio Process. 10 2002 109 118 (Pubitemid 34295270)
    • (2002) IEEE Transactions on Speech and Audio Processing , vol.10 , Issue.2 , pp. 109-118
    • Marzinzik, M.1    Kollmeier, B.2
  • 29
    • 34047272330
    • Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations
    • DOI 10.1109/TSA.2005.858055
    • N. Mesgarani, M. Slaney, and S.A. Shamma Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations IEEE Trans. Audio Speech Lang. Process. 14 2006 920 930 (Pubitemid 46547653)
    • (2006) IEEE Transactions on Audio, Speech and Language Processing , vol.14 , Issue.3 , pp. 920-930
    • Mesgarani, N.1    Slaney, M.2    Shamma, S.A.3
  • 30
    • 84867224940
    • Optimization and evaluation of Gabor feature sets for ASR
    • Brisbane, Australia
    • Meyer, B.T., Kollmeier, B., 2008. Optimization and evaluation of Gabor feature sets for ASR. In: Proc. InterSpeech (Brisbane, Australia), pp. 906-909.
    • (2008) Proc. InterSpeech , pp. 906-909
    • Meyer, B.T.1    Kollmeier, B.2
  • 31
    • 0009931946
    • Classification of acoustical signals based on the analysis of modulation spectra for the application in digital hearing aids
    • Zurich, Switzerland
    • Ostendorf, M., Hohmann, V., Kollmeier, B., 1998. Classification of acoustical signals based on the analysis of modulation spectra for the application in digital hearing aids. In: Proc. DAGA (Zurich, Switzerland), pp. 402-403.
    • (1998) Proc. DAGA , pp. 402-403
    • Ostendorf, M.1    Hohmann, V.2    Kollmeier, B.3
  • 33
    • 0016470107
    • An algorithm for determining the endpoints of isolated utterances
    • L.R. Rabiner, and M.R. Sambur An algorithm for determining the endpoints of isolated utterances Bell Syst. Tech. J. 54 1975 297 315
    • (1975) Bell Syst. Tech. J. , vol.54 , pp. 297-315
    • Rabiner, L.R.1    Sambur, M.R.2
  • 34
    • 0024263459
    • Periodicity coding in the inferior colliculus of the cat. II. Topographical organization
    • C.E. Schreiner, and G. Langner Periodicity coding in the inferior colliculus of the cat. II. Topographical organization J. Neurophysiol. 60 1988 1823 1840 (Pubitemid 19017452)
    • (1988) Journal of Neurophysiology , vol.60 , Issue.6 , pp. 1823-1840
    • Schreiner, C.E.1    Langner, G.2
  • 35
    • 0033693067
    • Data-driven RASTA Filters in Reverberation
    • Istanbul
    • Shire, M.L., Chen, B.Y., 2000. Data-driven RASTA Filters in Reverberation. In: ICASSP (Istanbul).
    • (2000) ICASSP
    • Shire, M.L.1    Chen, B.Y.2
  • 36
    • 0038712550
    • SNR estimation based on amplitude modulation analysis with applications to noise suppression
    • J. Tchorz, and B. Kollmeier SNR estimation based on amplitude modulation analysis with applications to noise suppression IEEE Trans. Speech Audio Process. 11 2003 184 192
    • (2003) IEEE Trans. Speech Audio Process , vol.11 , pp. 184-192
    • Tchorz, J.1    Kollmeier, B.2
  • 39
    • 84953656445
    • Subdivision of the audible frequency range into critical bands
    • E. Zwicker Subdivision of the audible frequency range into critical bands J. Acoust. Soc. Amer. 33 1961 248
    • (1961) J. Acoust. Soc. Amer. , vol.33 , pp. 248
    • Zwicker, E.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.