2013, Pages 704-708

A robust frontend for VAD: Exploiting contextual, discriminative and spectral cues of human voice

Author keywords

Noise robust features; Speech activity detection

Indexed keywords

SIGNAL PROCESSING; SPEECH COMMUNICATION; SPEECH PROCESSING

EID: 84906246377; PISSN: 2308-457X; EISSN: 1990-9772; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper
Times cited: 62

References (35)
  • 1
    • 33745188004
    • Voice activity detection based on optimally weighted combination of multiple features
    • Y. Kida and T. Kawahara, "Voice activity detection based on optimally weighted combination of multiple features," in Proc. Interspeech, 2005, pp. 2621-2624.
    • (2005) Proc. Interspeech, pp. 2621-2624
    • Kida, Y.; Kawahara, T.
  • 2
    • 51449100230
    • A voice activity detection based on the adaptive integration of multiple speech features and a signal decision scheme
    • M. Fujimoto, K. Ishizuka, and T. Nakatani, "A voice activity detection based on the adaptive integration of multiple speech features and a signal decision scheme," in Proc. ICASSP, 2008.
    • (2008) Proc. ICASSP
    • Fujimoto, M.; Ishizuka, K.; Nakatani, T.
  • 6
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • S. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 28, no. 4, pp. 357-366, Aug. 1980.
    • (1980) IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 28, no. 4, pp. 357-366
    • Davis, S.; Mermelstein, P.
  • 7
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," Journal of the Acoustical Society of America, vol. 87, no. 4, pp. 1738-1752, Apr. 1990.
    • (1990) Journal of the Acoustical Society of America, vol. 87, no. 4, pp. 1738-1752
    • Hermansky, H.
  • 8
    • 34547499683
    • Incorporating auditory feature uncertainties in robust speaker identification
    • Y. Shao, S. Srinivasan, and D. Wang, "Incorporating auditory feature uncertainties in robust speaker identification," in Proc. ICASSP, 2002, pp. 277-280.
    • (2002) Proc. ICASSP, pp. 277-280
    • Shao, Y.; Srinivasan, S.; Wang, D.
  • 9
    • 84890520795
    • Power-normalized coefficients (PNCC) for robust speech recognition
    • C. Kim and R. M. Stern, "Power-normalized coefficients (PNCC) for robust speech recognition," in Proc. ICASSP, 2012.
    • (2012) Proc. ICASSP
    • Kim, C.; Stern, R.M.
  • 10
    • 70349223037
    • An auditory-based feature for robust speech recognition
    • Y. Shao, Z. Jin, D. L. Wang, and S. Srinivasan, "An auditory-based feature for robust speech recognition," in Proc. ICASSP, 2009.
    • (2009) Proc. ICASSP
    • Shao, Y.; Jin, Z.; Wang, D.L.; Srinivasan, S.
  • 11
    • 34547539413
    • Gammatone features and feature combination for large vocabulary speech recognition
    • R. Schluter, I. Bezrukov, H. Wagner, and H. Ney, "Gammatone features and feature combination for large vocabulary speech recognition," in Proc. ICASSP, 2007.
    • (2007) Proc. ICASSP
    • Schluter, R.; Bezrukov, I.; Wagner, H.; Ney, H.
  • 12
    • 0003235731
    • TRAPS - classifiers of temporal patterns
    • H. Hermansky and S. Sharma, "TRAPS - classifiers of temporal patterns," in Proc. Interspeech, Sydney, Australia, Nov. 1998, pp. 1003-1006.
    • (1998) Proc. Interspeech, pp. 1003-1006
    • Hermansky, H.; Sharma, S.
  • 13
    • 0034848926
    • Tandem acoustic modeling in large-vocabulary recognition
    • D. Ellis, R. Singh, and S. Sivadas, "Tandem acoustic modeling in large-vocabulary recognition," in Proc. ICASSP, 2001.
    • (2001) Proc. ICASSP
    • Ellis, D.; Singh, R.; Sivadas, S.
  • 14
    • 34547548235
    • Probabilistic and bottle-neck features for LVCSR of meetings
    • F. Grezl, M. Karafiat, S. Kontar, and J. Cernocky, "Probabilistic and bottle-neck features for LVCSR of meetings," in Proc. ICASSP, 2007.
    • (2007) Proc. ICASSP
    • Grezl, F.; Karafiat, M.; Kontar, S.; Cernocky, J.
  • 15
    • 0032136330
    • Robust speech recognition using the modulation spectrogram
    • B. Kingsbury, N. Morgan, and S. Greenberg, "Robust speech recognition using the modulation spectrogram," Speech Communication, vol. 25, pp. 117-132, Aug. 1998.
    • (1998) Speech Communication, vol. 25, pp. 117-132
    • Kingsbury, B.; Morgan, N.; Greenberg, S.
  • 16
    • 0032658253
    • Temporal patterns (TRAPs) in ASR of noisy speech
    • H. Hermansky and S. Sharma, "Temporal patterns (TRAPs) in ASR of noisy speech," in Proc. ICASSP, vol. 1, Phoenix, Arizona, U.S.A., Mar. 1997, pp. 289-292.
    • (1997) Proc. ICASSP, vol. 1, pp. 289-292
    • Hermansky, H.; Sharma, S.
  • 17
    • 84906249759
    • Spectro-temporal Gabor features as a front end for ASR
    • K. Mi, "Spectro-temporal Gabor features as a front end for ASR," in Proc. Forum Acusticum Sevilla, 2002.
    • (2002) Proc. Forum Acusticum Sevilla
    • Mi, K.
  • 19
    • 33745213373
    • Multi-resolution RASTA filtering for TANDEM-based ASR
    • H. Hermansky and P. Fousek, "Multi-resolution RASTA filtering for TANDEM-based ASR," in Proc. Interspeech, Lisbon, Portugal, Oct. 2005, pp. 361-364.
    • (2005) Proc. Interspeech, pp. 361-364
    • Hermansky, H.; Fousek, P.
  • 20
    • 84867220821
    • Multi-stream spectro-temporal features for robust speech recognition
    • S. Zhao and N. Morgan, "Multi-stream spectro-temporal features for robust speech recognition," in Proc. Interspeech, 2008, pp. 898-901.
    • (2008) Proc. Interspeech, pp. 898-901
    • Zhao, S.; Morgan, N.
  • 21
    • 70450182191
    • Tandem representations of spectral envelope and modulation frequency features for ASR
    • T. S. S. Ganapathy, and H. Hermansky, "Tandem representations of spectral envelope and modulation frequency features for ASR," in Proc. Interspeech, 2009, pp. 2955-2958.
    • (2009) Proc. Interspeech, pp. 2955-2958
    • Ganapathy, T.S.S.; Hermansky, H.
  • 22
    • 84878395103
    • Longer features: They do a speech detector good
    • T. Tsai and N. Morgan, "Longer features: They do a speech detector good," in Proc. Interspeech, 2012.
    • (2012) Proc. Interspeech
    • Tsai, T.; Morgan, N.
  • 23
    • 84890541926
    • A robust frontend for ASR: Combining denoising, noise masking and feature normalization
    • M. Van Segbroeck and S. Narayanan, "A robust frontend for ASR: Combining denoising, noise masking and feature normalization," in Proc. ICASSP, 2013.
    • (2013) Proc. ICASSP
    • Van Segbroeck, M.; Narayanan, S.
  • 24
    • 84865769808
    • Comparing different flavors of spectro-temporal features for ASR
    • B. T. Meyer, S. V. Ravuri, M. R. Schadler, and N. Morgan, "Comparing different flavors of spectro-temporal features for ASR," in Proc. Interspeech, 2011, pp. 1269-1272.
    • (2011) Proc. Interspeech, pp. 1269-1272
    • Meyer, B.T.; Ravuri, S.V.; Schadler, M.R.; Morgan, N.
  • 25
    • 84865769808
    • Comparing different flavors of spectro-temporal features for ASR
    • B. Meyer, S. Ravuri, M. Schadler, and N. Morgan, "Comparing different flavors of spectro-temporal features for ASR," in Proc. Interspeech, 2011, pp. 1269-1272.
    • (2011) Proc. Interspeech, pp. 1269-1272
    • Meyer, B.; Ravuri, S.; Schadler, M.; Morgan, N.
  • 26
    • 0036642777
    • Use of voicing features in HMM-based speech recognition
    • D. L. Thomson and R. Chengalvarayan, "Use of voicing features in HMM-based speech recognition," Speech Communication, vol. 37, no. 3, pp. 197-211, 2002.
    • (2002) Speech Communication, vol. 37, no. 3, pp. 197-211
    • Thomson, D.L.; Chengalvarayan, R.
  • 27
    • 85009188485
    • Extraction methods of voicing feature for robust speech recognition
    • A. Zolnay, R. Schluter, and H. Ney, "Extraction methods of voicing feature for robust speech recognition," in Proceedings of EUROSPEECH, 2003, pp. 497-500.
    • (2003) Proceedings of EUROSPEECH, pp. 497-500
    • Zolnay, A.; Schluter, R.; Ney, H.
  • 28
    • 77956739501
    • Long-term spectrotemporal and static harmonic features for voice activity detection
    • T. Fukuda, O. Ichikawa, and M. Nishimura, "Long-term spectrotemporal and static harmonic features for voice activity detection," IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 5, pp. 834-844, 2010.
    • (2010) IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 5, pp. 834-844
    • Fukuda, T.; Ichikawa, O.; Nishimura, M.
  • 31
    • 4544315110
    • Robust speech recognition using cepstral domain missing data techniques and noisy masks
    • H. Van Hamme, "Robust speech recognition using cepstral domain missing data techniques and noisy masks," in Proc. ICASSP, Montreal, Canada, May 2004, pp. 213-216.
    • (2004) Proc. ICASSP, pp. 213-216
    • Van Hamme, H.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.