Volume 4, Issue 5, 2010, Pages 834-844

Long-term spectro-temporal and static harmonic features for voice activity detection

Author keywords

Average phoneme duration; harmonic structure; long term temporal information; voice activity detection (VAD)

Indexed keywords

ACOUSTIC INFORMATION; AUTOMATIC SPEECH RECOGNITION SYSTEM; AVERAGE PHONEME DURATION; CEPSTRAL DOMAIN; ERROR REDUCTION; HARMONIC STRUCTURE; HARMONIC STRUCTURES; HUMAN VOICE; LOW SNR; MODEL-BASED; NOISE ROBUSTNESS; STATISTICAL MODELS; STRUCTURE-BASED; TEMPORAL FEATURES; TEMPORAL INFORMATION; VOICE ACTIVITY DETECTION; WORD ERROR RATE

EID: 77956739501     PISSN: 19324553     EISSN: None     Source Type: Journal    
DOI: 10.1109/JSTSP.2010.2069750     Document Type: Article
Times cited: 42

References (26)
  • 4
    • EID: 0032762471
    • J. Sohn, N. S. Kim, and W. Sung, "A statistical model-based voice activity detection," IEEE Signal Process. Lett., vol. 6, no. 1, pp. 1-3, Jan. 1999.
  • 5
    • EID: 0035481845
    • Y. D. Cho and A. Kondoz, "Analysis and improvement of a statistical model-based voice activity detector," IEEE Signal Process. Lett., vol. 8, no. 10, pp. 276-278, Oct. 2001.
  • 6
    • EID: 84867208777
    • M. Fujimoto, K. Ishizuka, and T. Nakatani, "Study of integration of statistical model-based voice activity detection and noise suppression," Proc. Interspeech, pp. 2008-2011, 2008.
  • 7
    • EID: 0034854659
    • A. Martin, D. Charlet, and M. Manuuary, "Robust speech/non-speech detection using LDA applied to MFCC," in Proc. ICASSP, 2001, vol. 1, pp. 237-240.
  • 8
    • EID: 27644475276
    • K. Li, M. N. S. Swamy, and M. O. Ahmad, "An improved voice activity detection using higher order statistics," IEEE Trans. Speech Audio Process., vol. 13, no. 5, pp. 965-974, May 2005.
  • 9
    • EID: 33947627138
    • K. Yamamoto, F. Jabloun, K. Reinhard, and A. Kawamura, "Robust endpoint detection for speech recognition based on discriminative feature extraction," in Proc. ICASSP, 2006, vol. 1, pp. 805-808.
  • 10
    • EID: 0032658253
    • H. Hermansky and S. Sharma, "TRAPS-Classifiers of temporal patterns," in Proc. ICASSP, 1999, vol. 1, pp. 289-292.
  • 11
    • EID: 84867218137
    • T. Fukuda, O. Ichikawa, and M. Nishimura, "Short- and long-term dynamic features for robust speech recognition," Proc. Interspeech, pp. 2262-2265, 2008.
  • 12
    • EID: 1842476689
    • J. Ramirez, J. C. Segura, C. Benitez, A. Torre, and A. Rubio, "Efficient voice activity detection algorithms using long-term speech information," Speech Commun., vol. 42, pp. 271-287, 2004.
  • 13
    • EID: 51449092667
    • H. Tolba and D. O'Shaughnessy, "Robust automatic continuous-speech recognition based on a voiced-unvoiced decision," Proc. ICSLP, 1998, paper 0342.
  • 14
    • EID: 0034841228
    • L. Gu and K. Rose, "Perceptual harmonic cepstral coefficients for speech recognition in noisy environment," in Proc. ICASSP, 2001, vol. 1, pp. 125-128.
  • 15
    • EID: 85164649882
    • K. Ishizuka, T. Nakatani, M. Fujimoto, and N. Miyazaki, "Noise robust front-end processing with voice activity detection based on periodic to aperiodic component ratio," Proc. Interspeech, pp. 230-233, 2007.
  • 16
    • EID: 77956755408
    • Y. Guo, Q. Fu, and Y. Yan, "Robust voice activity detection based on adaptive sub-band energy sequence analysis and harmonic detection," Proc. Interspeech, pp. 2949-2952, 2007.
  • 17
    • EID: 0021645331
    • Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-32, no. 6, pp. 1109-1121, Dec. 1984.
  • 19
    • EID: 0028287770
    • R. Drullman, J. M. Festen, and R. Plomp, "Effect of reducing slow temporal modulations on speech perception," J. Acoust. Soc. Amer., vol. 95, pp. 2670-2680, 1994.
  • 20
    • EID: 0032676337
    • N. Kanedera, T. Arai, H. Hermansky, and M. Pavel, "On the relative importance of various components of the modulation spectrum for automatic speech recognition," Speech Commun., vol. 28, no. 1, pp. 43-55, 1999.
  • 21
    • EID: 0038694713
    • D. Poeppel, "The analysis of speech in different temporal integration windows: Cerebral lateralization as asymmetric sampling in time," Speech Commun., vol. 41, pp. 245-255, 2003.
  • 22
    • EID: 51449116408
    • O. Ichikawa, T. Fukuda, and M. Nishimura, "Local peak enhancement combined with noise reduction algorithms for robust automatic speech recognition in automobiles," in Proc. IEEE ICASSP, 2008, pp. 4865-4868.
  • 25
    • EID: 0037401288
    • L. Karray and A. Martin, "Toward improving speech detection robustness for speech recognition in adverse environment," Speech Commun., vol. 40, pp. 261-276, 2003.
  • 26
    • EID: 44849131058
    • S. Nakamura, M. Fujimoto, and K. Takeda, "Censrec2: Corpus and evaluation environments for in car continuous digit speech recognition," Proc. Interspeech, pp. 2330-2333, 2006.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.