



2011, Pages 2645-2648

Robust voice activity detector for real world applications using harmonicity and modulation frequency

Author keywords

Harmonicity; Human robot interaction; Modulation frequency; Voice activity detection

Indexed keywords

AUTOMATIC SPEECH RECOGNITION; DIFFERENT FREQUENCY; DYNAMIC NOISE; FALSE ALARM RATE; HARMONICITY; LOW-SNR ENVIRONMENT; MISS-RATE; MODULATION FREQUENCIES; NUMBER OF FALSE ALARMS; REAL-WORLD APPLICATION; SPECTRAL ENTROPY; SYSTEM LEVELS; VOICE ACTIVITY DETECTION; VOICE ACTIVITY DETECTORS;

EID: 84865802934     PISSN: None     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (20)

References (18)
  • 1
    • 77955779523
    • A voice-commandable robotic forklift working alongside humans in minimally-prepared outdoor environments
    • S. Teller, M. Walter et al., "A voice-commandable robotic forklift working alongside humans in minimally-prepared outdoor environments," in Proc. IEEE Int. Conf. on Robotics and Automation, 2010.
    • (2010) Proc. IEEE Int. Conf. on Robotics and Automation
    • Teller, S.1    Walter, M.2
  • 2
    • 67649122014
    • Energy-based VAD with grey magnitude spectral subtraction
    • C. Hsieh, T. Feng, and P. Huang, "Energy-based VAD with grey magnitude spectral subtraction," Speech Communication, vol. 51, no. 9, pp. 810-819, 2009.
    • (2009) Speech Communication , vol.51 , Issue.9 , pp. 810-819
    • Hsieh, C.1    Feng, T.2    Huang, P.3
  • 5
    • 0026907622
    • Voice activity detection using a periodicity measure
    • R. Tucker, "Voice activity detection using a periodicity measure," in Proc. IEEE, vol. 139, 1992, pp. 377-380.
    • (1992) Proc. IEEE , vol.139 , pp. 377-380
    • Tucker, R.1
  • 6
    • 34047272330
    • Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations
    • N. Mesgarani, M. Slaney, and S. Shamma, "Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations," in IEEE Trans. on Audio, Speech, and Language Processing, vol. 14, 2006, pp. 920-930.
    • (2006) IEEE Trans. on Audio, Speech, and Language Processing , vol.14 , pp. 920-930
    • Mesgarani, N.1    Slaney, M.2    Shamma, S.3
  • 7
    • 70450170882
    • Temporal modulation processing of speech signals for noise robust ASR
    • H. You and A. Alwan, "Temporal modulation processing of speech signals for noise robust ASR," in Interspeech, 2009, pp. 36-39.
    • (2009) Interspeech , pp. 36-39
    • You, H.1    Alwan, A.2
  • 8
    • 78049402270
    • Modulation-based detection of speech in real background noise: Generalization to novel background classes
    • J. Bach, B. Kollmeier, and J. Anemuller, "Modulation-based detection of speech in real background noise: Generalization to novel background classes," in ICASSP, 2010, pp. 41-44.
    • (2010) ICASSP , pp. 41-44
    • Bach, J.1    Kollmeier, B.2    Anemuller, J.3
  • 9
    • 0001835850
    • Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound
    • P. Boersma, "Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound," in Proc. the Institute of Phonetic Sciences, 1993, pp. 97-110.
    • (1993) Proc. The Institute of Phonetic Sciences , pp. 97-110
    • Boersma, P.1
  • 10
    • 0027957839
    • Effect of temporal envelope smearing on speech reception
    • R. Drullman, J. Festen, and R. Plomp, "Effect of temporal envelope smearing on speech reception," JASA, vol. 95, pp. 1053-1064, 1994.
    • (1994) JASA , vol.95 , pp. 1053-1064
    • Drullman, R.1    Festen, J.2    Plomp, R.3
  • 13
    • 0032762471
    • A statistical model-based voice activity detection
    • J. Sohn, N. Kim, and W. Sung, "A statistical model-based voice activity detection," IEEE Signal Processing Letters, vol. 6, pp. 1-3, 1999.
    • (1999) IEEE Signal Processing Letters , vol.6 , pp. 1-3
    • Sohn, J.1    Kim, N.2    Sung, W.3
  • 14
    • 79951799250
    • Spoken command of large mobile robots in outdoor environments
    • E. Chuangsuwanich, S. Cyphers et al., "Spoken command of large mobile robots in outdoor environments," in SLT, 2010.
    • (2010) SLT
    • Chuangsuwanich, E.1    Cyphers, S.2
  • 15
    • 45549095163
    • PocketSUMMIT: Small-footprint continuous speech recognition
    • I. Hetherington, "PocketSUMMIT: Small-footprint continuous speech recognition," in Proc. Interspeech, 2007.
    • (2007) Proc. Interspeech
    • Hetherington, I.1
  • 16
    • 85017011801
    • Voice activity detector (VAD) for adaptive multi-rate (AMR) speech traffic channels
    • ETSI
    • ETSI, "Voice activity detector (VAD) for adaptive multi-rate (AMR) speech traffic channels," ETSI EN 301 708, 1999.
    • (1999) ETSI EN 301 708
  • 17
    • 77957272576
    • Speech processing, transmission and quality aspects (STQ); Distributed speech recognition; Advanced front-end feature extraction algorithm; Compression algorithms dsr advanced front end
    • ETSI
    • ETSI, "Speech processing, transmission and quality aspects (STQ); distributed speech recognition; advanced front-end feature extraction algorithm; compression algorithms dsr advanced front end," ETSI ES 202 050, 2007.
    • (2007) ETSI ES 202 050
  • 18
    • 0031238211
    • ITU-T recommendation G.729 annex B: A silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications
    • Sept
    • A. Benyassine, E. Shlomot et al., "ITU-T recommendation G.729 annex B: a silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications," Communications Magazine, IEEE, vol. 35, pp. 64-73, Sept 1997.
    • (1997) Communications Magazine, IEEE , vol.35 , pp. 64-73
    • Benyassine, A.1    Shlomot, E.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.