SCOPUS Information Search Platform

ITRW on Statistical and Perceptual Audio Processing, SAPA 2006

Volume: None, Issue: None, 2006, Pages 65-70

Study of Noise Robust Voice Activity Detection Based on Periodic Component to Aperiodic Component Ratio

(2) Ishizuka, Kentaro ᵃ; Nakatani, Tomohiro ᵃ

ᵃ NTT Communication Science Laboratories (Japan)

Author keywords

[No Author keywords available]

Indexed keywords

ACOUSTIC NOISE; AUDITION; SPEECH RECOGNITION;

APERIODIC COMPONENTS; APRIORI; ENVIRONMENTAL SOUNDS; NOISE ROBUST; NON-STATIONARITIES; PERIODIC COMPONENTS; REAL-WORLD; SOUND CHANGE; VOICE ACTIVITY DETECTION ALGORITHMS; VOICE-ACTIVITY DETECTIONS;

SIGNAL TO NOISE RATIO;

EID: 85133167678; PISSN: None; EISSN: None; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper

Times cited : (26)

References (26)

1
- 0029290274
- Study of voice activity detector and its influence on a noise reduction system
- Le Bouquin-Jeannès R. and Faucon, G. “Study of voice activity detector and its influence on a noise reduction system,” Speech Communication, 16, 245-254, 1995.
- (1995) Speech Communication , vol.16 , pp. 245-254
- Le Bouquin-Jeannès, R.¹ Faucon, G.²

2
- 84889333873
- Voice activity detection for cellular networks
- Srinivasan, K. and Gersho, A. “Voice activity detection for cellular networks,” Proc. of IEEE Workshop on Speech Coding for Telecommunications, 85-86, 1993.
- (1993) Proc. of IEEE Workshop on Speech Coding for Telecommunications , pp. 85-86
- Srinivasan, K.¹ Gersho, A.²

3
- 0028461861
- A robust algorithm for word boundary detection in the presence of noise
- Junqua, J.-C., Mak, B., and Reaves, B. “A robust algorithm for word boundary detection in the presence of noise,” IEEE Trans. Speech Audio Process., 2, 406-412, 1994.
- (1994) IEEE Trans. Speech Audio Process , vol.2 , pp. 406-412
- Junqua, J.-C.¹ Mak, B.² Reaves, B.³

4
- 0037401288
- Towards improving speech detection robustness for speech recognition in adverse conditions
- Karray, L. and Martin, A. “Towards improving speech detection robustness for speech recognition in adverse conditions,” Speech Communication, 40, 261-276, 2003.
- (2003) Speech Communication , vol.40 , pp. 261-276
- Karray, L.¹ Martin, A.²

5
- 0036476655
- Speech pause detection for noise spectrum estimation by tracking power envelope dynamics
- Marzinzik, M. and Kollmeier, B. “Speech pause detection for noise spectrum estimation by tracking power envelope dynamics,” IEEE Trans. Speech Audio Process., 10, 109-118, 2002.
- (2002) IEEE Trans. Speech Audio Process , vol.10 , pp. 109-118
- Marzinzik, M.¹ Kollmeier, B.²

6
- 0032762471
- A statistical model-based voice activity detection
- Sohn, J., Kim, N.-S., and Sung, W. “A statistical model-based voice activity detection,” IEEE Signal Process. Lett., 6, 1-3, 1999.
- (1999) IEEE Signal Process. Lett , vol.6 , pp. 1-3
- Sohn, J.¹ Kim, N.-S.² Sung, W.³

7
- 0141702200
- A linked-HMM model for robust voicing and speech detection
- Basu, S. “A linked-HMM model for robust voicing and speech detection,” Proc. ICASSP, 1, 816-819, 2003.
- (2003) Proc. ICASSP , vol.1 , pp. 816-819
- Basu, S.¹

8
- 0016470107
- An algorithm for determining the endpoints of isolated utterances
- Rabiner, L. R. and Sambur, M. R. “An algorithm for determining the endpoints of isolated utterances,” The Bell Syst. Tech. Journal, 54, 297-315, 1975.
- (1975) The Bell Syst. Tech. Journal , vol.54 , pp. 297-315
- Rabiner, L. R.¹ Sambur, M. R.²

9
- 0016962193
- A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition
- Atal, B. S. and Rabiner, L. R. “A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition,” IEEE Trans. Acoust., Speech, and Signal Process., ASSP-24, 201-212, 1976.
- (1976) IEEE Trans. Acoust., Speech, and Signal Process , vol.ASSP-24 , pp. 201-212
- Atal, B. S.¹ Rabiner, L. R.²

10
- 17344389852
- Robust speech recognition in noisy environments: The 2001 IBM SPINE evaluation system
- Kingsbury, B., Saon, G., Mangu, L., Padmanabhan, M., and Sarikaya, R. “Robust speech recognition in noisy environments: The 2001 IBM SPINE evaluation system,” Proc. ICASSP, 1, 53-56, 2002.
- (2002) Proc. ICASSP , vol.1 , pp. 53-56
- Kingsbury, B.¹ Saon, G.² Mangu, L.³ Padmanabhan, M.⁴ Sarikaya, R.⁵

11
- 33745218538
- Voicing features for robust speech detection
- Kristjansson, T., Deligne, S., and Olsen, P. “Voicing features for robust speech detection,” Proc. Interspeech, 369-372, 2005.
- (2005) Proc. Interspeech , pp. 369-372
- Kristjansson, T.¹ Deligne, S.² Olsen, P.³

12
- 77951493947
- Robust entropy-based endpoint detection for speech recognition in noisy environments
- Shen, J.-L., Hung, J.-W., and Lee, L.-S. “Robust entropy-based endpoint detection for speech recognition in noisy environments,” Proc. ICSLP, 1998.
- (1998) Proc. ICSLP
- Shen, J.-L.¹ Hung, J.-W.² Lee, L.-S.³

13
- 11144286121
- The spectral autocorrelation peak valley ratio (SAPVR) - A usable speech measure employed as a co-channel detection system
- Yantorno, R. E., Krishnamachari, K. L., and Lovekin, J. M. “The spectral autocorrelation peak valley ratio (SAPVR) - A usable speech measure employed as a co-channel detection system,” Proc. IEEE Int. Workshop Intell. Signal Process., 2001.
- (2001) Proc. IEEE Int. Workshop Intell. Signal Process
- Yantorno, R. E.¹ Krishnamachari, K. L.² Lovekin, J. M.³

14
- 0026907622
- Voice activity detection using a periodicity measure
- Tucker, R. “Voice activity detection using a periodicity measure,” IEE Proceedings-I, 139, 377-380, 1992.
- (1992) IEE Proceedings-I , vol.139 , pp. 377-380
- Tucker, R.¹

15
- 0006695719
- ITU-T Recommendation G.729 Annex B., 1996.
- (1996) ITU-T Recommendation G.729 Annex B

16
- 85133205522
- ETSI standard document, ETSI ES 202 050 V1.1.3
- ETSI standard document, ETSI ES 202 050 V1.1.3., 2003.
- (2003)

17
- 27644475276
- An improved voice activity detection using higher order statistics
- Li, K., Swamy, N. S., and Ahmad, M. O. “An improved voice activity detection using higher order statistics,” IEEE Trans. Speech Audio Process., 13, 965-974, 2005.
- (2005) IEEE Trans. Speech Audio Process , vol.13 , pp. 965-974
- Li, K.¹ Swamy, N. S.² Ahmad, M. O.³

18
- 85009070560
- Noise robust real world spoken dialogue system using GMM based rejection of unintended inputs
- Lee, A., Nakamura, K., Nisimura, R., Saruwatari, H., and Shikano, K. “Noise robust real world spoken dialogue system using GMM based rejection of unintended inputs,” Proc. Interspeech, 1, 173-176, 2004.
- (2004) Proc. Interspeech , vol.1 , pp. 173-176
- Lee, A.¹ Nakamura, K.² Nisimura, R.³ Saruwatari, H.⁴ Shikano, K.⁵

19
- 0027298253
- Separation of concurrent harmonic sounds: Fundamental frequency estimation and a time-domain cancellation model of auditory processing
- de Cheveigné, A. “Separation of concurrent harmonic sounds: Fundamental frequency estimation and a time-domain cancellation model of auditory processing,” J. Acoust. Soc. Am., 93, 3271-3290, 1993.
- (1993) J. Acoust. Soc. Am , vol.93 , pp. 3271-3290
- de Cheveigné, A.¹

20
- 0025544510
- Spectral modeling and synthesis: A sound analysis/synthesis system based on a deterministic plus stochastic decomposition
- Serra, X. and Smith, J. “Spectral modeling and synthesis: A sound analysis/synthesis system based on a deterministic plus stochastic decomposition,” Comp. Music J., 14, 1990.
- (1990) Comp. Music J , vol.14
- Serra, X.¹ Smith, J.²

21
- 0031647649
- An iterative algorithm for decomposition of speech signals into periodic and aperiodic components
- Yegnanarayana, B., d'Alessandro, C., and Darsinos, V. “An iterative algorithm for decomposition of speech signals into periodic and aperiodic components,” IEEE Trans. Speech Audio Process., 6, 1-11, 1998.
- (1998) IEEE Trans. Speech Audio Process , vol.6 , pp. 1-11
- Yegnanarayana, B.¹ d'Alessandro, C.² Darsinos, V.³

22
- 0002364817
- Modification of the aperiodic component of speech signals for synthesis
- van Santen, J. P. H., Sproat, R. W., Olive, J. P., and Hirschberg, J. Eds. New-York: Springer-Verlag
- Richard, G. and d'Alessandro, C. “Modification of the aperiodic component of speech signals for synthesis,” in Progress in Speech Synthesis, van Santen, J. P. H., Sproat, R. W., Olive, J. P., and Hirschberg, J. Eds. New-York: Springer-Verlag, 41-56, 1996.
- (1996) Progress in Speech Synthesis , pp. 41-56
- Richard, G.¹ d'Alessandro, C.²

23
- 85009168054
- Covariation and weighting of harmonically decomposed streams for ASR
- Jackson, P. J. B., Moreno, D. M., Russell, M. J. and Hernando, J. “Covariation and weighting of harmonically decomposed streams for ASR,” Proc. Interspeech, 2321-2324, 2003.
- (2003) Proc. Interspeech , pp. 2321-2324
- Jackson, P. J. B.¹ Moreno, D. M.² Russell, M. J.³ Hernando, J.⁴

24
- 33745738849
- Speech feature extraction method using subband-based periodicity and nonperiodicity decomposition
- Ishizuka, K., Nakatani, T., Minami, Y., and Miyazaki, N. “Speech feature extraction method using subband-based periodicity and nonperiodicity decomposition,” J. Acoust. Soc. Am., 120, 443-452, 2006.
- (2006) J. Acoust. Soc. Am , vol.120 , pp. 443-452
- Ishizuka, K.¹ Nakatani, T.² Minami, Y.³ Miyazaki, N.⁴

25
- 11144332020
- Robust and accurate fundamental frequency estimation based on dominant harmonic components
- Nakatani, T. and Irino, T. “Robust and accurate fundamental frequency estimation based on dominant harmonic components,” J. Acoust. Soc. Am., 116, 3690-3700, 2004.
- (2004) J. Acoust. Soc. Am , vol.116 , pp. 3690-3700
- Nakatani, T.¹ Irino, T.²

26
- 24144494616
- AURORA-2J: An evaluation framework for Japanese noisy speech recognition
- Nakamura, S., Takeda, K., Yamamoto, K., Yamada, T., Kuroiwa, S., Kitaoka, N., Nishiura, T., Sasou, A., Mizumachi, M., Miyajima, C., Fujimoto, M., and Endo, T. “AURORA-2J: An evaluation framework for Japanese noisy speech recognition,” IEICE Trans. on Inf. & Syst., E88-D, 535-544, 2005.
- (2005) IEICE Trans. on Inf. & Syst , vol.E88-D , pp. 535-544
- Nakamura, S.¹ Takeda, K.² Yamamoto, K.³ Yamada, T.⁴ Kuroiwa, S.⁵ Kitaoka, N.⁶ Nishiura, T.⁷ Sasou, A.⁸ Mizumachi, M.⁹ Miyajima, C.¹⁰ Fujimoto, M.¹¹ Endo, T.¹²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.