SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume , Issue , 2014, Pages 1534-1538

Boosted deep neural networks and multi-resolution cochleagram features for voice activity detection

(2) Zhang, Xiao Lei (a); Wang, DeLiang (b)

(a) Tsinghua University (China)

(b) The Ohio State University (United States)

Author keywords

Boosting; Cochleagram; Deep neural network; MRCG; Voice activity detection

Indexed keywords

COMPUTATIONAL LINGUISTICS; FORECASTING; SPEECH COMMUNICATION; SPEECH PROCESSING;

BOOSTING; COCHLEAGRAM; DEEP NEURAL NETWORKS; MRCG; VOICE ACTIVITY DETECTION;

SPEECH RECOGNITION;

EID: 84910097441
PISSN: 2308457X
EISSN: 19909772
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited : (61)

References (25)

1
- 79959828814
- Deep-structured hidden conditional random fields for phonetic recognition
- D. Yu and L. Deng, "Deep-structured hidden conditional random fields for phonetic recognition," in Proc. Interspeech, 2010, pp. 2986-2989.
- (2010) Proc. Interspeech , pp. 2986-2989
- Yu, D.¹ Deng, L.²

2
- 84906277631
- Multiband long-term signal variability features for robust voice activity detection
- A. Tsiartas, T. Chaspari, N. Katsamanis, P. Ghosh, M. Li, M. Van Segbroeck, A. Potamianos, and S. S. Narayanan, "Multiband long-term signal variability features for robust voice activity detection," in Proc. Interspeech, 2013, pp. 718-722.
- (2013) Proc. Interspeech , pp. 718-722
- Tsiartas, A.¹ Chaspari, T.² Katsamanis, N.³ Ghosh, P.⁴ Li, M.⁵ Van Segbroeck, M.⁶ Potamianos, A.⁷ Narayanan, S.S.⁸

3
- 0032762471
- A statistical model-based voice activity detection
- J. Sohn, N. S. Kim, and W. Sung, "A statistical model-based voice activity detection," IEEE Signal Process. Lett., vol. 6, no. 1, pp. 1-3, 1999.
- (1999) IEEE Signal Process. Lett. , vol.6 , Issue.1 , pp. 1-3
- Sohn, J.¹ Kim, N.S.² Sung, W.³

4
- 80052045343
- Convex combination of multiple statistical models with application to VAD
- T. Petsatodis, C. Boukis, F. Talantzis, Z. Tan, and R. Prasad, "Convex combination of multiple statistical models with application to VAD," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 8, pp. 2314-2327, 2011.
- (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , Issue.8 , pp. 2314-2327
- Petsatodis, T.¹ Boukis, C.² Talantzis, F.³ Tan, Z.⁴ Prasad, R.⁵

5
- 77950091897
- Voice activity detection based on statistical models and machine learning approaches
- J. W. Shin, J. H. Chang, and N. S. Kim, "Voice activity detection based on statistical models and machine learning approaches," Computer Speech & Lang., vol. 24, no. 3, pp. 515-530, 2010.
- (2010) Computer Speech & Lang. , vol.24 , Issue.3 , pp. 515-530
- Shin, J.W.¹ Chang, J.H.² Kim, N.S.³

6
- 79959838316
- Voice activity detection based on conditional random fields using multiple features
- A. Saito, Y. Nankaku, A. Lee, and K. Tokuda, "Voice activity detection based on conditional random fields using multiple features," in Proc. Interspeech, 2010, pp. 2086-2089.
- (2010) Proc. Interspeech , pp. 2086-2089
- Saito, A.¹ Nankaku, Y.² Lee, A.³ Tokuda, K.⁴

7
- 84875828442
- Voice activity detection via noise reducing using non-negative sparse coding
- P. Teng and Y. Jia, "Voice activity detection via noise reducing using non-negative sparse coding," IEEE Signal Process. Lett., vol. 20, no. 5, pp. 475-478, 2013.
- (2013) IEEE Signal Process. Lett. , vol.20 , Issue.5 , pp. 475-478
- Teng, P.¹ Jia, Y.²

8
- 84910100905
- Voice activity detection in presence of transient noise using spectral clustering
- S. Mousazadeh and I. Cohen, "Voice activity detection in presence of transient noise using spectral clustering," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 6, pp. 1261-1271, 2013.
- (2013) IEEE Trans. Audio, Speech, Lang. Process. , vol.21 , Issue.6 , pp. 1261-1271
- Mousazadeh, S.¹ Cohen, I.²

9
- 77956289831
- Discriminative training for multiple observation likelihood ratio based voice activity detection
- T. Yu and J. H. L. Hansen, "Discriminative training for multiple observation likelihood ratio based voice activity detection," IEEE Signal Process. Lett., vol. 17, no. 11, pp. 897-900, 2010.
- (2010) IEEE Signal Process. Lett. , vol.17 , Issue.11 , pp. 897-900
- Yu, T.¹ Hansen, J.H.L.²

10
- 80053614636
- Voice activity detection based on an unsupervised learning framework
- D. Ying, Y. Yan, J. Dang, and F. Soong, "Voice activity detection based on an unsupervised learning framework," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 8, pp. 2624-2644, 2011.
- (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , Issue.8 , pp. 2624-2644
- Ying, D.¹ Yan, Y.² Dang, J.³ Soong, F.⁴

11
- 85008579584
- Multiple acoustic model-based discriminative likelihood ratio weighting for voice activity detection
- Y. Suh and H. Kim, "Multiple acoustic model-based discriminative likelihood ratio weighting for voice activity detection," IEEE Signal Process. Lett., vol. 19, no. 8, pp. 507-510, 2012.
- (2012) IEEE Signal Process. Lett. , vol.19 , Issue.8 , pp. 507-510
- Suh, Y.¹ Kim, H.²

12
- 84890490765
- Robust front-end processing for speaker identification over extremely degraded communication channels
- S. O. Sadjadi and J. H. Hansen, "Robust front-end processing for speaker identification over extremely degraded communication channels," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2013, pp. 7214-7218.
- (2013) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 7214-7218
- Sadjadi, S.O.¹ Hansen, J.H.²

13
- 84890484287
- Recurrent neural networks for voice activity detection
- T. Hughes and K. Mierle, "Recurrent neural networks for voice activity detection," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2013, pp. 7378-7382.
- (2013) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 7378-7382
- Hughes, T.¹ Mierle, K.²

14
- 84872300403
- Deep belief networks based voice activity detection
- X.-L. Zhang and J. Wu, "Deep belief networks based voice activity detection," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 4, pp. 697-710, 2013.
- (2013) IEEE Trans. Audio, Speech, Lang. Process. , vol.21 , Issue.4 , pp. 697-710
- Zhang, X.-L.¹ Wu, J.²

15
- 84906228076
- Speech activity detection on YouTube using deep neural networks
- N. Ryant, M. Liberman, and J. Yuan, "Speech activity detection on YouTube using deep neural networks," in Proc. Interspeech, 2013, pp. 728-731.
- (2013) Proc. Interspeech , pp. 728-731
- Ryant, N.¹ Liberman, M.² Yuan, J.³

16
- 84905233552
- A feature study for classification-based speech separation at very low signal-to-noise ratio
- in press
- J. Chen, Y. Wang, and D. L. Wang, "A feature study for classification-based speech separation at very low signal-to-noise ratio," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2014, in press.
- (2014) Proc. Int. Conf. Acoust., Speech, Signal Process.
- Chen, J.¹ Wang, Y.² Wang, D.L.³

17
- 84910032338
- Aurora working group: DSR front end LVCSR evaluation AU/384/02
- State Univ. Tech. Rep
- D. Pearce and J. Picone, "Aurora working group: DSR front end LVCSR evaluation AU/384/02," Inst. for Signal & Inform. Process., Mississippi State Univ., Tech. Rep., 2002.
- (2002) Inst. for Signal & Inform. Process., Mississippi
- Pearce, D.¹ Picone, J.²

18
- 80053403826
- Ensemble methods in machine learning
- T. G. Dietterich, "Ensemble methods in machine learning," Multiple Classifier Sys., pp. 1-15, 2000.
- (2000) Multiple Classifier Sys. , pp. 1-15
- Dietterich, T.G.¹

19
- 84890527827
- Improving deep neural networks for LVCSR using rectified linear units and dropout
- G. E. Dahl, T. N. Sainath, and G. E. Hinton, "Improving deep neural networks for LVCSR using rectified linear units and dropout," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2013, pp. 8609-8613.
- (2013) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 8609-8613
- Dahl, G.E.¹ Sainath, T.N.² Hinton, G.E.³

20
- 84877760312
- Large scale distributed deep networks
- J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, Q. V. Le, M. Z. Mao, M. Ranzato, A. W. Senior, P. A. Tucker et al., "Large scale distributed deep networks," in Adv. Neural Inform. Process. Sys., 2012, pp. 1232-1240.
- (2012) Adv. Neural Inform. Process. Sys. , pp. 1232-1240
- Dean, J.¹ Corrado, G.² Monga, R.³ Chen, K.⁴ Devin, M.⁵ Le, Q.V.⁶ Mao, M.Z.⁷ Ranzato, M.⁸ Senior, A.W.⁹ Tucker, P.A.¹⁰

21
- 84897510162
- On the importance of initialization and momentum in deep learning
- I. Sutskever, J. Martens, G. Dahl, and G. Hinton, "On the importance of initialization and momentum in deep learning," in Proc. Int. Conf. Machine Learn., 2013, pp. 1-8.
- (2013) Proc. Int. Conf. Machine Learn. , pp. 1-8
- Sutskever, I.¹ Martens, J.² Dahl, G.³ Hinton, G.⁴

22
- 38849102154
- Auditory segmentation based on onset and offset analysis
- G. Hu and D. L. Wang, "Auditory segmentation based on onset and offset analysis," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 2, pp. 396-405, 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.2 , pp. 396-405
- Hu, G.¹ Wang, D.L.²

23
- 84871829474
- A multistream feature framework based on bandpass modulation filtering for robust speech recognition
- S. K. Nemala, K. Patil, and M. Elhilali, "A multistream feature framework based on bandpass modulation filtering for robust speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 2, pp. 416-426, 2013.
- (2013) IEEE Trans. Audio, Speech, Lang. Process. , vol.21 , Issue.2 , pp. 416-426
- Nemala, S.K.¹ Patil, K.² Elhilali, M.³

24
- 71049180205
- Computational auditory scene analysis: Principles, algorithms and applications
- Wiley-IEEE Press
- D. L. Wang and G. J. Brown, Computational Auditory Scene Analysis: Principles, Algorithms and Applications. Wiley-IEEE Press, 2006.
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms and Applications
- Wang, D.L.¹ Brown, G.J.²

25
- 23344452899
- Statistical voice activity detection using a multiple observation likelihood ratio test
- J. Ramírez, J. C. Segura, C. Benítez, L. García, and A. Rubio, "Statistical voice activity detection using a multiple observation likelihood ratio test," IEEE Signal Process. Lett., vol. 12, no. 10, pp. 689-692, 2005.
- (2005) IEEE Signal Process. Lett. , vol.12 , Issue.10 , pp. 689-692
- Ramírez, J.¹ Segura, J.C.² Benítez, C.³ García, L.⁴ Rubio, A.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.