SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 21, Issue 4, 2013, Pages 697-710

Deep belief networks based voice activity detection

(2) Zhang, Xiao Lei a Wu, Ji a

a TSINGHUA UNIVERSITY (China)

Author keywords

Deep learning; information fusion; voice activity detection

Indexed keywords

ACOUSTIC FEATURES; DEEP BELIEF NETWORKS; DEEP LEARNING; EMPIRICAL COMPARISON; GENERATIVE MODEL; HIDDEN LAYERS; INPUT LAYERS; LINEAR CLASSIFIERS; MACHINE-LEARNING; MULTIPLE FEATURE FUSION; MULTIPLE FEATURES; PERFORMANCE ANALYSIS; REAL-TIME DETECTION; VOICE ACTIVITY DETECTION;

BAYESIAN NETWORKS; FEATURE EXTRACTION; INFORMATION FUSION; LEARNING SYSTEMS;

SPEECH RECOGNITION;

EID: 84872300403 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2012.2229986 Document Type: Article

Times cited : (319)

References (74)

1
- 0031238211
- ITU-T recommendation G.729 annex B: A silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications
- A. Benyassine, E. Shlomot, H. Y. Su, D. Massaloux, C. Lamblin, and J. P. Petit, "ITU-T recommendation G.729 Annex B: A silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications," IEEE Commun. Mag., vol. 35, no. 9, pp. 64-73, Sep. 1997. (Pubitemid 127557050)
- (1997) IEEE Communications Magazine , vol.35 , Issue.9 , pp. 64-73
- Benyassine, A.¹ Shlomot, E.² Su, H.-Y.³ Massaloux, D.⁴ Lamblin, C.⁵ Petit, J.-P.⁶

2
- 77957272576
- Speech processing, transmission and quality aspects (STQ); Distributed speech recognition; Advanced front-end feature extraction algorithm; Compression algorithms
- "Speech processing, transmission and quality aspects (STQ); distributed speech recognition; advanced front-end feature extraction algorithm; compression algorithms," ETSI ES, vol. 202, no. 050.
- ETSI ES , vol.202 , Issue.50

3
- 84869416544
- Towards generalizing classification based speech separation
- Jan.
- K. Han and D. L. Wang, "Towards generalizing classification based speech separation," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 1, pp. 1-27, Jan. 2012.
- (2012) IEEE Trans. Audio, Speech, Lang. Process , vol.21 , Issue.1 , pp. 1-27
- Han, K.¹ Wang, D.L.²

4
- 79959840616
- Investigation of full-sequence training of deep belief networks for speech recognition
- A. Mohamed, D. Yu, and L. Deng, "Investigation of full-sequence training of deep belief networks for speech recognition," in Proc. Interspeech-10, 2010, pp. 2846-2849.
- (2010) Proc. Interspeech-10 , pp. 2846-2849
- Mohamed, A.¹ Yu, D.² Deng, L.³

5
- 79959828814
- Deep-structured hidden conditional random fields for phonetic recognition
- D. Yu and L. Deng, "Deep-structured hidden conditional random fields for phonetic recognition," in Proc. Interspeech-10, 2010, pp. 2986-2989.
- (2010) Proc. Interspeech-10 , pp. 2986-2989
- Yu, D.¹ Deng, L.²

6
- 84055211743
- Acoustic modeling using deep belief networks
- Jan.
- A. Mohamed, G. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 1, pp. 14-22, Jan. 2012.
- (2012) IEEE Trans. Audio, Speech, Lang. Process , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.² Hinton, G.³

7
- 84055222005
- Context-dependent pre-trained deep neural networks for large vocabulary speech recognition
- G. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large vocabulary speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 1, pp. 30-42, 2012.
- (2012) IEEE Trans. Audio, Speech, Lang. Process , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.¹ Yu, D.² Deng, L.³ Acero, A.⁴

8
- 84867131826
- Conversational speech transcription using context-dependent deep neural networks
- D. Yu, F. Seide, and G. Li, "Conversational speech transcription using context-dependent deep neural networks," in Proc. 29th Int. Conf. Mach. Learn., 2012, pp. 1-2.
- (2012) Proc. 29th Int. Conf. Mach. Learn , pp. 1-2
- Yu, D.¹ Seide, F.² Li, G.³

9
- 84867732862
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- Nov.
- G. Hinton et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Process. Mag., vol. 29, no. 6, pp. 82-97, Nov. 2012.
- (2012) IEEE Signal Process. Mag , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹

10
- 51449114537
- Applying support vector machines to voice activity detection
- D. Enqing, L. Guizhong, Z. Yatong, and Z. Xiaodi, "Applying support vector machines to voice activity detection," in Proc. Int. Conf. Signal Process., 2002, vol. 2, pp. 1124-1127.
- (2002) Proc. Int. Conf. Signal Process , vol.2 , pp. 1124-1127
- Enqing, D.¹ Guizhong, L.² Yatong, Z.³ Xiaodi, Z.⁴

11
- 67650137747
- Discriminative weight training for a statistical model-based voice activity detection
- S. I. Kang, Q. H. Jo, and J. H. Chang, "Discriminative weight training for a statistical model-based voice activity detection," IEEE Signal Process. Lett., vol. 15, pp. 170-173, 2008.
- (2008) IEEE Signal Process. Lett , vol.15 , pp. 170-173
- Kang, S.I.¹ Jo, Q.H.² Chang, J.H.³

12
- 65549106422
- Statistical model-based voice activity detection using support vector machine
- Q. H. Jo, J. H. Chang, J. W. Shin, and N. S. Kim, "Statistical model-based voice activity detection using support vector machine," IET Signal Process., vol. 3, no. 3, pp. 205-210, 2009.
- (2009) IET Signal Process , vol.3 , Issue.3 , pp. 205-210
- Jo, Q.H.¹ Chang, J.H.² Shin, J.W.³ Kim, N.S.⁴

13
- 77950091897
- Voice activity detection based on statistical models and machine learning approaches
- J. W. Shin, J. H. Chang, and N. S. Kim, "Voice activity detection based on statistical models and machine learning approaches," Comput. Speech Lang., vol. 24, no. 3, pp. 515-530, 2010.
- (2010) Comput. Speech Lang , vol.24 , Issue.3 , pp. 515-530
- Shin, J.W.¹ Chang, J.H.² Kim, N.S.³

14
- 77956289831
- Discriminative training for multiple observation likelihood ratio based voice activity detection
- T. Yu and J. H. L. Hansen, "Discriminative training for multiple observation likelihood ratio based voice activity detection," IEEE Signal Process. Lett., vol. 17, no. 11, pp. 897-900, 2010.
- (2010) IEEE Signal Process. Lett , vol.17 , Issue.11 , pp. 897-900
- Yu, T.¹ Hansen, J.H.L.²

15
- 79952611095
- Maximum margin clustering based statistical VAD with multiple observation compound feature
- J. Wu and X. L. Zhang, "Maximum margin clustering based statistical VAD with multiple observation compound feature," IEEE Signal Process. Lett., vol. 18, no. 5, pp. 283-286, 2011.
- (2011) IEEE Signal Process. Lett , vol.18 , Issue.5 , pp. 283-286
- Wu, J.¹ Zhang, X.L.²

16
- 79959756010
- Efficient multiple kernel support vector machine based voice activity detection
- J. Wu and X. L. Zhang, "Efficient multiple kernel support vector machine based voice activity detection," IEEE Signal Process. Lett., vol. 18, no. 8, pp. 466-469, 2011.
- (2011) IEEE Signal Process. Lett , vol.18 , Issue.8 , pp. 466-469
- Wu, J.¹ Zhang, X.L.²

17
- 84869505051
- Linearithmic time sparse and convex maximum margin clustering
- Dec.
- X. L. Zhang and J. Wu, "Linearithmic time sparse and convex maximum margin clustering," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 42, no. 6, pp. 1669-1692, Dec. 2012.
- (2012) IEEE Trans. Syst., Man, Cybern. B, Cybern , vol.42 , Issue.6 , pp. 1669-1692
- Zhang, X.L.¹ Wu, J.²

18
- 85008579584
- Multiple acoustic model-based discriminative likelihood ratio weighting for voice activity detection
- Y. Suh and H. Kim, "Multiple acoustic model-based discriminative likelihood ratio weighting for voice activity detection," IEEE Signal Process. Lett., vol. 19, no. 8, pp. 507-510, 2012.
- (2012) IEEE Signal Process. Lett , vol.19 , Issue.8 , pp. 507-510
- Suh, Y.¹ Kim, H.²

19
- 33750216968
- SVM-based speech endpoint detection using contextual speech features
- J. Ramírez, P. Yélamos, J. M. Górriz, and J. C. Segura, "SVM-based speech endpoint detection using contextual speech features," Electron. Lett., vol. 42, no. 7, pp. 426-428, 2006.
- (2006) Electron. Lett , vol.42 , Issue.7 , pp. 426-428
- Ramírez, J.¹ Yélamos, P.² Górriz, J.M.³ Segura, J.C.⁴

20
- 78649271854
- Online unsupervised classification with model comparison in the variational bayes framework for voice activity detection
- Dec.
- D. Cournapeau, S. Watanabe, A. Nakamura, and T. Kawahara, "Online unsupervised classification with model comparison in the variational Bayes framework for voice activity detection," IEEE J. Sel. Topics Signal Process., vol. 4, no. 6, pp. 1071-1083, Dec. 2010.
- (2010) IEEE J. Sel. Topics Signal Process , vol.4 , Issue.6 , pp. 1071-1083
- Cournapeau, D.¹ Watanabe, S.² Nakamura, A.³ Kawahara, T.⁴

21
- 80053614636
- Voice activity detection based on an unsupervised learning framework
- Nov.
- D. Ying, Y. Yan, J. Dang, and F. Soong, "Voice activity detection based on an unsupervised learning framework," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 8, pp. 2624-2644, Nov. 2011.
- (2011) IEEE Trans. Audio, Speech, Lang. Process , vol.19 , Issue.8 , pp. 2624-2644
- Ying, D.¹ Yan, Y.² Dang, J.³ Soong, F.⁴

22
- 0032762471
- A statistical model-based voice activity detection
- J. Sohn, N. S. Kim, and W. Sung, "A statistical model-based voice activity detection," IEEE Signal Process. Lett., vol. 6, no. 1, pp. 1-3, 1999.
- (1999) IEEE Signal Process. Lett , vol.6 , Issue.1 , pp. 1-3
- Sohn, J.¹ Kim, N.S.² Sung, W.³

23
- 0042863279
- A soft voice activity detector based on a Laplacian-Gaussian model
- Sep
- S. Gazor and W. Zhang, "A soft voice activity detector based on a Laplacian-Gaussian model," IEEE Trans. Speech, Audio Process., vol. 11, no. 5, pp. 498-505, Sep. 2003.
- (2003) IEEE Trans. Speech, Audio Process , vol.11 , Issue.5 , pp. 498-505
- Gazor, S.¹ Zhang, W.²

24
- 1842476689
- Efficient voice activity detection algorithms using long-term speech information
- J. Ramírez, J. C. Segura, C. Benitez, A. D. L. Torre, and A. Rubio, "Efficient voice activity detection algorithms using long-term speech information," Speech Commun., vol. 42, no. 3-4, pp. 271-287, 2004.
- (2004) Speech Commun , vol.42 , Issue.3-4 , pp. 271-287
- Ramírez, J.¹ Segura, J.C.² Benitez, C.³ Torre, A.D.L.⁴ Rubio, A.⁵

25
- 23344452899
- Statistical voice activity detection using a multiple observation likelihood ratio test
- DOI 10.1109/LSP.2005.855551
- J. Ramírez, J. C. Segura, C. Benítez, L. García, and A. Rubio, "Statistical voice activity detection using a multiple observation likelihood ratio test," IEEE Signal Process. Lett., vol. 12, no. 10, pp. 689-692, Oct. 2005. (Pubitemid 41448576)
- (2005) IEEE Signal Processing Letters , vol.12 , Issue.10 , pp. 689-692
- Ramirez, J.¹ Segura, J.C.² Benitez, C.³ Garcia, L.⁴ Rubio, A.⁵

26
- 33744532633
- Voice activity detection based on multiple statistical models
- DOI 10.1109/TSP.2006.874403
- J. H. Chang, N. S. Kim, and S. K. Mitra, "Voice activity detection based on multiple statistical models," IEEE Trans. Signal Process., vol. 54, no. 6, pp. 1965-1976, Jun. 2006. (Pubitemid 43811393)
- (2006) IEEE Transactions on Signal Processing , vol.54 , Issue.6 , pp. 1965-1976
- Chang, J.-H.¹ Kim, N.S.² Mitra, S.K.³

27
- 64149119904
- Improved voice activity detection using contextual multiple hypothesis testing for robust speech recognition
- Nov
- J. Ramírez, J. Segura, J. Górriz, and L. García, "Improved voice activity detection using contextual multiple hypothesis testing for robust speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2177-2189, Nov. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , Issue.8 , pp. 2177-2189
- Ramírez, J.¹ Segura, J.² Górriz, J.³ García, L.⁴

28
- 44149083948
- A soft voice activity detection using GARCH filter and variance Gamma distribution
- May
- R. Tahmasbi and S. Rezaei, "A soft voice activity detection using GARCH filter and variance Gamma distribution," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 4, pp. 1129-1134, May 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , Issue.4 , pp. 1129-1134
- Tahmasbi, R.¹ Rezaei, S.²

29
- 80052045343
- Convex combination of multiple statistical models with application to VAD
- Nov.
- T. Petsatodis, C. Boukis, F. Talantzis, Z. Tan, and R. Prasad, "Convex combination of multiple statistical models with application to VAD," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 8, pp. 2314-2327, Nov. 2011.
- (2011) IEEE Trans. Audio, Speech, Lang. Process , vol.19 , Issue.8 , pp. 2314-2327
- Petsatodis, T.¹ Boukis, C.² Talantzis, F.³ Tan, Z.⁴ Prasad, R.⁵

30
- 33745805403
- A fast learning algorithm for deep belief nets
- DOI 10.1162/neco.2006.18.7.1527
- G. Hinton, S. Osindero, and Y. Teh, "A fast learning algorithm for deep belief nets," Neural Comput., vol. 18, no. 7, pp. 1527-1554, 2006. (Pubitemid 44024729)
- (2006) Neural Computation , vol.18 , Issue.7 , pp. 1527-1554
- Hinton, G.E.¹ Osindero, S.² Teh, Y.-W.³

31
- 33746600649
- Reducing the dimensionality of data with neural networks
- DOI 10.1126/science.1127647
- G. Hinton and R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504-507, 2006. (Pubitemid 44148451)
- (2006) Science , vol.313 , Issue.5786 , pp. 504-507
- Hinton, G.E.¹ Salakhutdinov, R.R.²

32
- 69349090197
- Learning deep architectures for AI
- Y. Bengio, "Learning deep architectures for AI," Foundat. Trends® in Mach. Learn., vol. 2, no. 1, pp. 1-127, 2009.
- (2009) Foundat. Trends® in Mach. Learn , vol.2 , Issue.1 , pp. 1-127
- Bengio, Y.¹

33
- 84862612564
- On contrastive divergence learning
- M. A. Carreira-Perpinan and G. E. Hinton, "On contrastive divergence learning," in Proc. Int. Conf. Artif. Intell. Stat., 2005, pp. 17-25.
- (2005) Proc. Int. Conf. Artif. Intell. Stat , pp. 17-25
- Carreira-Perpinan, M.A.¹ Hinton, G.E.²

34
- 84864073449
- Greedy layerwise training of deep networks
- Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle, "Greedy layerwise training of deep networks," Proc. Adv. Neural Inf. Process. Syst., vol. 19, pp. 153-161, 2007.
- (2007) Proc. Adv. Neural Inf. Process. Syst , vol.19 , pp. 153-161
- Bengio, Y.¹ Lamblin, P.² Popovici, D.³ Larochelle, H.⁴

35
- 84861125212
- A practical guide to training restricted Boltzmann machines
- G. Hinton, "A practical guide to training restricted Boltzmann machines," Momentum, vol. 9, pp. 1-19, 2010.
- (2010) Momentum , vol.9 , pp. 1-19
- Hinton, G.¹

36
- 79959858900
- Learning in the deep-structured conditional random fields
- D. Yu, L. Deng, and S. Wang, "Learning in the deep-structured conditional random fields," in Proc. NIPS Workshop, 2009, pp. 1-8.
- (2009) Proc. NIPS Workshop , pp. 1-8
- Yu, D.¹ Deng, L.² Wang, S.³

37
- 85032782045
- Deep learning and its applications to signal and information processing [exploratory dsp]
- Jan.
- D. Yu and L. Deng, "Deep learning and its applications to signal and information processing [exploratory DSP]," IEEE Signal Process. Mag., vol. 28, no. 1, pp. 145-154, Jan. 2011.
- (2011) IEEE Signal Process. Mag , vol.28 , Issue.1 , pp. 145-154
- Yu, D.¹ Deng, L.²

38
- 56449095373
- A unified architecture for natural language processing: Deep neural networks with multitask learning
- R. Collobert and J. Weston, "A unified architecture for natural language processing: Deep neural networks with multitask learning," in Proc. 25th Int. Conf. Mach. Learn., 2008, pp. 160-167.
- (2008) Proc. 25th Int. Conf. Mach. Learn , pp. 160-167
- Collobert, R.¹ Weston, J.²

39
- 80052067786
- Reverberant speech segregation based on multipitch tracking and classification
- Nov.
- Z. Jin and D. L. Wang, "Reverberant speech segregation based on multipitch tracking and classification," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 8, pp. 2328-2337, Nov. 2011.
- (2011) IEEE Trans. Audio, Speech, Lang. Process , vol.19 , Issue.8 , pp. 2328-2337
- Jin, Z.¹ Wang, D.L.²

40
- 84863281307
- A tandem algorithm for singing pitch extraction and voice separation from music accompaniment
- Jul.
- C. L. Hsu, D. L. Wang, J. S. R. Jang, and K. Hu, "A tandem algorithm for singing pitch extraction and voice separation from music accompaniment," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 5, pp. 1482-1491, Jul. 2012.
- (2012) IEEE Trans. Audio, Speech, Lang. Process , vol.20 , Issue.5 , pp. 1482-1491
- Hsu, C.L.¹ Wang, D.L.² Jang, J.S.R.³ Hu, K.⁴

41
- 84870477511
- Exploring monaural features for classification-based speech segregation
- Jan.
- Y. X. Wang, K. Han, and D. L. Wang, "Exploring monaural features for classification-based speech segregation," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 2, pp. 270-279, Jan. 2013.
- (2013) IEEE Trans. Audio, Speech, Lang. Process , vol.21 , Issue.2 , pp. 270-279
- Wang, Y.X.¹ Han, K.² Wang, D.L.³

42
- 79959823361
- A new VAD framework using statistical model and human knowledge based empirical rule
- J. Wu, X. L. Zhang, and W. Li, "A new VAD framework using statistical model and human knowledge based empirical rule," in Proc. Interspeech-10, 2010, pp. 3090-3093.
- (2010) Proc. Interspeech-10 , pp. 3090-3093
- Wu, J.¹ Zhang, X.L.² Li, W.³

43
- 84869496026
- An efficient voice activity detection algorithm by combining statistical model and energy detection
- J. Wu and X. L. Zhang, "An efficient voice activity detection algorithm by combining statistical model and energy detection," EURASIP J. Adv. Signal Process., vol. 2011, no. 1, pp. 18-27, 2011.
- (2011) EURASIP J. Adv. Signal Process , vol.2011 , Issue.1 , pp. 18-27
- Wu, J.¹ Zhang, X.L.²

44
- 77949522811
- Why does unsupervised pre-training help deep learning?
- D. Erhan, Y. Bengio, A. Courville, P. A. Manzagol, P. Vincent, and S. Bengio, "Why does unsupervised pre-training help deep learning?," J. Mach. Learn. Res., vol. 11, pp. 625-660, 2010.
- (2010) J. Mach. Learn. Res , vol.11 , pp. 625-660
- Erhan, D.¹ Bengio, Y.² Courville, A.³ Manzagol, P.A.⁴ Vincent, P.⁵ Bengio, S.⁶

45
- 84987702417
- The AURORA experimental framework for the performance evaluation of speech recognition systems under noisy conditions
- D. Pearce et al., "The AURORA experimental framework for the performance evaluation of speech recognition systems under noisy conditions," Proc. ICSLP-00, vol. 4, pp. 29-32, 2000.
- (2000) Proc. ICSLP-00 , vol.4 , pp. 29-32
- Pearce, D.¹

46
- 36049044257
- [Online]
- D. P. W. Ellis, "PLP and RASTA (and MFCC, and Inversion) in Matlab," 2005 [Online]. Available: http://www.ee.columbia.edu/~dpwe/resources/matlab/rastamat/
- (2005) PLP and RASTA (and MFCC, and Inversion) in Matlab
- Ellis, D.P.W.¹

47
- 0038712550
- SNR estimation based on amplitude modulation analysis with applications to noise suppression
- May
- J. Tchorz and B. Kollmeier, "SNR estimation based on amplitude modulation analysis with applications to noise suppression," IEEE Trans. Speech, Audio Process., vol. 11, no. 3, pp. 184-192, May 2003.
- (2003) IEEE Trans. Speech, Audio Process , vol.11 , Issue.3 , pp. 184-192
- Tchorz, J.¹ Kollmeier, B.²

48
- 70349093614
- An algorithm that improves speech intelligibility in noise for normal-hearing listeners
- G. Kim, Y. Lu, Y. Hu, and P. C. Loizou, "An algorithm that improves speech intelligibility in noise for normal-hearing listeners," J. Acoust. Soc. Amer., vol. 126, pp. 1486-1494, 2009.
- (2009) J. Acoust. Soc. Amer , vol.126 , pp. 1486-1494
- Kim, G.¹ Lu, Y.² Hu, Y.³ Loizou, P.C.⁴

49
- 0037767686
- A multipitch tracking algorithm for noisy speech
- May
- M. Wu, D. L. Wang, and G. J. Brown, "A multipitch tracking algorithm for noisy speech," IEEE Trans. Speech, Audio Process., vol. 11, no. 3, pp. 229-241, May 2003.
- (2003) IEEE Trans. Speech, Audio Process , vol.11 , Issue.3 , pp. 229-241
- Wu, M.¹ Wang, D.L.² Brown, G.J.³

50
- 4644265990
- Monaural speech segregation based on pitch tracking and amplitude modulation
- Sep
- G. Hu and D. Wang, "Monaural speech segregation based on pitch tracking and amplitude modulation," IEEE Trans. Neural Netw., vol. 15, no. 5, pp. 1135-1150, Sep. 2004.
- (2004) IEEE Trans. Neural Netw , vol.15 , Issue.5 , pp. 1135-1150
- Hu, G.¹ Wang, D.²

51
- 65249103478
- A supervised learning approach to monaural segregation of reverberant speech
- May
- Z. Jin and D. L. Wang, "A supervised learning approach to monaural segregation of reverberant speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 4, pp. 625-638, May 2009.
- (2009) IEEE Trans. Audio, Speech, Lang. Process , vol.17 , Issue.4 , pp. 625-638
- Jin, Z.¹ Wang, D.L.²

52
- 77955695149
- A tandem algorithm for pitch estimation and voiced speech segregation
- Nov.
- G. Hu and D. L. Wang, "A tandem algorithm for pitch estimation and voiced speech segregation," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 8, pp. 2067-2079, Nov. 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process , vol.18 , Issue.8 , pp. 2067-2079
- Hu, G.¹ Wang, D.L.²

53
- 85008056718
- HMM-based multipitch tracking for noisy and reverberant speech
- Jul.
- Z. Jin and D. L. Wang, "HMM-based multipitch tracking for noisy and reverberant speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 5, pp. 1091-1102, Jul. 2011.
- (2011) IEEE Trans. Audio, Speech, Lang. Process , vol.19 , Issue.5 , pp. 1091-1102
- Jin, Z.¹ Wang, D.L.²

54
- 84863281307
- A tandem algorithm for singing pitch extraction and voice separation from music accompaniment
- Jul.
- C. L. Hsu, D. L. Wang, J. S. R. Jang, and K. Hu, "A tandem algorithm for singing pitch extraction and voice separation from music accompaniment," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 5, pp. 1482-1491, Jul. 2012.
- (2012) IEEE Trans. Audio, Speech, Lang. Process , vol.20 , Issue.5 , pp. 1482-1491
- Hsu, C.L.¹ Wang, D.L.² Jang, J.S.R.³ Hu, K.⁴

55
- 0036299273
- Pitch determination and voice quality analysis using subharmonic-to-harmonic ratio
- X. Sun, "Pitch determination and voice quality analysis using subharmonic-to-harmonic ratio," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2002, vol. 1, pp. 333-336.
- (2002) Proc. Int. Conf. Acoust., Speech, Signal Process , vol.1 , pp. 333-336
- Sun, X.¹

56
- 84872355668
- Enhanced variable rate codec, speech service option 3 for wideband spectrum digital systems
- 3GPP2 C.S0014-A
- "Enhanced variable rate codec, speech service option 3 for wideband spectrum digital systems," TIA/EIA/IS-127, 2004, 3GPP2 C.S0014-A.
- (2004) TIA/EIA/IS-127

57
- 4944228528
- [Online]
- C. W. Hsu, C. C. Chang, and C. J. Lin, "A practical guide to support vector classification," 2003 [Online]. Available: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
- (2003) A Practical Guide to Support Vector Classification
- Hsu, C.W.¹ Chang, C.C.² Lin, C.J.³

58
- 68949154453
- Sparse kernel SVMs via cutting-plane training
- T. Joachims and C. N. J. Yu, "Sparse kernel SVMs via cutting-plane training," Mach. Learn., vol. 76, no. 2, pp. 179-193, 2009.
- (2009) Mach. Learn , vol.76 , Issue.2 , pp. 179-193
- Joachims, T.¹ Yu, C.N.J.²

59
- 77956547440
- Simple and efficient multiple kernel learning by group lasso
- Z. Xu, R. Jin, H. Yang, I. King, and M. R. Lyu, "Simple and efficient multiple kernel learning by group lasso," in Proc. 27th Int. Conf. Mach. Learn., 2010, pp. 1175-1182.
- (2010) Proc. 27th Int. Conf. Mach. Learn , pp. 1175-1182
- Xu, Z.¹ Jin, R.² Yang, H.³ King, I.⁴ Lyu, M.R.⁵

60
- 84872343315
- Deep learning of representations for unsupervised and transfer learning
- Y. Bengio, "Deep learning of representations for unsupervised and transfer learning," in Proc. ICML Workshop Unsupervised Transfer Learn., 2011, vol. 7, pp. 1-20.
- (2011) Proc. ICML Workshop Unsupervised Transfer Learn. , vol.7 , pp. 1-20
- Bengio, Y.¹

61
- 0003684441
- Cambridge MA: MIT Press
- A. S. Bregman, Auditory Scene Analysis: The Perceptual Organization of Sound. Cambridge, MA: MIT Press, 1994.
- (1994) Auditory Scene Analysis: The Perceptual Organization of Sound
- Bregman, A.S.¹

62
- 82255178542
- New York: Wiley-IEEE Press
- D. L. Wang and G. J. Brown, Computational Auditory Scene Analysis: Principles, Algorithms and Applications. New York: Wiley-IEEE Press, 2006.
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms and Applications
- Wang, D.L.¹ Brown, G.J.²

63
- 0029206489
- Locally excitatory globally inhibitory oscillator networks
- Jan
- D. L. Wang and D. Terman, "Locally excitatory globally inhibitory oscillator networks," IEEE Trans. Neural Netw., vol. 6, no. 1, pp. 283-286, Jan. 1995.
- (1995) IEEE Trans. Neural Netw , vol.6 , Issue.1 , pp. 283-286
- Wang, D.L.¹ Terman, D.²

64
- 28244470718
- The time dimension for scene analysis
- Nov
- D. L. Wang, "The time dimension for scene analysis," IEEE Trans. Neural Netw., vol. 16, no. 6, pp. 1401-1426, Nov. 2005.
- (2005) IEEE Trans. Neural Netw , vol.16 , Issue.6 , pp. 1401-1426
- Wang, D.L.¹

65
- 85162494200
- Selecting receptive fields in deep networks
- A. Coates and A. Y. Ng, "Selecting receptive fields in deep networks," Proc. Adv. Neural Inf. Process. Syst., vol. 24, pp. 2528-2536, 2011.
- (2011) Proc. Adv. Neural Inf. Process. Syst , vol.24 , pp. 2528-2536
- Coates, A.¹ Ng, A.Y.²

66
- 85008054377
- Unvoiced speech segregation from nonspeech interference via CASA and spectral subtraction
- Aug.
- K. Hu and D. L. Wang, "Unvoiced speech segregation from nonspeech interference via CASA and spectral subtraction," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 6, pp. 1600-1609, Aug. 2011.
- (2011) IEEE Trans. Audio, Speech, Lang. Process , vol.19 , Issue.6 , pp. 1600-1609
- Hu, K.¹ Wang, D.L.²

67
- 84867946385
- An unsupervised approach to cochannel speech separation
- Jan.
- K. Hu and D. L. Wang, "An unsupervised approach to cochannel speech separation," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 1, pp. 122-131, Jan. 2013.
- (2013) IEEE Trans. Audio, Speech, Lang. Process , vol.21 , Issue.1 , pp. 122-131
- Hu, K.¹ Wang, D.L.²

68
- 70649111792
- Cambridge MA: MIT Press
- D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques. Cambridge, MA: MIT Press, 2009.
- (2009) Probabilistic Graphical Models: Principles and Techniques
- Koller, D.¹ Friedman, N.²

69
- 77956031473
- A survey on transfer learning
- Oct.
- S. J. Pan and Q. Yang, "A survey on transfer learning," IEEE Trans. Knowl. Data Eng., vol. 22, no. 10, pp. 1345-1359, Oct. 2010.
- (2010) IEEE Trans. Knowl. Data Eng , vol.22 , Issue.10 , pp. 1345-1359
- Pan, S.J.¹ Yang, Q.²

70
- 85006786586
- Domain adaptation in machine learning and speech processing
- F. Sha and B. Kingsbury, "Domain adaptation in machine learning and speech processing," in Tutorial of Interspeech-12, 2012, pp. 1-214.
- (2012) Tutorial of Interspeech-12 , pp. 1-214
- Sha, F.¹ Kingsbury, B.²

71
- 56449089103
- Extracting and composing robust features with denoising autoencoders
- P. Vincent, H. Larochelle, Y. Bengio, and P. A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proc. 25th Int. Conf. Mach. Learn., 2008, pp. 1096-1103.
- (2008) Proc. 25th Int. Conf. Mach. Learn , pp. 1096-1103
- Vincent, P.¹ Larochelle, H.² Bengio, Y.³ Manzagol, P.A.⁴

72
- 84867129067
- Marginalized denoising autoencoders for domain adaptation
- M. Chen, Z. Xu, K. Weinberger, and F. Sha, "Marginalized denoising autoencoders for domain adaptation," in Proc. 29th Int. Conf. Mach. Learn., 2012, pp. 1-8.
- (2012) Proc. 29th Int. Conf. Mach. Learn , pp. 1-8
- Chen, M.¹ Xu, Z.² Weinberger, K.³ Sha, F.⁴

73
- 84875681333
- Cocktail party processing via structured prediction
- Y. X. Wang and D. L. Wang, "Cocktail party processing via structured prediction," in Proc. Adv. Neural Inf. Process. Syst., 2012, pp. 1-8.
- (2012) Proc. Adv. Neural Inf. Process. Syst , pp. 1-8
- Wang, Y.X.¹ Wang, D.L.²

74
- 84883524644
- Building high-level features using large scale unsupervised learning
- Q. Le, R. Monga, M. Devin, G. Corrado, K. Chen, M. A. Ranzato, J. Dean, and A. Y. Ng, "Building high-level features using large scale unsupervised learning," in Proc. 29th Int. Conf. Mach. Learn., 2011, pp. 1-8.
- (2011) Proc. 29th Int. Conf. Mach. Learn , pp. 1-8
- Le, Q.¹ Monga, R.² Devin, M.³ Corrado, G.⁴ Chen, K.⁵ Ranzato, M.A.⁶ Dean, J.⁷ Ng, A.Y.⁸

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.