SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2014, Pages 2519-2523

Analyzing convolutional neural networks for speech activity detection in mismatched acoustic conditions

(4) Thomas, Samuel ᵃ; Ganapathy, Sriram ᵃ; Saon, George ᵃ; Soltau, Hagen ᵃ

ᵃ IBM T. J. Watson Research Center (United States)

Author keywords

Convolutional neural networks; Neural network adaptation; Speech activity detection

Indexed keywords

NEURAL NETWORKS; SPEECH RECOGNITION;

ACOUSTIC CONDITIONS; CONVOLUTIONAL NEURAL NETWORK; DEEP NEURAL NETWORKS; NEURAL NETWORK ADAPTATION; PERFORMANCE DEGRADATION; SPEECH ACTIVITY DETECTIONS; STATE-OF-THE-ART PERFORMANCE; TWO-DIMENSIONAL FILTERS;

SIGNAL PROCESSING;

EID: 84905248050
PISSN: 15206149
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2014.6854054
Document Type: Conference Paper

Times cited: 103

References (26)

1
- 0033903480
- Robust voice activity detection algorithm for estimating noise spectrum
- K. Woo, T. Yang, K. Park, and C. Lee, "Robust Voice Activity Detection Algorithm for Estimating Noise Spectrum," IEEE Electronics Letters, 2000.
- (2000) IEEE Electronics Letters
- Woo, K.¹ Yang, T.² Park, K.³ Lee, C.⁴

2
- 85026719883
- Robust energy normalization using speech/non-speech discriminator for German connected digit recognition
- R. Chengalvarayan, "Robust Energy Normalization using Speech/Non-speech Discriminator for German Connected Digit Recognition," in ISCA Eurospeech, 1999.
- (1999) ISCA Eurospeech
- Chengalvarayan, R.¹

3
- 79851495972
- A silence compression scheme for G.729 optimized for terminals conforming to Recommendation V.70
- ITU-T, "A Silence Compression Scheme for G.729 Optimized for Terminals Conforming to Recommendation V.70," in Recommendation G.729-Annex B, 1996.
- (1996) Recommendation G.729-Annex B

4
- 84905230151
- Robust voice activity detection using higher-order statistics in the LPC residual domain
- E. Nemer, R. Goubran, and S. Mahmoud, "Robust Voice Activity Detection using Higher-order Statistics in the LPC Residual Domain," IEEE Electronics Letters, 2000.
- (2000) IEEE Electronics Letters
- Nemer, E.¹ Goubran, R.² Mahmoud, S.³

5
- 84905248283
- The segmentation of multichannel meeting recording for automatic speech recognition
- J. Dines, J. Vepa, and T. Hain, "The Segmentation of Multichannel Meeting Recording for Automatic Speech Recognition," ISCA ICSLP, 2006.
- (2006) ISCA ICSLP
- Dines, J.¹ Vepa, J.² Hain, T.³

6
- 17344389852
- Robust speech recognition in noisy environments: The 2001 IBM SPINE evaluation system
- B. Kingsbury, G. Saon, L. Mangu, M. Padmanabhan, and R. Sarikaya, "Robust Speech Recognition in Noisy Environments: The 2001 IBM SPINE Evaluation System," IEEE ICASSP, 2002.
- (2002) IEEE ICASSP
- Kingsbury, B.¹ Saon, G.² Mangu, L.³ Padmanabhan, M.⁴ Sarikaya, R.⁵

7
- 0025041264
- Perceptual linear predictive (PLP) analysis of speech
- H. Hermansky, "Perceptual Linear Predictive (PLP) Analysis of Speech," The Journal of the Acoustical Society of America, 1990.
- (1990) The Journal of the Acoustical Society of America
- Hermansky, H.¹

8
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- S. Davis and P. Mermelstein, "Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences," IEEE Transactions on Acoustics, Speech and Signal Processing, 1980.
- (1980) IEEE Transactions on Acoustics, Speech and Signal Processing
- Davis, S.¹ Mermelstein, P.²

9
- 33646064275
- Multi-resolution RASTA filtering for TANDEM-based ASR
- H. Hermansky and P. Fousek, "Multi-resolution RASTA Filtering for TANDEM-based ASR," in ISCA Interspeech, 2005.
- (2005) ISCA Interspeech
- Hermansky, H.¹ Fousek, P.²

10
- 84905248277
- Multi-layer perceptron based speech activity detection for speaker verification
- S. Ganapathy, P. Rajan, and H. Hermansky, "Multi-layer Perceptron based Speech Activity Detection for Speaker Verification," IEEE WASPAA, 2011.
- (2011) IEEE WASPAA
- Ganapathy, S.¹ Rajan, P.² Hermansky, H.³

11
- 34047272330
- Discrimination of speech from non-speech based on multiscale spectrotemporal modulations
- N. Mesgarani, M. Slaney, and S. Shamma, "Discrimination of Speech from Non-speech based on Multiscale Spectrotemporal Modulations," IEEE Transactions on Audio, Speech, and Language Processing, 2006.
- (2006) IEEE Transactions on Audio, Speech, and Language Processing
- Mesgarani, N.¹ Slaney, M.² Shamma, S.³

12
- 0032203257
- Gradient based learning applied to document recognition
- Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient based Learning applied to Document Recognition," Proceedings of the IEEE, 1998.
- (1998) Proceedings of the IEEE
- Lecun, Y.¹ Bottou, L.² Bengio, Y.³ Haffner, P.⁴

13
- 84879123473
- The RATS radio traffic collection system
- K. Walker and S. Strassel, "The RATS Radio Traffic Collection System," in ISCA Odyssey, 2012.
- (2012) ISCA Odyssey
- Walker, K.¹ Strassel, S.²

14
- 84878535284
- Developing a speech activity detection system for the DARPA RATS program
- T. Ng et al., "Developing a Speech Activity Detection system for the DARPA RATS Program," in ISCA Interspeech, 2012.
- (2012) ISCA Interspeech
- Ng, T.¹

15
- 84878590831
- Acoustic and data-driven features for robust speech activity detection
- S. Thomas et al., "Acoustic and Data-driven Features for Robust Speech Activity Detection," in ISCA Interspeech, 2012.
- (2012) ISCA Interspeech
- Thomas, S.¹

16
- 84906222432
- The IBM speech activity detection system for the DARPA RATS program
- G. Saon et al., "The IBM Speech Activity Detection System for the DARPA RATS Program," in ISCA Interspeech, 2013.
- (2013) ISCA Interspeech
- Saon, G.¹

17
- 84906277631
- Multi-band long-term signal variability features for robust voice activity detection
- A. Tsiartas et al., "Multi-band Long-term Signal Variability Features for Robust Voice Activity Detection," in ISCA Interspeech, 2013.
- (2013) ISCA Interspeech
- Tsiartas, A.¹

18
- 84906248945
- All for one: Feature combination for highly channel-degraded speech activity detection
- M. Graciarena et al., "All for One: Feature Combination for Highly Channel-degraded Speech Activity Detection," in ISCA Interspeech, 2013.
- (2013) ISCA Interspeech
- Graciarena, M.¹

19
- 77954761139
- Learning methods for generic object recognition with invariance to pose and lighting
- Y. Lecun, F. Huang, and L. Bottou, "Learning Methods for Generic Object Recognition with Invariance to Pose and Lighting," in IEEE CVPR, 2004.
- (2004) IEEE CVPR
- Lecun, Y.¹ Huang, F.² Bottou, L.³

20
- 84867605836
- Applying convolutional neural network concepts to hybrid NN-HMM model for speech recognition
- O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, "Applying Convolutional Neural Network Concepts to Hybrid NN-HMM Model for Speech Recognition," in IEEE ICASSP, 2012.
- (2012) IEEE ICASSP
- Abdel-Hamid, O.¹ Mohamed, A.² Jiang, H.³ Penn, G.⁴

21
- 84890525984
- Deep convolutional neural networks for LVCSR
- T. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep Convolutional Neural Networks for LVCSR," in IEEE ICASSP, 2013.
- (2013) IEEE ICASSP
- Sainath, T.¹ Mohamed, A.² Kingsbury, B.³ Ramabhadran, B.⁴

22
- 84906257050
- Neural network acoustic models for the DARPA RATS program
- H. Soltau, H.K. Kuo, L. Mangu, G. Saon, and T. Beran, "Neural Network Acoustic Models for the DARPA RATS Program," in ISCA Interspeech, 2013.
- (2013) ISCA Interspeech
- Soltau, H.¹ Kuo, H.K.² Mangu, L.³ Saon, G.⁴ Beran, T.⁵

23
- 84937880519
- Connectionist speaker normalization and adaptation
- V. Abrash, H. Franco, A. Sankar, and M. Cohen, "Connectionist Speaker Normalization and Adaptation," in ISCA Eurospeech, 1995.
- (1995) ISCA Eurospeech
- Abrash, V.¹ Franco, H.² Sankar, A.³ Cohen, M.⁴

24
- 34548012893
- Linear hidden transformations for adaptation of hybrid ANN/HMM models
- R. Gemello, F. Mana, S. Scanzio, P. Laface, and R. De Mori, "Linear Hidden Transformations for Adaptation of Hybrid ANN/HMM Models," Speech Communication, 2007.
- (2007) Speech Communication
- Gemello, R.¹ Mana, F.² Scanzio, S.³ Laface, P.⁴ De Mori, R.⁵

25
- 84890478625
- Adaptation of context-dependent deep neural networks for automatic speech recognition
- K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong, "Adaptation of Context-dependent Deep Neural Networks for Automatic Speech Recognition," in IEEE SLT, 2012.
- (2012) IEEE SLT
- Yao, K.¹ Yu, D.² Seide, F.³ Su, H.⁴ Deng, L.⁵ Gong, Y.⁶

26
- 84906225505
- Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition
- O. Abdel-Hamid and H. Jiang, "Rapid and Effective Speaker Adaptation of Convolutional Neural Network based Models for Speech Recognition," in ISCA Interspeech, 2013.
- (2013) ISCA Interspeech
- Abdel-Hamid, O.¹ Jiang, H.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.