SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2013, Pages 6704-6708

Deep neural network features and semi-supervised training for low resource speech recognition

(4) Thomas, Samuel (a); Seltzer, Michael L. (b); Church, Kenneth (c); Hermansky, Hynek (a)

a JOHNS HOPKINS UNIVERSITY (United States)

b MICROSOFT RESEARCH (United States)

c IBM RESEARCH (United States)

Author keywords

bottleneck features; deep neural networks; Low resource; semi supervised training; speech recognition

Indexed keywords

ACOUSTIC MODEL; BOTTLENECK FEATURES; DEEP NEURAL NETWORKS; LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION; LOW RESOURCE; LOW-RESOURCE SETTINGS; LOW-RESOURCE SPEECH RECOGNITION; SEMI-SUPERVISED TRAININGS;

CONTINUOUS SPEECH RECOGNITION; SIGNAL PROCESSING; SPEECH RECOGNITION;

NEURAL NETWORKS;

EID: 84890474716
PISSN: 15206149
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2013.6638959
Document Type: Conference Paper

Times cited: (164)

References (38)

1
- 70349220094
- A study on multilingual acoustic modeling for large vocabulary ASR
- H. Lin, L. Deng, D. Yu, Y. Gong, A. Acero, and C.H. Lee, "A study on multilingual acoustic modeling for large vocabulary ASR," in IEEE ICASSP, 2009
- (2009) IEEE ICASSP
- Lin, H.¹ Deng, L.² Yu, D.³ Gong, Y.⁴ Acero, A.⁵ Lee, C.H.⁶

2
- 84878401202
- Comparing different acoustic modeling techniques for multilingual boosting
- D. Imseng, J. Dines, P. Motlicek, P.N. Garner, and H. Bourlard, "Comparing different acoustic modeling techniques for multilingual boosting," in ISCA Interspeech, 2012
- (2012) ISCA Interspeech
- Imseng, D.¹ Dines, J.² Motlicek, P.³ Garner, P.N.⁴ Bourlard, H.⁵

3
- 78049394188
- Multilingual acoustic modeling for speech recognition based on subspace Gaussian mixture models
- L. Burget, P. Schwarz, M. Agarwal, P. Akyazi, K. Feng, A. Ghoshal, O. Glembek, N. Goel, M. Karafiat, D. Povey, et al., "Multilingual acoustic modeling for speech recognition based on subspace Gaussian mixture models," in IEEE ICASSP, 2010
- (2010) IEEE ICASSP
- Burget, L.¹ Schwarz, P.² Agarwal, M.³ Akyazi, P.⁴ Feng, K.⁵ Ghoshal, A.⁶ Glembek, O.⁷ Goel, N.⁸ Karafiat, M.⁹ Povey, D.¹⁰

4
- 84890456495
- Regularized subspace Gaussian mixture models for cross-lingual speech recognition
- L. Lu, A. Ghoshal, and S. Renals, "Regularized subspace Gaussian mixture models for cross-lingual speech recognition," in IEEE ASRU, 2011
- (2011) IEEE ASRU
- Lu, L.¹ Ghoshal, A.² Renals, S.³

5
- 84865804486
- State-level data borrowing for low-resource speech recognition based on subspace GMMs
- Y. Qian, D. Povey, and J. Liu, "State-level data borrowing for low-resource speech recognition based on subspace GMMs," in ISCA Interspeech, 2011
- (2011) ISCA Interspeech
- Qian, Y.¹ Povey, D.² Liu, J.³

6
- 84890500781
- On use of task independent training data in tandem feature extraction
- S. Sivadas and H. Hermansky, "On use of task independent training data in tandem feature extraction," in IEEE ICASSP, 2004
- (2004) IEEE ICASSP
- Sivadas, S.¹ Hermansky, H.²

7
- 84890483790
- Cross-domain and cross-language portability of acoustic features estimated by multilayer perceptrons
- A. Stolcke, F. Grezl, M.Y. Hwang, X. Lei, N. Morgan, and D. Vergyri, "Cross-domain and cross-language portability of acoustic features estimated by multilayer perceptrons," in IEEE ICASSP, 2006
- (2006) IEEE ICASSP
- Stolcke, A.¹ Grezl, F.² Hwang, M.Y.³ Lei, X.⁴ Morgan, N.⁵ Vergyri, D.⁶

8
- 79959819891
- Cross-lingual and multistream posterior features for low resource LVCSR systems
- S. Thomas, S. Ganapathy, and H. Hermansky, "Cross-lingual and multistream posterior features for low resource LVCSR systems," in ISCA Interspeech, 2010
- (2010) ISCA Interspeech
- Thomas, S.¹ Ganapathy, S.² Hermansky, H.³

9
- 84878582419
- Cross-lingual and ensemble MLPs-Strategies for low-resource speech recognition
- Y. Qian and J. Liu, "Cross-lingual and ensemble MLPs-Strategies for low-resource speech recognition," in ISCA Interspeech, 2012
- (2012) ISCA Interspeech
- Qian, Y.¹ Liu, J.²

10
- 84890458274
- Initialization schemes for multilayer perceptron training and their impact on ASR performance using multilingual data
- N. Thang, B. Wojtek, F. Metze, and T. Schultz, "Initialization schemes for multilayer perceptron training and their impact on ASR performance using multilingual data," in ISCA Interspeech, 2012
- (2012) ISCA Interspeech
- Thang, N.¹ Wojtek, B.² Metze, F.³ Schultz, T.⁴

11
- 84890513744
- Multilingual MLP features for low-resource LVCSR systems
- S. Thomas, S. Ganapathy, and H. Hermansky, "Multilingual MLP features for low-resource LVCSR systems," in IEEE ICASSP, 2012
- (2012) IEEE ICASSP
- Thomas, S.¹ Ganapathy, S.² Hermansky, H.³

12
- 84878392008
- Data-driven posterior features for low resource speech recognition applications
- S. Thomas, S. Ganapathy, A. Jansen, and H. Hermansky, "Data-driven posterior features for low resource speech recognition applications," in ISCA Interspeech, 2012
- (2012) ISCA Interspeech
- Thomas, S.¹ Ganapathy, S.² Jansen, A.³ Hermansky, H.⁴

13
- 84890453097
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in IEEE ASRU, 2011
- (2011) IEEE ASRU
- Seide, F.¹ Li, G.² Chen, X.³ Yu, D.⁴

14
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition," in IEEE Signal Processing Magazine, 2012
- (2012) IEEE Signal Processing Magazine
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰ Kingsbury, B.¹¹

15
- 80055055551
- Why does unsupervised pre-training help deep learning?
- D. Erhan, Y. Bengio, A. Courville, P.A. Manzagol, P. Vincent, and S. Bengio, "Why does unsupervised pre-training help deep learning?," JMLR, 2010
- (2010) JMLR
- Erhan, D.¹ Bengio, Y.² Courville, A.³ Manzagol, P.A.⁴ Vincent, P.⁵ Bengio, S.⁶

16
- 84055163920
- Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition
- D. Yu, L. Deng, and G. Dahl, "Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition," in NIPS Workshop, 2010
- (2010) NIPS Workshop
- Yu, D.¹ Deng, L.² Dahl, G.³

17
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- G.E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE TASLP, 2012
- (2012) IEEE TASLP
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

18
- 84865785753
- Improved bottleneck features using pretrained deep neural networks
- D. Yu and M.L. Seltzer, "Improved bottleneck features using pretrained deep neural networks," ISCA Interspeech, 2011
- (2011) ISCA Interspeech
- Yu, D.¹ Seltzer, M.L.²

19
- 84890515212
- Autoencoder bottleneck features using deep belief networks
- T.N. Sainath, B. Kingsbury, and B. Ramabhadran, "Autoencoder bottleneck features using deep belief networks," in IEEE ICASSP, 2012
- (2012) IEEE ICASSP
- Sainath, T.N.¹ Kingsbury, B.² Ramabhadran, B.³

20
- 84890455972
- Making deep belief networks effective for large vocabulary continuous speech recognition
- T.N. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak, and A.R. Mohamed, "Making deep belief networks effective for large vocabulary continuous speech recognition," in IEEE ASRU, 2011
- (2011) IEEE ASRU
- Sainath, T.N.¹ Kingsbury, B.² Ramabhadran, B.³ Fousek, P.⁴ Novak, P.⁵ Mohamed, A.R.⁶

21
- 33745805403
- A fast learning algorithm for deep belief nets
- G.E. Hinton, S. Osindero, and Y.W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, 2006
- (2006) Neural Computation
- Hinton, G.E.¹ Osindero, S.² Teh, Y.W.³

22
- 84864073449
- Greedy layer-wise training of deep networks
- Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle, "Greedy layer-wise training of deep networks," Advances in Neural Information Processing Systems, 2007
- (2007) Advances in Neural Information Processing Systems
- Bengio, Y.¹ Lamblin, P.² Popovici, D.³ Larochelle, H.⁴

23
- 34547548235
- Probabilistic and bottle-neck features for LVCSR of meetings
- F. Grezl, M. Karafiat, S. Kontar, and J. Cernocky, "Probabilistic and bottle-neck features for LVCSR of meetings," in IEEE ICASSP, 2007
- (2007) IEEE ICASSP
- Grezl, F.¹ Karafiat, M.² Kontar, S.³ Cernocky, J.⁴

24
- 84890500819
- Learning long-term temporal features in LVCSR using neural networks
- B. Chen, Q. Zhu, and N. Morgan, "Learning long-term temporal features in LVCSR using neural networks," in ISCA ICSLP, 2004
- (2004) ISCA ICSLP
- Chen, B.¹ Zhu, Q.² Morgan, N.³

25
- 34547516611
- Callhome American English speech
- A. Canavan, D. Graff, and G. Zipperlen, "Callhome American English speech," LDC, 1997
- (1997) LDC
- Canavan, A.¹ Graff, D.² Zipperlen, G.³

26
- 34547516611
- Callhome German speech
- A. Canavan, D. Graff, and G. Zipperlen, "Callhome German speech," LDC, 1997
- (1997) LDC
- Canavan, A.¹ Graff, D.² Zipperlen, G.³

27
- 84876052931
- Callhome Spanish speech
- A. Canavan and G. Zipperlen, "Callhome Spanish speech," LDC, 1997
- (1997) LDC
- Canavan, A.¹ Zipperlen, G.²

28
- 0002035663
- Switchboard: Telephone speech corpus for research and development
- J. Godfrey, E. Holliman, and J. McDaniel, "Switchboard: Telephone speech corpus for research and development," in IEEE ICASSP, 1992
- (1992) IEEE ICASSP
- Godfrey, J.¹ Holliman, E.² McDaniel, J.³

29
- 78049505017
- English Gigaword
- D. Graff, J. Kong, K. Chen, and K. Maeda, "English Gigaword," LDC, 2003
- (2003) LDC
- Graff, D.¹ Kong, J.² Chen, K.³ Maeda, K.⁴

30
- 0025041264
- Perceptual linear predictive (PLP) analysis of speech
- H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," The Journal of the Acoustical Society of America, 1990
- (1990) The Journal of the Acoustical Society of America
- Hermansky, H.¹

31
- 84890474252
- Phoneme recognition using spectral envelope and modulation frequency features
- S. Thomas, S. Ganapathy, and H. Hermansky, "Phoneme recognition using spectral envelope and modulation frequency features," in IEEE ICASSP, 2009
- (2009) IEEE ICASSP
- Thomas, S.¹ Ganapathy, S.² Hermansky, H.³

32
- 79959842582
- Using untranscribed training data to improve performance
- G. Zavaliagkos, M. Siu, T. Colthurst, and J. Billa, "Using untranscribed training data to improve performance," in ISCA ICSLP, 1998
- (1998) ISCA ICSLP
- Zavaliagkos, G.¹ Siu, M.² Colthurst, T.³ Billa, J.⁴

33
- 85135261720
- Unsupervised training of a speech recognizer: Recent experiments
- T. Kemp and A. Waibel, "Unsupervised training of a speech recognizer: Recent experiments," in ISCA Eurospeech, 1999
- (1999) ISCA Eurospeech
- Kemp, T.¹ Waibel, A.²

34
- 84890488335
- Unsupervised acoustic model training
- L. Lamel, J.L. Gauvain, and G. Adda, "Unsupervised acoustic model training," in IEEE ICASSP, 2002
- (2002) IEEE ICASSP
- Lamel, L.¹ Gauvain, J.L.² Adda, G.³

35
- 84890521566
- Unsupervised training on large amounts of broadcast news data
- J. Ma, S. Matsoukas, O. Kimball, and R. Schwartz, "Unsupervised training on large amounts of broadcast news data," in IEEE ICASSP, 2006
- (2006) IEEE ICASSP
- Ma, J.¹ Matsoukas, S.² Kimball, O.³ Schwartz, R.⁴

36
- 70450189191
- Analysis of low-resource acoustic model self-training
- S. Novotney and R. Schwartz, "Analysis of low-resource acoustic model self-training," in ISCA Interspeech, 2009
- (2009) ISCA Interspeech
- Novotney, S.¹ Schwartz, R.²

37
- 85135146711
- Estimating confidence using word lattices
- T. Kemp and T. Schaaf, "Estimating confidence using word lattices," in ISCA Eurospeech, 1997
- (1997) ISCA Eurospeech
- Kemp, T.¹ Schaaf, T.²

38
- 51449102200
- Combination of strongly and weakly constrained recognizers for reliable detection of OOVs
- L. Burget, P. Schwarz, P. Matejka, M. Hannemann, A. Rastrow, C. White, S. Khudanpur, H. Hermansky, and J. Cernocky, "Combination of strongly and weakly constrained recognizers for reliable detection of OOVs," in IEEE ICASSP, 2008
- (2008) IEEE ICASSP
- Burget, L.¹ Schwarz, P.² Matejka, P.³ Hannemann, M.⁴ Rastrow, A.⁵ White, C.⁶ Khudanpur, S.⁷ Hermansky, H.⁸ Cernocky, J.⁹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.