SCOPUS Information Search Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 21, Issue 3, 2013, Pages 498-507

Building acoustic model ensembles by data sampling with enhanced trainings and features

University of Missouri (United States)

Author keywords

cross validation data sampling; discriminative training; Ensemble acoustic model; MLP feature; speaker clustering data sampling

Indexed keywords

ACOUSTIC MODEL; CROSS VALIDATION; DISCRIMINATIVE TRAINING; MLP FEATURE; SPEAKER CLUSTERING;

HIDDEN MARKOV MODELS; SPEECH RECOGNITION;

ARCHITECTURAL ACOUSTICS;

EID: 84872174281 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2012.2227729 Document Type: Article

Times cited: (14)

References (32)

1
- 80053403826
- Ensemble methods in machine learning
- T. G. Dietterich, "Ensemble methods in machine learning," in Proc. MCS, 2000, pp. 1-15.
- (2000) Proc. MCS , pp. 1-15
- Dietterich, T.G.¹

2
- 0035478854
- Random forests
- DOI 10.1023/A:1010933404324
- L. Breiman, "Random forests," Mach. Learn., vol. 45, pp. 5-32, 2001. (Pubitemid 32933532)
- (2001) Machine Learning , vol.45 , Issue.1 , pp. 5-32
- Breiman, L.¹

3
- 0030638031
- A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER)
- J. G. Fiscus, "A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER)," in Proc. IEEE ASRU Workshop, 1997, pp. 347-352.
- (1997) Proc. IEEE ASRU Workshop , pp. 347-352
- Fiscus, J.G.¹

4
- 85009080958
- Spontaneous speech recognition using a massively parallel decoder
- T. Shinozaki and S. Furui, "Spontaneous speech recognition using a massively parallel decoder," in Proc. ICSLP, 2004, pp. 1705-1708.
- (2004) Proc. ICSLP , pp. 1705-1708
- Shinozaki, T.¹ Furui, S.²

5
- 33646818291
- Constructing ensembles of ASR systems using randomized decision trees
- O. Siohan, B. Ramabhadran, and B. Kingsbury, "Constructing ensembles of ASR systems using randomized decision trees," in Proc. ICASSP, 2005, pp. I-197-I-200.
- (2005) Proc. ICASSP
- Siohan, O.¹ Ramabhadran, B.² Kingsbury, B.³

6
- 33745214897
- Investigations on ensemble based semi-supervised acoustic model training
- 9th European Conference on Speech Communication and Technology, Eurospeech Interspeech
- R. Zhang et al., "Investigations on ensemble based semi-supervised acoustic model training," in Proc. EuroSpeech, 2005, pp. 1677-1680. (Pubitemid 43908402)
- (2005) 9th European Conference on Speech Communication and Technology , pp. 1677-1680
- Zhang, R.¹ Bawab, Z.A.² Chan, A.³ Chotimongkol, A.⁴ Huggins-Daines, D.⁵ Rudnicky, A.I.⁶

7
- 0032677422
- Recent experiments in large vocabulary conversational speech recognition
- J. Billa, T. Colhurst, A. El-Jaroudi, R. Iyer, K. Ma, S. Matsoukas, C. Quillen, F. Richardson, M. Siu, G. Zavaliagkos, and H. Gish, "Recent experiments in large vocabulary conversational speech recognition," in Proc. ICASSP, 1999, vol. 1, pp. 41-44.
- (1999) Proc. ICASSP , vol.1 , pp. 41-44
- Billa, J.¹ Colhurst, T.² El-Jaroudi, A.³ Iyer, R.⁴ Ma, K.⁵ Matsoukas, S.⁶ Quillen, C.⁷ Richardson, F.⁸ Siu, M.⁹ Zavaliagkos, G.¹⁰ Gish, H.¹¹

8
- 79959833868
- Building multiple complementary systems using directed decision tree
- C. Bresline and M. J. F. Gales, "Building multiple complementary systems using directed decision tree," in Proc. Interspeech, 2007, pp. 1441-1444.
- (2007) Proc. Interspeech , pp. 1441-1444
- Bresline, C.¹ Gales, M.J.F.²

9
- 33645989784
- Boosting HMM acoustic models in large vocabulary speech recognition
- C. Meyer and H. Schramm, "Boosting HMM acoustic models in large vocabulary speech recognition," Speech Commun., vol. 48, pp. 532-548, 2006.
- (2006) Speech Commun , vol.48 , pp. 532-548
- Meyer, C.¹ Schramm, H.²

10
- 51549086717
- Random forests of phonetic decision trees for acoustic modeling in conversational speech recognition
- J. Xue and Y. Zhao, "Random forests of phonetic decision trees for acoustic modeling in conversational speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 3, pp. 519-528, Mar. 2008.
- (2008) IEEE Trans. Audio, Speech, Lang. Process , vol.16 , Issue.3 , pp. 519-528
- Xue, J.¹ Zhao, Y.²

11
- 70450191324
- A study of bootstrapping with multiple acoustic features for improved automatic speech recognition
- X. Cui, J. Xue, B. Xiang, and B. Zhou, "A study of bootstrapping with multiple acoustic features for improved automatic speech recognition," in Proc. Interspeech, 2009, pp. 240-243.
- (2009) Proc. Interspeech , pp. 240-243
- Cui, X.¹ Xue, J.² Xiang, B.³ Zhou, B.⁴

12
- 0034227757
- Cluster adaptive training of hidden Markov models
- M. J. F. Gales, "Cluster adaptive training of hidden Markov models," IEEE Trans. Speech Audio Process., vol. 8, no. 4, pp. 417-428, Jul. 2000.
- (2000) IEEE Trans. Speech Audio Process , vol.8 , Issue.4 , pp. 417-428
- Gales, M.J.F.¹

13
- 0026982122
- Discriminative learning for minimum error classification
- DOI 10.1109/78.175747
- B.-H. Juang and S. Katagiri, "Discriminative learning for minimum error classification," IEEE Trans. Signal Process., vol. 40, no. 12, pp. 3043-3054, Dec. 1992. (Pubitemid 23603018)
- (1992) IEEE Transactions on Signal Processing , vol.40 , Issue.12 , pp. 3043-3054
- Juang Biing-Hwang¹ Katagiri Shigeru²

14
- 0022890536
- Maximum mutual information estimation of hidden Markov model parameters for speech recognition
- L. R. Bahl, P. F. Brown, P.V.D. Souza, and R. L. Mercer, "Maximum mutual information estimation of hidden Markov model parameters for speech recognition," in Proc. ICASSP, 1986, pp. 49-52.
- (1986) Proc. ICASSP , pp. 49-52
- Bahl, L.R.¹ Brown, P.F.² Souza, P.V.D.³ Mercer, R.L.⁴

15
- 0036296863
- Minimum phone error and I-smoothing for improved discriminative training
- D. Povey and P. C. Woodland, "Minimum phone error and I-smoothing for improved discriminative training," in Proc. ICASSP, 2002, vol. 1, pp. 105-108.
- (2002) Proc. ICASSP , vol.1 , pp. 105-108
- Povey, D.¹ Woodland, P.C.²

16
- 0033709098
- Tandem connectionist feature stream extraction for conventional HMM systems
- H. Hermansky, D. P. W. Ellis, and S. Sharma, "Tandem connectionist feature stream extraction for conventional HMM systems," in Proc. ICASSP, 2000, vol. III, pp. 1635-1638.
- (2000) Proc. ICASSP , vol.3 , pp. 1635-1638
- Hermansky, H.¹ Ellis, D.P.W.² Sharma, S.³

17
- 33745528628
- Using MLP features in SRI's conversational speech recognition system
- Q. Zhu, A. Stolcke, B. Y. Chen, and N. Morgan, "Using MLP features in SRI's conversational speech recognition system," in Proc. ICSLP, 2005, vol. 2, pp. 921-924.
- (2005) Proc. ICSLP , vol.2 , pp. 921-924
- Zhu, Q.¹ Stolcke, A.² Chen, B.Y.³ Morgan, N.⁴

18
- 0141629799
- Improved recognition by combining different features and different systems
- D. P. W. Ellis, "Improved recognition by combining different features and different systems," in Proc. AVIOS-2000, 2000.
- (2000) Proc. AVIOS-2000
- Ellis, D.P.W.¹

19
- 35549000218
- Cross-validation and aggregated EM training for robust parameter estimation
- DOI 10.1016/j.csl.2007.07.005, PII S0885230807000472
- T. Shinozaki and M. Ostendorf, "Cross-validation and aggregated EM training for robust parameter estimation," Comput. Speech Lang., vol. 22, no. 2, pp. 185-195, 2008. (Pubitemid 350016715)
- (2008) Computer Speech and Language , vol.22 , Issue.2 , pp. 185-195
- Shinozaki, T.¹ Ostendorf, M.²

20
- 0002144369
- Tree-based state tying for high accuracy modeling
- S. J. Young, J. J. Odell, and P. C. Woodland, "Tree-based state tying for high accuracy modeling," in Proc. ARPA Human Lang. Technol. Workshop, 1994, pp. 307-312.
- (1994) Proc. ARPA Human Lang. Technol. Workshop , pp. 307-312
- Young, S.J.¹ Odell, J.J.² Woodland, P.C.³

21
- 78049378080
- Cambridge, U.K. [Online]
- "HTK Toolkit," . Cambridge, U.K. [Online]. Available: http://htk. eng.cam.ac
- HTK Toolkit

22
- 0001217510
- Clustering by means of medoids
- L. Kaufman and P. J. Rousseeuw, Y. Dodge, Ed., "Clustering by means of medoids," in Statistical Data Analysis Based on the L1 Norm, 1987, pp. 405-416.
- (1987) Statistical Data Analysis Based on the L1 Norm , pp. 405-416
- Kaufman, L.¹ Rousseeuw, P.J.² Dodge, Y.³

23
- 34547516258
- Approximating the Kullback Leibler divergence between Gaussian mixture models
- J. Hershey and P. Olsen, "Approximating the Kullback Leibler divergence between Gaussian mixture models," in Proc. ICASSP, 2007, pp. 317-320.
- (2007) Proc. ICASSP , pp. 317-320
- Hershey, J.¹ Olsen, P.²

24
- 37249033137
- A new HMM-based ensemble generation method for numeral recognition
- Multiple Classifier Systems - 7th International Workshop, MCS 2007, Proceedings
- A. Ko, R. Sabourin, and A. S. Britto, Jr, "A new HMM-based ensemble generation method for numeral recognition," in Proc. MCS Workshop, 2007, pp. 52-61. (Pubitemid 350270602)
- (2007) Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) , vol.4472 , pp. 52-61
- Ko, A.H.-R.¹ Sabourin, R.² De Souza Britto Jr., A.³

25
- 78049379606
- Data sampling ensemble acoustic modeling in speaker independent speech recognition
- X. Chen and Y. Zhao, "Data sampling ensemble acoustic modeling in speaker independent speech recognition," in Proc. ICASSP, 2010, pp. 5130-5133.
- (2010) Proc. ICASSP , pp. 5130-5133
- Chen, X.¹ Zhao, Y.²

26
- 79959855899
- Integrating MLP features and discriminative training in data sampling based ensemble acoustic modeling
- X. Chen and Y. Zhao, "Integrating MLP features and discriminative training in data sampling based ensemble acoustic modeling," in Proc. Interspeech, 2010, pp. 1349-1352.
- (2010) Proc. Interspeech , pp. 1349-1352
- Chen, X.¹ Zhao, Y.²

27
- 0024768209
- Speaker-independent phone recognition using hidden Markov models
- K.-F. Lee and H.-W. Hon, "Speaker-independent phone recognition using hidden Markov models," IEEE Trans. Audio, Speech, Signal Process., vol. 37, no. 11, pp. 1641-1648, Nov. 1989.
- (1989) IEEE Trans. Audio, Speech, Signal Process , vol.37 , Issue.11 , pp. 1641-1648
- Lee, K.-F.¹ Hon, H.-W.²

28
- 33947715150
- An automatic captioning system for telemedicine
- Y. Zhao, X. Zhang, R.-S. Hu, J. Xue, X. Li, L. Che, R. Hu, and L. Schopp, "An automatic captioning system for telemedicine," in Proc. ICASSP, 2006, pp. I-957-I-960.
- (2006) Proc. ICASSP
- Zhao, Y.¹ Zhang, X.² Hu, R.-S.³ Xue, J.⁴ Li, X.⁵ Che, L.⁶ Hu, R.⁷ Schopp, L.⁸

29
- 0344509344
- Phoneme probability estimation with dynamic sparsely connected artificial networks
- N. Strom, "Phoneme probability estimation with dynamic sparsely connected artificial networks," Free Speech J., no. 5, 1997.
- (1997) Free Speech J , Issue.5
- Strom, N.¹

30
- 70349218140
- Data sampling based ensemble acoustic modeling
- X. Chen and Y. Zhao, "Data sampling based ensemble acoustic modeling," in Proc. ICASSP, 2009, pp. 3805-3808.
- (2009) Proc. ICASSP , pp. 3805-3808
- Chen, X.¹ Zhao, Y.²

31
- 77949358999
- An exploration of large vocabulary tools for small vocabulary phonetic recognition
- T. N. Sainath, B. Ramabhadran, and M. Picheny, "An exploration of large vocabulary tools for small vocabulary phonetic recognition," in Proc. IEEE ASRU Workshop, 2009.
- (2009) Proc IEEE ASRU Workshop
- Sainath, T.N.¹ Ramabhadran, B.² Picheny, M.³

32
- 79952434203
- Deep belief networks for phone recognition
- A. R. Mohamed, G. Dahl, and G. E. Hinton, "Deep belief networks for phone recognition," in NIPS 22 Workshop Deep Learn. Speech Recognit., 2009.
- (2009) NIPS 22 Workshop Deep Learn. Speech Recognit
- Mohamed, A.R.¹ Dahl, G.² Hinton, G.E.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.