SCOPUS information retrieval platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume 2015-January, 2015, Pages 2640-2644

Segmental conditional random fields with deep neural networks as acoustic models for first-pass word recognition

(2) He, Yanzhang (a); Fosler-Lussier, Eric (a)

(a) The Ohio State University (United States)

Author keywords

First pass decoder; Segmental conditional random fields; Word recognition

Indexed keywords

ACOUSTIC FIELDS; DECODING; RANDOM PROCESSES; SPEECH COMMUNICATION; TELEPHONE SETS; VOCABULARY CONTROL;

DEEP NEURAL NETWORKS; FIRST-PASS DECODER; LATTICE GENERATIONS; LATTICE RESCORING; PASSWORD RECOGNITION; SEGMENTAL CONDITIONAL RANDOM FIELDS; UNIFIED FRAMEWORK; WORD RECOGNITION;

SPEECH RECOGNITION;

EID: 84959175560 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited: 14

References (31)

1
- 0030245363
- From HMM's to segment models: A unified view of stochastic modeling for speech recognition
- M. Ostendorf, V. V. Digalakis, and O. A. Kimball, "From HMM's to segment models: A unified view of stochastic modeling for speech recognition," Speech and Audio Processing, IEEE Transactions on, vol. 4, no. 5, pp. 360-378, 1996.
- (1996) Speech and Audio Processing, IEEE Transactions on , vol.4 , Issue.5 , pp. 360-378
- Ostendorf, M.¹ Digalakis, V.V.² Kimball, O.A.³

2
- 0038359548
- A probabilistic framework for segment-based speech recognition
- J. R. Glass, "A probabilistic framework for segment-based speech recognition," Computer Speech & Language, vol. 17, no. 2, pp. 137-152, 2003.
- (2003) Computer Speech & Language , vol.17 , Issue.2 , pp. 137-152
- Glass, J.R.¹

3
- 34547507549
- Morgan & Claypool, December
- L. Deng, Dynamic Speech Models: Theory, Algorithm, and Application. Morgan & Claypool, December 2006.
- (2006) Dynamic Speech Models: Theory, Algorithm, and Application
- Deng, L.¹

4
- 45549086638
- Template-based continuous speech recognition
- M. De Wachter, M. Matton, K. Demuynck, P. Wambacq, R. Cools, and D. Van Compernolle, "Template-based continuous speech recognition," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 15, no. 4, pp. 1377-1390, 2007.
- (2007) Audio, Speech, and Language Processing, IEEE Transactions on , vol.15 , Issue.4 , pp. 1377-1390
- De Wachter, M.¹ Matton, M.² Demuynck, K.³ Wambacq, P.⁴ Cools, R.⁵ Van Compernolle, D.⁶

5
- 33947702666
- Augmented statistical models for speech recognition
- IEEE
- M. Layton and M. Gales, "Augmented statistical models for speech recognition," in Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on, vol. 1. IEEE, 2006, pp. I-I.
- (2006) Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on , vol.1 , pp. I-I
- Layton, M.¹ Gales, M.²

6
- 77949370075
- A segmental CRF approach to large vocabulary continuous speech recognition
- Merano, Italy, Dec
- G. Zweig and P. Nguyen, "A segmental CRF approach to large vocabulary continuous speech recognition," in Proceedings of the IEEE Workshop on Automatic Speech Recognition Understanding (ASRU'09), Merano, Italy, Dec. 2009, pp. 152-157.
- (2009) Proceedings of the IEEE Workshop on Automatic Speech Recognition Understanding (ASRU'09) , pp. 152-157
- Zweig, G.¹ Nguyen, P.²

7
- 77957744761
- Structured log linear models for noise robust speech recognition
- S. Zhang, A. Ragni, and M. Gales, "Structured log linear models for noise robust speech recognition," Signal Processing Letters, IEEE, vol. 17, no. 11, pp. 945-948, 2010.
- (2010) Signal Processing Letters, IEEE , vol.17 , Issue.11 , pp. 945-948
- Zhang, S.¹ Ragni, A.² Gales, M.³

8
- 80051659716
- Speech recognition with segmental conditional random fields: A summary of the JHU CLSP 2010 summer workshop
- Prague, Czech Republic, May
- G. Zweig, P. Nguyen, D. Van Compernolle, K. Demuynck, L. Atlas, P. Clark, G. Sell, M. Wang, F. Sha, H. Hermansky, D. Karakos, A. Jansen, S. Thomas, G. S. V. S. Sivaram, S. Bowman, and J. Kao, "Speech recognition with segmental conditional random fields: A summary of the JHU CLSP 2010 summer workshop," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'11), Prague, Czech Republic, May 2011, pp. 5044-5047.
- (2011) Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'11) , pp. 5044-5047
- Zweig, G.¹ Nguyen, P.² Van Compernolle, D.³ Demuynck, K.⁴ Atlas, L.⁵ Clark, P.⁶ Sell, G.⁷ Wang, M.⁸ Sha, F.⁹ Hermansky, H.¹⁰ Karakos, D.¹¹ Jansen, A.¹² Thomas, S.¹³ Sivaram, G.S.V.S.¹⁴ Bowman, S.¹⁵ Kao, J.¹⁶

9
- 84867598637
- Classification and recognition with direct segment models
- Kyoto, Japan, Mar
- G. Zweig, "Classification and recognition with direct segment models," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'12), Kyoto, Japan, Mar. 2012, pp. 4161-4164.
- (2012) Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'12) , pp. 4161-4164
- Zweig, G.¹

10
- 84878565391
- Efficient segmental conditional random fields for phone recognition
- Portland, OR, USA, Sep
- Y. He and E. Fosler-Lussier, "Efficient segmental conditional random fields for phone recognition," in Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech'12), Portland, OR, USA, Sep. 2012.
- (2012) Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech'12)
- He, Y.¹ Fosler-Lussier, E.²

11
- 84906282118
- Deep segmental neural networks for speech recognition
- O. Abdel-Hamid, L. Deng, D. Yu, and H. Jiang, "Deep segmental neural networks for speech recognition," in INTERSPEECH, 2013.
- (2013) INTERSPEECH
- Abdel-Hamid, O.¹ Deng, L.² Yu, D.³ Jiang, H.⁴

12
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82-97, 2012.
- (2012) Signal Processing Magazine, IEEE , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.-R.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰

13
- 84890491198
- Recent advances in deep learning for speech research at Microsoft
- L. Deng, J. Li, J.-T. Huang, K. Yao, D. Yu, F. Seide, M. Seltzer, G. Zweig, X. He, J. Williams et al., "Recent advances in deep learning for speech research at Microsoft," ICASSP 2013, 2013.
- (2013) ICASSP 2013
- Deng, L.¹ Li, J.² Huang, J.-T.³ Yao, K.⁴ Yu, D.⁵ Seide, F.⁶ Seltzer, M.⁷ Zweig, G.⁸ He, X.⁹ Williams, J.¹⁰

14
- 25444533246
- J. Lafferty, A. McCallum, and F. Pereira, "Conditional random fields: Probabilistic models for segmenting and labeling sequence data," 2001.
- (2001) Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
- Lafferty, J.¹ McCallum, A.² Pereira, F.³

15
- 84876691724
- Conditional random fields in speech, audio, and language processing
- E. Fosler-Lussier, Y. He, P. Jyothi, and R. Prabhavalkar, "Conditional random fields in speech, audio, and language processing," Proceedings of the IEEE, vol. 101, no. 5, pp. 1054-1075, 2013.
- (2013) Proceedings of the IEEE , vol.101 , Issue.5 , pp. 1054-1075
- Fosler-Lussier, E.¹ He, Y.² Jyothi, P.³ Prabhavalkar, R.⁴

16
- 34047192804
- Semi-Markov conditional random fields for information extraction
- Vancouver, British Columbia, Canada, Dec
- S. Sarawagi and W. W. Cohen, "Semi-Markov conditional random fields for information extraction," in Advances in Neural Information Processing Systems (NIPS'04), Vancouver, British Columbia, Canada, Dec. 2004, pp. 1185-1192.
- (2004) Advances in Neural Information Processing Systems (NIPS'04) , pp. 1185-1192
- Sarawagi, S.¹ Cohen, W.W.²

17
- 0003573244
- Kluwer Academic Publishers
- H. Bourlard and N. Morgan, Connectionist Speech Recognition: A Hybrid Approach. Kluwer Academic Publishers, 1993.
- (1993) Connectionist Speech Recognition: A Hybrid Approach
- Bourlard, H.¹ Morgan, N.²

18
- 85075929453
- Speech recognition with weighted finite-state transducers
- Springer
- M. Mohri, F. Pereira, and M. Riley, "Speech recognition with weighted finite-state transducers," in Springer Handbook of Speech Processing. Springer, 2008, pp. 559-584.
- (2008) Springer Handbook of Speech Processing , pp. 559-584
- Mohri, M.¹ Pereira, F.² Riley, M.³

19
- 84876694602
- Ph. D. dissertation, The Ohio State University
- J. J. Morris, "A study on the use of conditional random fields for automatic speech recognition," Ph.D. dissertation, The Ohio State University, 2010.
- (2010) A Study on the Use of Conditional Random Fields for Automatic Speech Recognition
- Morris, J.J.¹

20
- 84858953642
- The Kaldi speech recognition toolkit
- D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz et al., "The Kaldi speech recognition toolkit," in Proc. of ASRU, 2011, pp. 1-4.
- (2011) Proc. of ASRU , pp. 1-4
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlicek, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰

21
- 70349213445
- Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
- B. Kingsbury, "Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling," in Proc. of ICASSP, 2009, pp. 3761-3764.
- (2009) Proc. of ICASSP , pp. 3761-3764
- Kingsbury, B.¹

22
- 84906274730
- Sequence-discriminative training of deep neural networks
- K. Vesely, A. Ghoshal, L. Burget, and D. Povey, "Sequence-discriminative training of deep neural networks," in INTERSPEECH, 2013.
- (2013) INTERSPEECH
- Vesely, K.¹ Ghoshal, A.² Burget, L.³ Povey, D.⁴

23
- 80052250414
- Adaptive subgradient methods for online learning and stochastic optimization
- J. Duchi, E. Hazan, and Y. Singer, "Adaptive subgradient methods for online learning and stochastic optimization," J. Mach. Learn. Res., vol. 12, pp. 2121-2159, 2011.
- (2011) J. Mach. Learn. Res. , vol.12 , pp. 2121-2159
- Duchi, J.¹ Hazan, E.² Singer, Y.³

24
- 85032750905
- Discriminative learning in sequential pattern recognition
- Sep
- X. He, L. Deng, and W. Chou, "Discriminative learning in sequential pattern recognition," IEEE Signal Processing Magazine, vol. 25, no. 5, pp. 14-36, Sep. 2008.
- (2008) IEEE Signal Processing Magazine , vol.25 , Issue.5 , pp. 14-36
- He, X.¹ Deng, L.² Chou, W.³

25
- 4544265717
- Ph. D. dissertation, University of Cambridge
- D. Povey, "Discriminative training for large vocabulary speech recognition," Ph.D. dissertation, University of Cambridge, 2004.
- (2004) Discriminative Training for Large Vocabulary Speech Recognition
- Povey, D.¹

26
- 78049406405
- Backpropagation training for multilayer conditional random field based phone recognition
- R. Prabhavalkar and E. Fosler-Lussier, "Backpropagation training for multilayer conditional random field based phone recognition," in Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on. IEEE, 2010, pp. 5534-5537.
- (2010) Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference On. IEEE , pp. 5534-5537
- Prabhavalkar, R.¹ Fosler-Lussier, E.²

27
- 84872193462
- Structured SVMs for automatic speech recognition
- S.-X. Zhang and M. J. Gales, "Structured SVMs for automatic speech recognition," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 21, no. 3, pp. 544-555, 2013.
- (2013) Audio, Speech, and Language Processing, IEEE Transactions on , vol.21 , Issue.3 , pp. 544-555
- Zhang, S.-X.¹ Gales, M.J.²

28
- 84910091098
- A comparison of training approaches for discriminative segmental models
- H. Tang, K. Gimpel, and K. Livescu, "A comparison of training approaches for discriminative segmental models," in Proc. Annual Conference of International Speech Communication Association (INTERSPEECH), 2014.
- (2014) Proc. Annual Conference of International Speech Communication Association (INTERSPEECH)
- Tang, H.¹ Gimpel, K.² Livescu, K.³

29
- 0028392167
- An application of recurrent nets to phone probability estimation
- A. J. Robinson, "An application of recurrent nets to phone probability estimation," Neural Networks, IEEE Transactions on, vol. 5, no. 2, pp. 298-305, 1994.
- (1994) Neural Networks, IEEE Transactions on , vol.5 , Issue.2 , pp. 298-305
- Robinson, A.J.¹

30
- 84890543083
- Speech recognition with deep recurrent neural networks
- A. Graves, A.-r. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 6645-6649.
- (2013) Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference On. IEEE , pp. 6645-6649
- Graves, A.¹ Mohamed, A.-R.² Hinton, G.³

31
- 84910046405
- Long short-term memory recurrent neural network architectures for large scale acoustic modeling
- H. Sak, A. Senior, and F. Beaufays, "Long short-term memory recurrent neural network architectures for large scale acoustic modeling," in Proceedings of the Annual Conference of International Speech Communication Association (INTERSPEECH), 2014.
- (2014) Proceedings of the Annual Conference of International Speech Communication Association (INTERSPEECH)
- Sak, H.¹ Senior, A.² Beaufays, F.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.