SCOPUS Information Search Platform

Volume 98, 2017, Pages 1-7

Hierarchical Bayesian combination of plug-in maximum a posteriori decoders in deep neural networks-based speech recognition and speaker adaptation

Authors (3): Huang, Zhen (a); Siniscalchi, Sabato Marco (a, b); Lee, Chin Hui (a)

a GEORGIA INSTITUTE OF TECHNOLOGY (United States)

b KORE UNIVERSITY OF ENNA (Italy)

Author keywords

Automatic speech recognition; Bayesian learning; Deep neural networks; Sequential patterns; System combination

Indexed keywords

BAYESIAN NETWORKS; DECODING; DEEP NEURAL NETWORKS; HIDDEN MARKOV MODELS; HIERARCHICAL SYSTEMS; LOUDSPEAKERS; PATTERN RECOGNITION SYSTEMS;

AUTOMATIC SPEECH RECOGNITION; BAYESIAN LEARNING; CHARACTERISTIC DIFFERENCE; GAUSSIAN MIXTURE MODEL (GMMS); HIERARCHICAL BAYESIAN; MAXIMUM A POSTERIORI DECODERS; SEQUENTIAL PATTERNS; SYSTEM COMBINATION;

SPEECH RECOGNITION;

EID: 85026840738
PISSN: 01678655
EISSN: None
Source Type: Journal
DOI: 10.1016/j.patrec.2017.08.001
Document Type: Article

Times cited: 10

References (63)

1
- 84890452886
- Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code
- Abdel-Hamid, O., Jiang, H., Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code. Proceedings of ICASSP, 2013, 7942–7946.
- (2013) Proceedings of ICASSP , pp. 7942-7946
- Abdel-Hamid, O.¹ Jiang, H.²

2
- 84981748011
- Bayesian Theory
- Wiley
- Bernardo, J.M., Smith, A.F.M., Bayesian Theory. 1994, Wiley.
- (1994)
- Bernardo, J.M.¹ Smith, A.F.M.²

3
- 33846516584
- Pattern Recognition and Machine Learning
- Springer
- Bishop, C.M., Pattern Recognition and Machine Learning. 2006, Springer.
- (2006)
- Bishop, C.M.¹

4
- 17444409624
- A tutorial on the cross-entropy method
- Boer, P.T.D., Kroese, D.P., Mannor, S., Rubinstein, R.Y., A tutorial on the cross-entropy method. Ann. Oper. Res. 134:1 (2005), 19–67.
- (2005) Ann. Oper. Res. , vol.134 , Issue.1 , pp. 19-67
- Boer, P.T.D.¹ Kroese, D.P.² Mannor, S.³ Rubinstein, R.Y.⁴

5
- 0030211964
- Bagging predictors
- Breiman, L., Bagging predictors. Mach. Learn. 24 (1996), 123–140.
- (1996) Mach. Learn. , vol.24 , pp. 123-140
- Breiman, L.¹

6
- 0035478854
- Random forests
- Breiman, L., Random forests. Mach. Learn. 45 (2001), 5–32.
- (2001) Mach. Learn. , vol.45 , pp. 5-32
- Breiman, L.¹

7
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- Dahl, G.E., Yu, D., Deng, L., Acero, A., Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Trans. Audio Speech Lang. Process. 20:1 (2012), 30–42.
- (2012) IEEE Trans. Audio Speech Lang. Process. , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

8
- 0003759417
- Optimal statistical decisions
- John Wiley & Sons
- DeGroot, M.H., Morris, H., Optimal statistical decisions. 82, 2005, John Wiley & Sons.
- (2005) , vol.82
- DeGroot, M.H.¹ Morris, H.²

9
- 84876672166
- Machine learning paradigms for speech recognition: an overview
- Deng, L., Li, X., Machine learning paradigms for speech recognition: an overview. IEEE Trans. Audio Speech Lang. Process. 21 (2013), 1060–1089.
- (2013) IEEE Trans. Audio Speech Lang. Process. , vol.21 , pp. 1060-1089
- Deng, L.¹ Li, X.²

10
- 84910048046
- Ensemble deep learning for speech recognition
- Deng, L., Platt, J.C., Ensemble deep learning for speech recognition. Proceedings of INTERSPEECH, 2014, 1915–1919.
- (2014) Proceedings of INTERSPEECH , pp. 1915-1919
- Deng, L.¹ Platt, J.C.²

11
- 4544253834
- Posterior probability decoding, confidence estimation and system combination
- Evermann, G., Woodland, P.C., Posterior probability decoding, confidence estimation and system combination. Proceedings of Speech Transcription Workshop, 27, 2000.
- (2000) Proceedings of Speech Transcription Workshop , vol.27
- Evermann, G.¹ Woodland, P.C.²

12
- 84910084579
- 2000 NIST evaluation of conversational speech recognition over the telephone: English and Mandarin performance results
- Fiscus, J., Fisher, W.M., Martin, A.F., Przybocki, M.A., Pallett, D.S., 2000 NIST evaluation of conversational speech recognition over the telephone: English and Mandarin performance results. Proceedings of Speech Transcription Workshop, 2000.
- (2000) Proceedings of Speech Transcription Workshop
- Fiscus, J.¹ Fisher, W.M.² Martin, A.F.³ Przybocki, M.A.⁴ Pallett, D.S.⁵

13
- 0030638031
- A post-processing system to yield reduced word error rates: recognizer output voting error reduction (ROVER)
- Fiscus, J.G., A post-processing system to yield reduced word error rates: recognizer output voting error reduction (ROVER). Proceedings of ASRU, 1997, 347–354.
- (1997) Proceedings of ASRU , pp. 347-354
- Fiscus, J.G.¹

14
- 0002978642
- Experiments with a new boosting algorithm
- Freund, Y., Schapire, R.E., Experiments with a new boosting algorithm. Proceedings of International Conference on Machine Learning, 1996, 148–156.
- (1996) Proceedings of International Conference on Machine Learning , pp. 148-156
- Freund, Y.¹ Schapire, R.E.²

15
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- Gauvain, J., Lee, C.-H., Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. IEEE Trans. Speech Audio Process., 2, 1994.
- (1994) IEEE Trans. Speech Audio Process. , vol.2
- Gauvain, J.¹ Lee, C.-H.²

16
- 0001596920
- Large vocabulary continuous speech recognition: advances and applications
- Gauvain, J.-L., Lamel, L., Large vocabulary continuous speech recognition: advances and applications. Proc. IEEE 88:8 (2000), 1181–1200.
- (2000) Proc. IEEE , vol.88 , Issue.8 , pp. 1181-1200
- Gauvain, J.-L.¹ Lamel, L.²

17
- 34548012893
- Linear hidden transformations for adaptation of hybrid ANN/HMM models
- Gemello, R., Mana, F., Scanzio, S., Laface, P., Mori, R.D., Linear hidden transformations for adaptation of hybrid ANN/HMM models. Speech Commun. 49:10 (2007), 827–835.
- (2007) Speech Commun. , vol.49 , Issue.10 , pp. 827-835
- Gemello, R.¹ Mana, F.² Scanzio, S.³ Laface, P.⁴ Mori, R.D.⁵

18
- 84994339086
- Dynamic stream weighting for turbo-decoding-based audiovisual ASR
- Gergen, S., Zeiler, S., Abdelaziz, A.H., Nickel, R., Kolossa, D., Dynamic stream weighting for turbo-decoding-based audiovisual ASR. Proceedings of Interspeech, San Francisco, CA, USA, 2016, 2135–2139.
- (2016) Proceedings of Interspeech, San Francisco, CA, USA , pp. 2135-2139
- Gergen, S.¹ Zeiler, S.² Abdelaziz, A.H.³ Nickel, R.⁴ Kolossa, D.⁵

19
- 84886580175
- Bayesian model combination
- Gatsby Computational Neuroscience Unit University College London
- Ghahramani, Z., Kim, H.-C., Bayesian model combination. Technical Report, 2003, Gatsby Computational Neuroscience Unit, University College London.
- (2003) Technical Report
- Ghahramani, Z.¹ Kim, H.-C.²

20
- 84859053384
- Switchboard-1 release 2
- Godfrey, J.J., Holliman, E., Switchboard-1 release 2. Linguistic Data Consortium, 1997.
- (1997) Linguistic Data Consortium
- Godfrey, J.J.¹ Holliman, E.²

21
- 34547548235
- Probabilistic and bottle-neck features for LVCSR of meetings
- Grézl, F., Karafiát, M., Kontár, S., Cernocky, J., Probabilistic and bottle-neck features for LVCSR of meetings. Proceedings of ICASSP, 4, 2007, IV–757.
- (2007) Proceedings of ICASSP , vol.4 , pp. IV-757
- Grézl, F.¹ Karafiát, M.² Kontár, S.³ Cernocky, J.⁴

22
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups
- Hinton, G., Deng, L., Yu, D., Dahl, G.E., Mohamed, A., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T.N., et al. Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag. 29:6 (2012), 82–97.
- (2012) IEEE Signal Process. Mag. , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰

23
- 33746600649
- Reducing the dimensionality of data with neural networks
- Hinton, G.E., Salakhutdinov, R.R., Reducing the dimensionality of data with neural networks. Science 313:5786 (2006), 504–507.
- (2006) Science , vol.313 , Issue.5786 , pp. 504-507
- Hinton, G.E.¹ Salakhutdinov, R.R.²

24
- 84959169347
- Rapid adaptation for deep neural networks through multi-task learning
- Huang, Z., Li, J., Siniscalchi, S.M., Chen, I.-F., Wu, J., Lee, C.-H., Rapid adaptation for deep neural networks through multi-task learning. Proceedings of Interspeech, 2015.
- (2015) Proceedings of Interspeech
- Huang, Z.¹ Li, J.² Siniscalchi, S.M.³ Chen, I.-F.⁴ Wu, J.⁵ Lee, C.-H.⁶

25
- 84959161626
- Maximum a posteriori adaptation of network parameters in deep models
- Huang, Z., Siniscalchi, S.M., Chen, I.-F., Li, J., Wu, J., Lee, C.-H., Maximum a posteriori adaptation of network parameters in deep models. INTERSPEECH, 2015, 1076–1080.
- (2015) INTERSPEECH , pp. 1076-1080
- Huang, Z.¹ Siniscalchi, S.M.² Chen, I.-F.³ Li, J.⁴ Wu, J.⁵ Lee, C.-H.⁶

26
- 85002900398
- Bayesian unsupervised batch and online speaker adaptation of activation function parameters in deep models for automatic speech recognition
- Huang, Z., Siniscalchi, S.M., Lee, C.-H., Bayesian unsupervised batch and online speaker adaptation of activation function parameters in deep models for automatic speech recognition. IEEE/ACM Trans. Audio Speech Lang. Process., 25, 2017.
- (2017) IEEE/ACM Trans. Audio Speech Lang. Process. , vol.25
- Huang, Z.¹ Siniscalchi, S.M.² Lee, C.-H.³

27
- 85026832564
- Automatic quality estimation for ASR system combination
- Jalalvand, S., Negri, M., Falavigna, D., Matassoni, M., Turchi, M., Automatic quality estimation for ASR system combination. Comput. Speech Lang., 2017.
- (2017) Comput. Speech Lang.
- Jalalvand, S.¹ Negri, M.² Falavigna, D.³ Matassoni, M.⁴ Turchi, M.⁵

28
- 84910107057
- A Dempster-Shafer theory based combination of handwriting recognition systems with multiple rejection strategies
- Kessentini, Y., Burger, T., Paquet, T., A Dempster-Shafer theory based combination of handwriting recognition systems with multiple rejection strategies. Pattern Recognit. Lett. 48 (2015), 534–544.
- (2015) Pattern Recognit. Lett. , vol.48 , pp. 534-544
- Kessentini, Y.¹ Burger, T.² Paquet, T.³

29
- 0032021555
- On combining classifiers
- Kittler, J., Hatef, M., Duin, R.P.W., Matas, J., On combining classifiers. IEEE Trans. Pattern Anal. Mach. Intell. 20:3 (1998), 226–239.
- (1998) IEEE Trans. Pattern Anal. Mach. Intell. , vol.20 , Issue.3 , pp. 226-239
- Kittler, J.¹ Hatef, M.² Duin, R.P.W.³ Matas, J.⁴

30
- 0035509488
- Speech recognition and utterance verification based on a generalized confidence score
- Koo, M.-W., Lee, C.-H., Juang, B.-H., Speech recognition and utterance verification based on a generalized confidence score. IEEE Trans. Speech Audio Process. 9:8 (2001), 821–831.
- (2001) IEEE Trans. Speech Audio Process. , vol.9 , Issue.8 , pp. 821-831
- Koo, M.-W.¹ Lee, C.-H.² Juang, B.-H.³

31
- 0030351374
- On designing pronunciation lexicons for large vocabulary continuous speech recognition
- Lamel, L., Adda, G., On designing pronunciation lexicons for large vocabulary continuous speech recognition. Proceedings of ICSLP, 1, 1996.
- (1996) Proceedings of ICSLP , vol.1
- Lamel, L.¹ Adda, G.²

32
- 0000159105
- On adaptive decision rules and decision parameter adaptation for automatic speech recognition
- Lee, C.-H., Huo, Q., On adaptive decision rules and decision parameter adaptation for automatic speech recognition. Proc. IEEE 88 (2000), 1241–1269.
- (2000) Proc. IEEE , vol.88 , pp. 1241-1269
- Lee, C.-H.¹ Huo, Q.²

33
- 84876694595
- An information-extraction approach to speech processing: analysis, detection, verification, and recognition
- Lee, C.-H., Siniscalchi, S.M., An information-extraction approach to speech processing: analysis, detection, verification, and recognition. Proc. IEEE 101:5 (2013), 1089–1115.
- (2013) Proc. IEEE , vol.101 , Issue.5 , pp. 1089-1115
- Lee, C.-H.¹ Siniscalchi, S.M.²

34
- 84905262902
- Factorized adaptation for deep neural network
- Li, J., Huang, J.-T., Gong, Y., Factorized adaptation for deep neural network. Proceedings of ICASSP, 2014.
- (2014) Proceedings of ICASSP
- Li, J.¹ Huang, J.-T.² Gong, Y.³

35
- 0027683813
- Shared-distribution hidden Markov models for speech recognition
- Hwang, M.-Y., Huang, X., Shared-distribution hidden Markov models for speech recognition. IEEE Trans. Speech Audio Process. 1:4 (1993), 414–420.
- (1993) IEEE Trans. Speech Audio Process. , vol.1 , Issue.4 , pp. 414-420
- Hwang, M.-Y.¹ Huang, X.²

36
- 0034296009
- Finding consensus in speech recognition: word error minimization and other applications of confusion networks
- Mangu, L., Brill, E., Stolcke, A., Finding consensus in speech recognition: word error minimization and other applications of confusion networks. Comput. Speech Lang. 14:4 (2000), 373–400.
- (2000) Comput. Speech Lang. , vol.14 , Issue.4 , pp. 373-400
- Mangu, L.¹ Brill, E.² Stolcke, A.³

37
- 84994235596
- Fusion strategies for robust speech recognition and keyword spotting for channel- and noise-degraded speech
- Mitra, V., et al. Fusion strategies for robust speech recognition and keyword spotting for channel- and noise-degraded speech. Proceedings of Interspeech, San Francisco, CA, USA, 2016, 3683–3687.
- (2016) Proceedings of Interspeech, San Francisco, CA, USA , pp. 3683-3687
- Mitra, V.¹

38
- 64849090489
- Conditional random fields for integrating local discriminative classifiers
- Morris, J., Fosler-Lussier, E., Conditional random fields for integrating local discriminative classifiers. IEEE Trans. Audio Speech Lang. Process. 16:3 (2008), 617–628.
- (2008) IEEE Trans. Audio Speech Lang. Process. , vol.16 , Issue.3 , pp. 617-628
- Morris, J.¹ Fosler-Lussier, E.²

39
- 0022012892
- Optimal solution of a training problem in speech recognition
- Nadas, A., Optimal solution of a training problem in speech recognition. IEEE Trans. Acoust. Speech Signal Process. 33:1 (1985), 326–329.
- (1985) IEEE Trans. Acoust. Speech Signal Process. , vol.33 , Issue.1 , pp. 326-329
- Nadas, A.¹

40
- 0000635720
- Progresses in dynamic programming search for LVCSR
- Ney, H., Ortmanns, S., Progresses in dynamic programming search for LVCSR. Proc. IEEE 88:8 (2000), 1224–1240.
- (2000) Proc. IEEE , vol.88 , Issue.8 , pp. 1224-1240
- Ney, H.¹ Ortmanns, S.²

41
- 0003781238
- Markov Chains
- Cambridge University Press
- Norris, J.R., Markov Chains. 1998, Cambridge University Press.
- (1998)
- Norris, J.R.¹

42
- 84858953642
- The Kaldi speech recognition toolkit
- Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., Hannemann, M., Motlicek, P., Qian, Y., Schwarz, P., Silovskỳ, J., Stemmer, G., Veselỳ, K., The Kaldi speech recognition toolkit. Proceedings of ASRU, 2011.
- (2011) Proceedings of ASRU
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlicek, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰ Silovskỳ, J.¹¹ Stemmer, G.¹² Veselỳ, K.¹³

43
- 84991384259
- Very deep convolutional neural networks for noise robust speech recognition
- Qian, Y., Bi, M., Tan, T., Yu, K., Very deep convolutional neural networks for noise robust speech recognition. IEEE/ACM Trans. Audio Speech Lang. Process. 24:12 (2016), 2263–2276.
- (2016) IEEE/ACM Trans. Audio Speech Lang. Process. , vol.24 , Issue.12 , pp. 2263-2276
- Qian, Y.¹ Bi, M.² Tan, T.³ Yu, K.⁴

44
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- Rabiner, L., A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77:2 (1989), 257–286.
- (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-286
- Rabiner, L.¹

45
- 84929376602
- Bounded conditional mean imputation with observation uncertainties and acoustic model adaptation
- Remes, U., López, A.R., Palomäki, D., Bounded conditional mean imputation with observation uncertainties and acoustic model adaptation. IEEE/ACM Trans. Audio Speech Lang. Process. 23 (2015), 1198–1208.
- (2015) IEEE/ACM Trans. Audio Speech Lang. Process. , vol.23 , pp. 1198-1208
- Remes, U.¹ López, A.R.² Palomäki, D.³

46
- 75149176174
- Ensemble-based classifiers
- Rokach, L., Ensemble-based classifiers. Artif. Intell. Rev. 33:1–2 (2010), 1–39.
- (2010) Artif. Intell. Rev. , vol.33 , Issue.1-2 , pp. 1-39
- Rokach, L.¹

47
- 84893691530
- Speaker adaptation of neural network acoustic models using i-vectors
- Saon, G., Soltau, H., Nahamoo, D., Picheny, M., Speaker adaptation of neural network acoustic models using i-vectors. Proc. ASRU, 2013, 55–59.
- (2013) Proc. ASRU , pp. 55-59
- Saon, G.¹ Soltau, H.² Nahamoo, D.³ Picheny, M.⁴

48
- 84858976070
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- Seide, F., Li, G., Chen, X., Yu, D., Feature engineering in context-dependent deep neural networks for conversational speech transcription. Proc. ASRU, 2011, 24–29.
- (2011) Proc. ASRU , pp. 24-29
- Seide, F.¹ Li, G.² Chen, X.³ Yu, D.⁴

49
- 0035279111
- A structural Bayes approach to speaker adaptation
- Shinoda, K., Lee, C.-H., A structural Bayes approach to speaker adaptation. IEEE Trans. Speech Audio Process. 9:3 (2001), 276–287.
- (2001) IEEE Trans. Speech Audio Process. , vol.9 , Issue.3 , pp. 276-287
- Shinoda, K.¹ Lee, C.-H.²

50
- 84881054791
- Hermitian polynomial for speaker adaptation of connectionist speech recognition systems
- Siniscalchi, S.M., Li, J., Lee, C.-H., Hermitian polynomial for speaker adaptation of connectionist speech recognition systems. IEEE Trans. Audio Speech Lang. Process. 21:10 (2013), 2152–2161.
- (2013) IEEE Trans. Audio Speech Lang. Process. , vol.21 , Issue.10 , pp. 2152-2161
- Siniscalchi, S.M.¹ Li, J.² Lee, C.-H.³

51
- 84890492591
- Revisiting hybrid and GMM-HMM system combination techniques
- Swietojanski, P., Ghoshal, A., Renals, S., Revisiting hybrid and GMM-HMM system combination techniques. Proceedings of ICASSP, 2013, 6744–6748.
- (2013) Proceedings of ICASSP , pp. 6744-6748
- Swietojanski, P.¹ Ghoshal, A.² Renals, S.³

52
- 84976435936
- Learning hidden unit contributions for unsupervised acoustic model adaptation
- Swietojanski, P., Li, J., Renals, S., Learning hidden unit contributions for unsupervised acoustic model adaptation. IEEE/ACM Trans. Audio Speech Lang. Process. 24 (2016), 1450–1463.
- (2016) IEEE/ACM Trans. Audio Speech Lang. Process. , vol.24 , pp. 1450-1463
- Swietojanski, P.¹ Li, J.² Renals, S.³

53
- 85019835456
- Using line segments to train multi-stream stacked autoencoders for image classification
- Tang, X.-S., Hao, K., Wei, H., Ding, Y., Using line segments to train multi-stream stacked autoencoders for image classification. Pattern Recognit. Lett. 94 (2017), 55–61.
- (2017) Pattern Recognit. Lett. , vol.94 , pp. 55-61
- Tang, X.-S.¹ Hao, K.² Wei, H.³ Ding, Y.⁴

54
- 84935113569
- Error bounds for convolutional codes and an asymptotically optimum decoding algorithm
- Viterbi, A., Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans. Inf. Theory 13:2 (1967), 260–269.
- (1967) IEEE Trans. Inf. Theory , vol.13 , Issue.2 , pp. 260-269
- Viterbi, A.¹

55
- 84906237512
- Investigations on Hessian-free optimization for cross-entropy training of deep neural networks
- Wiesler, S., Li, J., Xue, J., Investigations on Hessian-free optimization for cross-entropy training of deep neural networks. Proc. INTERSPEECH, 2013, 3317–3321.
- (2013) Proc. INTERSPEECH , pp. 3317-3321
- Wiesler, S.¹ Li, J.² Xue, J.³

56
- 0026692226
- Stacked generalization
- Wolpert, D., Stacked generalization. Neural Networks 5 (1992), 241–259.
- (1992) Neural Networks , vol.5 , pp. 241-259
- Wolpert, D.¹

57
- 79953250475
- Minimum Bayes risk decoding and system combination based on a recursion for edit distance
- Xu, H., Povey, D., Mangu, L., Zhu, J., Minimum Bayes risk decoding and system combination based on a recursion for edit distance. Comput. Speech Lang. 25:4 (2011), 802–828.
- (2011) Comput. Speech Lang. , vol.25 , Issue.4 , pp. 802-828
- Xu, H.¹ Povey, D.² Mangu, L.³ Zhu, J.⁴

58
- 84921731072
- Fast adaptation of deep neural network based on discriminant codes for speech recognition
- Xue, S., Abdel-Hamid, O., Jiang, H., Dai, L., Liu, Q., Fast adaptation of deep neural network based on discriminant codes for speech recognition. IEEE/ACM Trans. Audio Speech Lang. Process. 22:12 (2014), 1713–1725.
- (2014) IEEE/ACM Trans. Audio Speech Lang. Process. , vol.22 , Issue.12 , pp. 1713-1725
- Xue, S.¹ Abdel-Hamid, O.² Jiang, H.³ Dai, L.⁴ Liu, Q.⁵

59
- 84906225757
- A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR
- Yan, Z., Huo, Q., Xu, J., A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR. Proceedings of INTERSPEECH, 2013, 104–108.
- (2013) Proceedings of INTERSPEECH , pp. 104-108
- Yan, Z.¹ Huo, Q.² Xu, J.³

60
- 84994361357
- Log-linear system combination using structured support vector machines
- Yang, J., Ragni, A., Gales, M.J.F., Knill, K.M., Log-linear system combination using structured support vector machines. Proceedings of Interspeech, San Francisco, CA, USA, 2016, 1898–1902.
- (2016) Proceedings of Interspeech, San Francisco, CA, USA , pp. 1898-1902
- Yang, J.¹ Ragni, A.² Gales, M.J.F.³ Knill, K.M.⁴

61
- 0002144369
- Tree-based state tying for high accuracy acoustic modelling
- Young, S.J., Odell, J.J., Woodland, P.C., Tree-based state tying for high accuracy acoustic modelling. Proceedings of the Workshop on Human Language Technology, Association for Computational Linguistics, 1994, 307–312.
- (1994) Proceedings of the Workshop on Human Language Technology, Association for Computational Linguistics , pp. 307-312
- Young, S.J.¹ Odell, J.J.² Woodland, P.C.³

62
- 84865785753
- Improved bottleneck features using pretrained deep neural networks
- Yu, D., Seltzer, M., Improved bottleneck features using pretrained deep neural networks. Proceedings of INTERSPEECH, 2011, 237–240.
- (2011) Proceedings of INTERSPEECH , pp. 237-240
- Yu, D.¹ Seltzer, M.²

63
- 84890542079
- KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
- Yu, D., Yao, K., Su, H., Li, G., Seide, F., KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition. Proceedings of ICASSP, 2013, 7893–7897.
- (2013) Proceedings of ICASSP , pp. 7893-7897
- Yu, D.¹ Yao, K.² Su, H.³ Li, G.⁴ Seide, F.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.