Volume 29, Issue 6, 2012, Pages 70-81

Structured discriminative models for speech recognition: An overview

Author keywords

[No Author keywords available]

Indexed keywords

ACOUSTICS; CLASSIFICATION (OF INFORMATION); DEEP NEURAL NETWORKS; HIDDEN MARKOV MODELS; LEARNING ALGORITHMS; LEARNING SYSTEMS; MARKOV PROCESSES; NATURAL LANGUAGE PROCESSING SYSTEMS

EID: 85032751545     PISSN: 10535888     EISSN: None     Source Type: Journal    
DOI: 10.1109/MSP.2012.2207140     Document Type: Review
Times cited: 27

References (61)
  • 1
    • 0024610919
    • A tutorial on hidden Markov models and selected applications in speech recognition
    • L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, 1989.
    • (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-286
    • Rabiner, L.R.1
  • 3
    • 70349227947
    • The application of hidden Markov models in speech recognition
    • M. J. F. Gales and S. J. Young, "The application of hidden Markov models in speech recognition," Found. Trends Signal Processing, vol. 1, no. 3, pp. 195-304, 2008.
    • (2008) Found. Trends Signal Processing , vol.1 , Issue.3 , pp. 195-304
    • Gales, M.J.F.1    Young, S.J.2
  • 4
    • 84957069814
    • Text Categorization with Support Vector Machines: Learning with many Relevant Features
    • Machine Learning: ECML-98
    • T. Joachims, "Text categorization with support vector machines: Learning with many relevant features," in Proc. ECML 1998, pp. 137-142. (Pubitemid 128067178)
    • (1998) Lecture Notes in Computer Science , Issue.1398 , pp. 137-142
    • Joachims, T.1
  • 5
    • 0142192295
    • Conditional random fields: Probabilistic models for segmenting and labeling sequence data
    • J. Lafferty, A. McCallum, and F. Pereira, "Conditional random fields: Probabilistic models for segmenting and labeling sequence data," in Proc. ICML 2001, pp. 282-289.
    • (2001) Proc. ICML , pp. 282-289
    • Lafferty, J.1    McCallum, A.2    Pereira, F.3
  • 6
    • 33947615175
    • Dynamic conditional random fields: Factorized probabilistic models for labeling and segmenting sequence data
    • C. Sutton, A. McCallum, and K. Rohanimanesh, "Dynamic conditional random fields: Factorized probabilistic models for labeling and segmenting sequence data," J. Mach. Learn. Res., vol. 8, pp. 693-723, 2007. (Pubitemid 46491655)
    • (2007) Journal of Machine Learning Research , vol.8 , pp. 693-723
    • Sutton, C.1    McCallum, A.2    Rohanimanesh, K.3
  • 7
    • 64849090489
    • Conditional random fields for integrating local discriminative classifiers
    • J. Morris and E. Fosler-Lussier, "Conditional random fields for integrating local discriminative classifiers," IEEE Trans. Audio Speech Lang. Processing, vol. 16, no. 3, pp. 617-628, 2008.
    • (2008) IEEE Trans. Audio Speech Lang. Processing , vol.16 , Issue.3 , pp. 617-628
    • Morris, J.1    Fosler-Lussier, E.2
  • 10
    • 77949426518
    • Hidden conditional random fields for phone recognition
    • Y.-H. Sung and D. Jurafsky, "Hidden conditional random fields for phone recognition," in Proc. ASRU 2009, pp. 107-112.
    • (2009) Proc. ASRU , pp. 107-112
    • Sung, Y.-H.1    Jurafsky, D.2
  • 13
    • 77949407865
    • Investigations on features for log-linear acoustic models in continuous speech recognition
    • S. Wiesler, M. Nußbaum-Thom, G. Heigold, R. Schlüter, and H. Ney, "Investigations on features for log-linear acoustic models in continuous speech recognition," in Proc. ASRU 2009, pp. 52-57.
    • (2009) Proc. ASRU , pp. 52-57
    • Wiesler, S.1    Nußbaum-Thom, M.2    Heigold, G.3    Schlüter, R.4    Ney, H.5
  • 14
    • 77957744761
    • Structured log-linear models for noise robust speech recognition
    • S.-X. Zhang, A. Ragni, and M. J. F. Gales, "Structured log-linear models for noise robust speech recognition," IEEE Signal Processing Lett., vol. 17, no. 11, pp. 945-948, 2010.
    • (2010) IEEE Signal Processing Lett. , vol.17 , Issue.11 , pp. 945-948
    • Zhang, S.-X.1    Ragni, A.2    Gales, M.J.F.3
  • 15
    • 84858988048
    • Derivative kernels for noise robust ASR
    • A. Ragni and M. J. F. Gales, "Derivative kernels for noise robust ASR," in Proc. of ASRU 2011, pp. 119-124.
    • (2011) Proc. of ASRU , pp. 119-124
    • Ragni, A.1    Gales, M.J.F.2
  • 16
    • 84858977944
    • Extending noise robust structured support vector machines to larger vocabulary tasks
    • S.-X. Zhang and M. J. F. Gales, "Extending noise robust structured support vector machines to larger vocabulary tasks," in Proc. ASRU 2011, pp. 18-23.
    • (2011) Proc. ASRU , pp. 18-23
    • Zhang, S.-X.1    Gales, M.J.F.2
  • 17
    • 77949370075
    • A segmental CRF approach to large vocabulary continuous speech recognition
    • G. Zweig and P. Nguyen, "A segmental CRF approach to large vocabulary continuous speech recognition," in Proc. ASRU 2009, pp. 152-157.
    • (2009) Proc. ASRU , pp. 152-157
    • Zweig, G.1    Nguyen, P.2
  • 19
    • 34047246149
    • Maximum entropy direct models for speech recognition
    • DOI 10.1109/TSA.2005.858064
    • H. K. J. Kuo and Y. Gao, "Maximum entropy direct models for speech recognition," IEEE Trans. Audio Speech Lang. Processing, vol. 14, no. 3, pp. 873-881, 2006. (Pubitemid 46547649)
    • (2006) IEEE Transactions on Audio, Speech and Language Processing , vol.14 , Issue.3 , pp. 873-881
    • Kuo, H.-K.J.1    Gao, Y.2
  • 20
    • 70350435251
    • Speech recognition using augmented conditional random fields
    • Y. Hifny and S. Renals, "Speech recognition using augmented conditional random fields," IEEE Trans. Audio Speech Lang. Processing, vol. 17, no. 2, pp. 354-365, 2009.
    • (2009) IEEE Trans. Audio Speech Lang. Processing , vol.17 , Issue.2 , pp. 354-365
    • Hifny, Y.1    Renals, S.2
  • 21
    • 85149106909
    • Discriminative language modeling with conditional random fields and the perceptron algorithm
    • B. Roark, M. Saraclar, M. Collins, and M. Johnson, "Discriminative language modeling with conditional random fields and the perceptron algorithm," in Proc. ACL 2004, pp. 47-54.
    • (2004) Proc. ACL , pp. 47-54
    • Roark, B.1    Saraclar, M.2    Collins, M.3    Johnson, M.4
  • 22
    • 0037841402
    • Graphical models and automatic speech recognition
    • R. Rosenfeld, M. Ostendorf, S. Khudanpur, and M. Johnson, Eds. New York: Springer-Verlag
    • J. A. Bilmes, "Graphical models and automatic speech recognition," in Mathematical Foundations of Speech and Language Processing, R. Rosenfeld, M. Ostendorf, S. Khudanpur, and M. Johnson, Eds. New York: Springer-Verlag, 2003, pp. 191-245.
    • (2003) Mathematical Foundations of Speech and Language Processing , pp. 191-245
    • Bilmes, J.A.1
  • 23
    • 84935113569
    • Error bounds for convolutional codes and asymptotically optimum decoding algorithm
    • A. J. Viterbi, "Error bounds for convolutional codes and asymptotically optimum decoding algorithm," IEEE Trans. Inform. Theory, vol. 13, no. 2, pp. 260-269, 1967.
    • (1967) IEEE Trans. Inform. Theory , vol.13 , Issue.2 , pp. 260-269
    • Viterbi, A.J.1
  • 24
    • 0020796537
    • A decision theoretic formulation of a training problem in speech recognition and a comparison of training by unconditional versus conditional maximum likelihood
    • A. Nadas, "A decision theoretic formulation of a training problem in speech recognition and a comparison of training by unconditional versus conditional maximum likelihood," IEEE Trans. Acoustics Speech Signal Processing, vol. 31, no. 4, pp. 814-817, 1983. (Pubitemid 14455162)
    • (1983) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.ASSP-31 , Issue.4 , pp. 814-817
    • Nadas, A.1
  • 25
    • 0036296863
    • Minimum phone error and I-smoothing for improved discriminative training
    • D. Povey and P. C. Woodland, "Minimum phone error and I-smoothing for improved discriminative training," in Proc. ICASSP 2002, vol. 1, pp. 13-17.
    • (2002) Proc. ICASSP , vol.1 , pp. 13-17
    • Povey, D.1    Woodland, P.C.2
  • 26
    • 84864038630
    • Large margin hidden Markov models for automatic speech recognition
    • Cambridge, MA: MIT Press
    • F. Sha and L. K. Saul, "Large margin hidden Markov models for automatic speech recognition," in Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 2007, pp. 1249-1256.
    • (2007) Advances in Neural Information Processing Systems , pp. 1249-1256
    • Sha, F.1    Saul, L.K.2
  • 28
    • 0002652285
    • A maximum entropy approach to natural language processing
    • A. L. Berger, S. A. Della Pietra, and V. J. Della Pietra, "A maximum entropy approach to natural language processing," Comput. Linguist., vol. 22, no. 1, pp. 39-72, 1996.
    • (1996) Comput. Linguist. , vol.22 , Issue.1 , pp. 39-72
    • Berger, A.L.1    Della Pietra, S.A.2    Della Pietra, V.J.3
  • 30
    • 29344450414
    • Support vector machine learning for interdependent and structured output spaces
    • I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun, "Support vector machine learning for interdependent and structured output spaces," in Proc. ICML 2004, pp. 104-112.
    • (2004) Proc. ICML , pp. 104-112
    • Tsochantaridis, I.1    Hofmann, T.2    Joachims, T.3    Altun, Y.4
  • 32
    • 0036460907
    • Weighted finite-state transducers in speech recognition
    • M. Mohri, F. Pereira, and M. Riley, "Weighted finite-state transducers in speech recognition," Comput. Speech Lang., vol. 16, no. 1, pp. 69-88, 2002.
    • (2002) Comput. Speech Lang. , vol.16 , Issue.1 , pp. 69-88
    • Mohri, M.1    Pereira, F.2    Riley, M.3
  • 33
    • 44849143432
    • Regularization, adaptation, and non-independent features improve hidden conditional random fields for phone classification
    • Y.-H. Sung, C. Boulis, C. Manning, and D. Jurafsky, "Regularization, adaptation, and non-independent features improve hidden conditional random fields for phone classification," in Proc. ASRU 2007, pp. 347-352.
    • (2007) Proc. ASRU , pp. 347-352
    • Sung, Y.-H.1    Boulis, C.2    Manning, C.3    Jurafsky, D.4
  • 34
    • 85012119646
    • On the equivalence of Gaussian HMM and Gaussian HMM-like hidden conditional random fields
    • G. Heigold, R. Schlüter, and H. Ney, "On the equivalence of Gaussian HMM and Gaussian HMM-like hidden conditional random fields," in Proc. Interspeech 2007, pp. 1721-1724.
    • (2007) Proc. Interspeech , pp. 1721-1724
    • Heigold, G.1    Schlüter, R.2    Ney, H.3
  • 36
    • 0030245363
    • From HMM's to segment models: A unified view of stochastic modeling for speech recognition
    • PII S1063667696067181
    • M. Ostendorf, V. Digalakis, and O. Kimball, "From HMMs to segment models: A unified view of stochastic modeling for speech recognition," IEEE Trans. Speech Audio Processing, vol. 4, no. 5, pp. 360-378, 1996. (Pubitemid 126753024)
    • (1996) IEEE Transactions on Speech and Audio Processing , vol.4 , Issue.5 , pp. 360-378
    • Ostendorf, M.1    Digalakis, V.V.2    Kimball, O.A.3
  • 38
    • 34047192804
    • Semi-Markov conditional random fields for information extraction
    • Cambridge, MA: MIT Press
    • S. Sarawagi and W. W. Cohen, "Semi-Markov conditional random fields for information extraction," in Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 2004, pp. 1185-1192.
    • (2004) Advances in Neural Information Processing Systems , pp. 1185-1192
    • Sarawagi, S.1    Cohen, W.W.2
  • 39
    • 24944537843
    • Large margin methods for structured and interdependent output variables
    • Sept.
    • I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun, "Large margin methods for structured and interdependent output variables," J. Mach. Learn. Res., vol. 6, pp. 1453-1484, Sept. 2005.
    • (2005) J. Mach. Learn. Res. , vol.6 , pp. 1453-1484
    • Tsochantaridis, I.1    Joachims, T.2    Hofmann, T.3    Altun, Y.4
  • 40
    • 33645766076
    • Minimum Bayes risk estimation and decoding in large vocabulary continuous speech recognition
    • W. Byrne, "Minimum Bayes risk estimation and decoding in large vocabulary continuous speech recognition," IEICE Special Issue Stat. Modeling Speech Recognition, vol. 89, no. 3, pp. 900-907, 2006.
    • (2006) IEICE Special Issue Stat. Modeling Speech Recognition , vol.89 , Issue.3 , pp. 900-907
    • Byrne, W.1
  • 42
    • 85127836544
    • Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms
    • M. Collins, "Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms," in Proc. EMNLP 2002, vol. 10, pp. 1-8.
    • (2002) Proc. EMNLP , vol.10 , pp. 1-8
    • Collins, M.1
  • 43
    • 0032665663
    • Efficient sampling and feature selection in whole sentence maximum entropy language models
    • S. F. Chen and R. Rosenfeld, "Efficient sampling and feature selection in whole sentence maximum entropy language models," in Proc. ICASSP 1999, pp. 549-552.
    • (1999) Proc. ICASSP , pp. 549-552
    • Chen, S.F.1    Rosenfeld, R.2
  • 45
    • 33746536145
    • Adaptation of maximum entropy capitalizer: Little data can help a lot
    • DOI 10.1016/j.csl.2005.05.005, PII S0885230805000276
    • C. Chelba and A. Acero, "Adaptation of maximum entropy capitalizer: Little data can help a lot," Comput. Speech Lang., vol. 20, no. 4, pp. 382-399, 2006. (Pubitemid 44142003)
    • (2006) Computer Speech and Language , vol.20 , Issue.4 , pp. 382-399
    • Chelba, C.1    Acero, A.2
  • 46
    • 51449092074
    • Maximum conditional likelihood linear regression and maximum a posteriori for hidden conditional random fields speaker adaptation
    • Y.-H. Sung, C. Boulis, and D. Jurafsky, "Maximum conditional likelihood linear regression and maximum a posteriori for hidden conditional random fields speaker adaptation," in Proc. ICASSP 2008, pp. 4293-4296.
    • (2008) Proc. ICASSP , pp. 4293-4296
    • Sung, Y.-H.1    Boulis, C.2    Jurafsky, D.3
  • 47
    • 79959843405
    • Discriminative adaptation for log-linear acoustic models
    • J. Lööf, R. Schlüter, and H. Ney, "Discriminative adaptation for log-linear acoustic models," in Proc. Interspeech 2010, pp. 1648-1651.
    • (2010) Proc. Interspeech , pp. 1648-1651
    • Lööf, J.1    Schlüter, R.2    Ney, H.3
  • 48
    • 77950857527
    • Discriminative classifiers with adaptive kernels for noise robust speech recognition
    • M. J. F. Gales and F. Flego, "Discriminative classifiers with adaptive kernels for noise robust speech recognition," Comput. Speech Lang., vol. 24, no. 4, pp. 648-662, 2010.
    • (2010) Comput. Speech Lang. , vol.24 , Issue.4 , pp. 648-662
    • Gales, M.J.F.1    Flego, F.2
  • 49
    • 84925661323
    • Rational kernels: Theory and algorithms
    • Dec.
    • C. Cortes, P. Haffner, and M. Mohri, "Rational kernels: Theory and algorithms," J. Mach. Learn. Res., vol. 5, pp. 1035-1062, Dec. 2004.
    • (2004) J. Mach. Learn. Res. , vol.5 , pp. 1035-1062
    • Cortes, C.1    Haffner, P.2    Mohri, M.3
  • 51
    • 84898982939
    • Exploiting generative models in discriminative classifiers
    • Cambridge, MA: MIT Press
    • T. Jaakkola and D. Haussler, "Exploiting generative models in discriminative classifiers," in Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 1999, pp. 487-493.
    • (1999) Advances in Neural Information Processing Systems , pp. 487-493
    • Jaakkola, T.1    Haussler, D.2
  • 53
    • 84863387613
    • Shrinking exponential language models
    • S. F. Chen, "Shrinking exponential language models," in Proc. HLT-NAACL, 2009, pp. 468-476.
    • (2009) Proc. HLT-NAACL , pp. 468-476
    • Chen, S.F.1
  • 54
    • 80053411091
    • Discriminative syntactic language modeling for speech recognition
    • M. Collins, B. Roark, and M. Saraclar, "Discriminative syntactic language modeling for speech recognition," in Proc. of ACL 2005, pp. 507-514.
    • (2005) Proc. of ACL , pp. 507-514
    • Collins, M.1    Roark, B.2    Saraclar, M.3
  • 55
    • 78049374088
    • Syntactic and sub-lexical features for Turkish discriminative language models
    • E. Arisoy, M. Saraçlar, B. Roark, and I. Shafran, "Syntactic and sub-lexical features for Turkish discriminative language models," in Proc. ICASSP 2010, pp. 5538-5541.
    • (2010) Proc. ICASSP , pp. 5538-5541
    • Arisoy, E.1    Saraçlar, M.2    Roark, B.3    Shafran, I.4
  • 56
    • 79959846027
    • Large vocabulary continuous speech recognition using WFST-based linear classifier for structured data
    • S. Watanabe, T. Hori, and A. Nakamura, "Large vocabulary continuous speech recognition using WFST-based linear classifier for structured data," in Proc. Interspeech 2010, pp. 346-349.
    • (2010) Proc. Interspeech , pp. 346-349
    • Watanabe, S.1    Hori, T.2    Nakamura, A.3
  • 57
    • 79956277935
    • Learning a discriminative weighted finite-state transducer for speech recognition
    • M. Lehr and I. Shafran, "Learning a discriminative weighted finite-state transducer for speech recognition," IEEE Trans. Audio Speech Lang. Processing, vol. 19, no. 5, pp. 1360-1367, 2011.
    • (2011) IEEE Trans. Audio Speech Lang. Processing , vol.19 , Issue.5 , pp. 1360-1367
    • Lehr, M.1    Shafran, I.2
  • 59
    • 80051632228
    • Integrating meta-information into exemplar-based speech recognition with segmental conditional random fields
    • K. Demuynck, D. Seppi, D. Van Compernolle, P. Nguyen, and G. Zweig, "Integrating meta-information into exemplar-based speech recognition with segmental conditional random fields," in Proc. ICASSP 2011, pp. 5048-5051.
    • (2011) Proc. ICASSP , pp. 5048-5051
    • Demuynck, K.1    Seppi, D.2    Van Compernolle, D.3    Nguyen, P.4    Zweig, G.5
  • 60
    • 69549111057
    • Cutting-plane training of structural SVMs
    • T. Joachims, T. Finley, and C.-N. Yu, "Cutting-plane training of structural SVMs," Mach. Learn., vol. 77, no. 1, pp. 27-59, 2009.
    • (2009) Mach. Learn. , vol.77 , Issue.1 , pp. 27-59
    • Joachims, T.1    Finley, T.2    Yu, C.-N.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.