Volume 5, Issue 2, 1994, Pages 157-166

Learning Long-Term Dependencies with Gradient Descent is Difficult

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; BINARY SEQUENCES; COMPUTATIONAL COMPLEXITY; DATA STORAGE EQUIPMENT; DISCRETE TIME CONTROL SYSTEMS; INFORMATION RETRIEVAL SYSTEMS; INPUT OUTPUT PROGRAMS; LEARNING SYSTEMS; PARAMETER ESTIMATION;

EID: 0028392483     PISSN: 10459227     EISSN: 19410093     Source Type: Journal    
DOI: 10.1109/72.279181     Document Type: Article
Times cited: 6642

References (23)
  • 1
    • 0026836876
    • Using Random Weights to train Multilayer Networks of Hard-Limiting Units
    • P. L. Bartlett and T. Downs, “Using Random Weights to train Multilayer Networks of Hard-Limiting Units,” IEEE Transactions on Neural Networks, vol. 3, no. 2, 1992, pp. 202–210.
    • (1992) IEEE Transactions on Neural Networks , vol.3 , Issue.2 , pp. 202-210
    • Bartlett, P.L.1    Downs, T.2
  • 2
    • 0002906163
    • Improving the convergence of backpropagation learning with second order methods
    • Touretzky, Hinton, and Sejnowski, Eds. San Mateo, CA: Morgan Kaufmann
    • S. Becker and Y. Le Cun, “Improving the convergence of backpropagation learning with second order methods,” Proceedings of the 1988 Connectionist Models Summer School, Touretzky, Hinton, and Sejnowski, Eds. San Mateo, CA: Morgan Kaufmann, pp. 29–37.
    • (1988) Proceedings of the 1988 Connectionist Models Summer School , pp. 29-37
    • Becker, S.1    Le Cun, Y.2
  • 3
    • 84943223999
    • The problem of learning long-term dependencies in recurrent networks
    • San Francisco, IEEE Press
    • Y. Bengio, P. Frasconi, and P. Simard, “The problem of learning long-term dependencies in recurrent networks,” invited paper at the IEEE International Conference on Neural Networks 1993, San Francisco, IEEE Press, pp. 1183–1188.
    • (1993) IEEE International Conference on Neural Networks , pp. 1183-1188
    • Bengio, Y.1    Frasconi, P.2    Simard, P.3
  • 4
    • 0013269553
    • Artificial neural networks and their application to sequence recognition
    • Ph.D. thesis, McGill University, Montreal, Quebec, Canada
    • Y. Bengio, “Artificial neural networks and their application to sequence recognition,” Ph.D. thesis, McGill University, 1991, Montreal, Quebec, Canada.
    • (1991)
    • Bengio, Y.1
  • 5
    • 0026835134
    • Global optimization of a neural network-hidden Markov model hybrid
    • Y. Bengio, R. De Mori, G. Flammia, and R. Kompe, “Global optimization of a neural network-hidden Markov model hybrid,” IEEE Transactions on Neural Networks, vol. 3, no. 2, 1992, pp. 252–259.
    • (1992) IEEE Transactions on Neural Networks , vol.3 , Issue.2 , pp. 252-259
    • Bengio, Y.1    de Mori, R.2    Flammia, G.3    Kompe, R.4
  • 6
    • 0023416976
    • Minimizing multimodal functions of continuous variables with the simulated annealing algorithm
    • Sept.
    • A. Corana, M. Marchesi, C. Martini, and S. Ridella, “Minimizing multimodal functions of continuous variables with the simulated annealing algorithm,” ACM Transactions on Mathematical Software, vol. 13, no. 3, Sept. 1987, pp. 262–280.
    • (1987) ACM Transactions on Mathematical Software , vol.13 , Issue.3 , pp. 262-280
    • Corana, A.1    Marchesi, M.2    Martini, C.3    Ridella, S.4
  • 7
    • 0001657855
    • Local Feedback Multilayered Networks
    • P. Frasconi, M. Gori, and G. Soda, “Local Feedback Multilayered Networks,” Neural Computation 4(1), 1992, pp. 120–130.
    • (1992) Neural Computation , vol.4 , Issue.1 , pp. 120-130
    • Frasconi, P.1    Gori, M.2    Soda, G.3
  • 9
    • 0027277566
    • A method of training multi-layer networks with heaviside characteristics using internal representations
    • San Francisco
    • R. J. Gaynier and T. Downs, “A method of training multi-layer networks with heaviside characteristics using internal representations,” IEEE International Conference on Neural Networks 1993, San Francisco, pp. 1812–1817.
    • (1993) IEEE International Conference on Neural Networks , pp. 1812-1817
    • Gaynier, R.J.1    Downs, T.2
  • 10
    • 0024903887
    • BPS: a learning algorithm for capturing the dynamic nature of speech
    • M. Gori, Y. Bengio, and R. De Mori, “BPS: a learning algorithm for capturing the dynamic nature of speech,” Proc. IEEE Int. Joint Conf. on Neural Networks, Washington, DC, 1989, pp. II.417-II.424.
    • (1989) Proc. IEEE Int. Joint Conf. on Neural Networks , pp. 417-424
    • Gori, M.1    Bengio, Y.2    De Mori, R.3
  • 13
    • 26444479778
    • Optimization by simulated annealing
    • 4598 (May)
    • S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated annealing,” Science, vol. 220, no. 4598, May 1983, pp. 671–680.
    • (1983) Science 220 , pp. 671-680
    • Kirkpatrick, S.1    Gelatt, C.D.2    Vecchi, M.P.3
  • 14
    • 84941528157
    • A first look at phonetic discrimination using connectionist models with recurrent links
    • CCRP — IDA SCIMP working paper No.4/87, Institute for Defense Analysis, Princeton, NJ
    • G. Kuhn, “A first look at phonetic discrimination using connectionist models with recurrent links.” CCRP — IDA SCIMP working paper No.4/87, Institute for Defense Analysis, Princeton, NJ, 1987.
    • (1987)
    • Kuhn, G.1
  • 15
    • 0003966401
    • The development of the Time-Delay Neural Network architecture for speech recognition
    • Technical Report CMU-CS-88-152, Carnegie-Mellon University
    • K. J. Lang and G. E. Hinton, “The development of the Time-Delay Neural Network architecture for speech recognition,” Technical Report CMU-CS-88-152, Carnegie-Mellon University, 1988.
    • (1988)
    • Lang, K.J.1    Hinton, G.E.2
  • 16
    • 0002824144
    • Learning processes in an asymmetric threshold network
    • E. Bienenstock, F. Fogelman-Soulié, and G. Weisbuch, Eds. Les Houches, France: Springer-Verlag
    • Y. Le Cun, “Learning processes in an asymmetric threshold network,” in Disordered systems and biological organization, E. Bienenstock, F. Fogelman-Soulié, and G. Weisbuch, Eds. Les Houches, France: Springer-Verlag, 1986, pp. 233–240.
    • (1986) Disordered systems and biological organization , pp. 233-240
    • Le Cun, Y.1
  • 17
    • 0000380827
    • Nonlinear dynamics and stability of analog neural networks
    • 51 (special issue)
    • C. M. Marcus, F. R. Waugh, and R. M. Westervelt, “Nonlinear dynamics and stability of analog neural networks,” Physica D 51 (special issue), pp. 234–247, 1991.
    • (1991) Physica D , pp. 234-247
    • Marcus, C.M.1    Waugh, F.R.2    Westervelt, R.M.3
  • 18
    • 0008554931
    • A focused back-propagation algorithm for temporal pattern recognition
    • M. C. Mozer, “A focused back-propagation algorithm for temporal pattern recognition,” Complex Systems, 3, pp. 349–391, 1989.
    • (1989) Complex Systems, 3 , pp. 349-391
    • Mozer, M.C.1
  • 19
    • 0005316958
    • Induction of multiscale temporal structure
    • Moody, Hanson, and Lippmann, Eds. San Mateo, CA: Morgan Kaufmann
    • M. C. Mozer, “Induction of multiscale temporal structure,” in Advances in Neural Information Processing Systems 4, Moody, Hanson, and Lippmann, Eds. San Mateo, CA: Morgan Kaufmann, 1992, pp. 275–282.
    • (1992) Advances in Neural Information Processing Systems 4 , pp. 275-282
    • Mozer, M.C.1
  • 21
    • 0004663635
    • The ‘moving targets’ training algorithm
    • Touretzky, Ed. San Mateo, CA: Morgan Kaufmann
    • R. Rohwer, “The ‘moving targets’ training algorithm,” Advances in Neural Information Processing Systems 2, Touretzky, Ed. San Mateo, CA: Morgan Kaufmann, 1990, pp. 558–565.
    • (1990) Advances in Neural Information Processing Systems 2 , pp. 558-565
    • Rohwer, R.1
  • 22
    • 0000646059
    • Learning internal representations by error propagation
    • D. E. Rumelhart and J. L. McClelland, Eds. Cambridge, MA: MIT Press, Bradford Books
    • D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning internal representations by error propagation,” in Parallel Distributed Processing, D. E. Rumelhart and J. L. McClelland, Eds. Cambridge, MA: MIT Press, Bradford Books, vol. 1, 1986, pp. 318–362.
    • (1986) Parallel Distributed Processing , vol.1 , pp. 318-362
    • Rumelhart, D.E.1    Hinton, G.E.2    Williams, R.J.3
  • 23
    • 0001202594
    • A learning algorithm for continuously running fully recurrent neural networks
    • R. J. Williams and D. Zipser, “A learning algorithm for continuously running fully recurrent neural networks,” Neural Computation, 1, 1989, pp. 270–280.
    • (1989) Neural Computation, 1 , pp. 270-280
    • Williams, R.J.1    Zipser, D.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.