Volume 6, Issue 2, 1998, Pages 107-116

The vanishing gradient problem during learning recurrent neural nets and problem solutions

Author keywords

Long Short Term Memory; Long term dependencies; Recurrent neural nets; Vanishing gradient


EID: 0042276525     PISSN: 02184885     EISSN: None     Source Type: Journal    
DOI: 10.1142/S0218488598000094     Document Type: Article
Times cited: 2362

References (30)
  • 2
    • 0028401031
    • Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks
    • G. V. Puskorius and L. A. Feldkamp, "Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks", IEEE Transactions on Neural Networks, 5(2):279-297 (1994).
    • (1994) IEEE Transactions on Neural Networks , vol.5 , Issue.2 , pp. 279-297
    • Puskorius, G.V.1    Feldkamp, L.A.2
  • 3
    • 0005316958
    • Induction of multiscale temporal structure
    • ed. J. E. Moody et al., Morgan Kaufmann, San Mateo
    • M. C. Mozer, "Induction of multiscale temporal structure", in Advances in Neural Information Processing Systems 4, ed. J. E. Moody et al., (Morgan Kaufmann, San Mateo, 1992), pages 275-282.
    • (1992) Advances in Neural Information Processing Systems 4 , pp. 275-282
    • Mozer, M.C.1
  • 4
    • 0001765578
    • Gradient-based learning algorithms for recurrent networks and their computational complexity
    • ed. Y. Chauvin and D. E. Rumelhart Hillsdale, Erlbaum, New York
    • R. J. Williams and D. Zipser, "Gradient-based learning algorithms for recurrent networks and their computational complexity", in Back-propagation: Theory, Architectures and Applications, ed. Y. Chauvin and D. E. Rumelhart (Hillsdale, Erlbaum, New York, 1992).
    • (1992) Back-propagation: Theory, Architectures and Applications
    • Williams, R.J.1    Zipser, D.2
  • 6
    • 0028392483
    • Learning long-term dependencies with gradient descent is difficult
    • Y. Bengio, P. Simard, and P. Frasconi, "Learning long-term dependencies with gradient descent is difficult", IEEE Transactions on Neural Networks, 5(2):157-166 (1994).
    • (1994) IEEE Transactions on Neural Networks , vol.5 , Issue.2 , pp. 157-166
    • Bengio, Y.1    Simard, P.2    Frasconi, P.3
  • 7
    • 0003575034
    • Diploma Thesis, Institut für Informatik, Lehrstuhl Prof. Brauer, Technische Univ. München
    • S. Hochreiter, "Untersuchungen zu dynamischen neuronalen Netzen", Diploma Thesis, Institut für Informatik, Lehrstuhl Prof. Brauer, Technische Univ. München, 1991. See www7.informatik.tu-muenchen.de/~hochreit.
    • (1991) Untersuchungen zu Dynamischen Neuronalen Netzen
    • Hochreiter, S.1
  • 8
    • 0029013513
    • Nonlinear higher-order statistical decorrelation by volume-conserving neural architectures
    • G. Deco and W. Brauer, "Nonlinear higher-order statistical decorrelation by volume-conserving neural architectures", Neural Networks, 8(4):525-535 (1995).
    • (1995) Neural Networks , vol.8 , Issue.4 , pp. 525-535
    • Deco, G.1    Brauer, W.2
  • 9
    • 0000599735
    • Dynamics and architecture for neural computation
    • F. J. Pineda, "Dynamics and architecture for neural computation", Journal of Complexity, 4:216-245 (1988).
    • (1988) Journal of Complexity , vol.4 , pp. 216-245
    • Pineda, F.J.1
  • 10
    • 0000370416
    • LSTM can solve hard long time lag problems
    • ed. M. C. Mozer et al. MIT Press, Cambridge MA
    • S. Hochreiter and J. Schmidhuber, "LSTM can solve hard long time lag problems", in Advances in Neural Information Processing Systems 9, ed. M. C. Mozer et al. (MIT Press, Cambridge MA, 1997), pages 473-479.
    • (1997) Advances in Neural Information Processing Systems 9 , pp. 473-479
    • Hochreiter, S.1    Schmidhuber, J.2
  • 11
    • 0346176694
    • Bridging long time lags by weight guessing and Long Short-Term Memory
    • ed. F. L. Silva et al. IOS Press, Amsterdam, Netherlands
    • S. Hochreiter and J. Schmidhuber, "Bridging long time lags by weight guessing and Long Short-Term Memory", in Spatiotemporal models in biological and artificial systems, ed. F. L. Silva et al. (IOS Press, Amsterdam, Netherlands, 1996), pages 65-72.
    • (1996) Spatiotemporal Models in Biological and Artificial Systems , pp. 65-72
    • Hochreiter, S.1    Schmidhuber, J.2
  • 12
    • 0004262806
    • Technical Report CRL 8801, Center for Research in Language, Univ. of California, San Diego
    • J. L. Elman, "Finding structure in time", Technical Report CRL 8801, Center for Research in Language, Univ. of California, San Diego, 1988.
    • (1988) Finding Structure in Time
    • Elman, J.L.1
  • 13
    • 0001086881
    • The recurrent cascade-correlation learning algorithm
    • ed. R. P. Lippmann et al. Morgan Kaufmann, San Mateo
    • S. E. Fahlman, "The recurrent cascade-correlation learning algorithm", in Advances in Neural Information Processing Systems 3, ed. R. P. Lippmann et al. (Morgan Kaufmann, San Mateo, 1991), pages 190-196.
    • (1991) Advances in Neural Information Processing Systems 3 , pp. 190-196
    • Fahlman, S.E.1
  • 15
    • 0000053463
    • A fixed size storage O(n^3) time complexity learning algorithm for fully recurrent continually running networks
    • J. Schmidhuber, "A fixed size storage O(n^3) time complexity learning algorithm for fully recurrent continually running networks", Neural Computation, 4(2):243-248 (1992).
    • (1992) Neural Computation , vol.4 , Issue.2 , pp. 243-248
    • Schmidhuber, J.1
  • 16
    • 0001202597
    • Learning state space trajectories in recurrent neural networks
    • B. A. Pearlmutter, "Learning state space trajectories in recurrent neural networks", Neural Computation, 1(2):263-269 (1989).
    • (1989) Neural Computation , vol.1 , Issue.2 , pp. 263-269
    • Pearlmutter, B.A.1
  • 17
    • 0029375851
    • Gradient calculations for dynamic recurrent neural networks: A survey
    • B. A. Pearlmutter, "Gradient calculations for dynamic recurrent neural networks: A survey", IEEE Transactions on Neural Networks, 6(5):1212-1228 (1995).
    • (1995) IEEE Transactions on Neural Networks , vol.6 , Issue.5 , pp. 1212-1228
    • Pearlmutter, B.A.1
  • 20
    • 0000971250
    • Credit assignment through time: Alternatives to backpropagation
    • ed. J. D. Cowan et al. Morgan Kaufmann, San Mateo
    • Y. Bengio and P. Frasconi, "Credit assignment through time: Alternatives to backpropagation", in Advances in Neural Information Processing Systems 6, ed. J. D. Cowan et al. (Morgan Kaufmann, San Mateo, 1994), pages 75-82.
    • (1994) Advances in Neural Information Processing Systems 6 , pp. 75-82
    • Bengio, Y.1    Frasconi, P.2
  • 21
    • 0001033889
    • Learning complex, extended sequences using the principle of history compression
    • J. Schmidhuber, "Learning complex, extended sequences using the principle of history compression", Neural Computation, 4(2):234-242 (1992).
    • (1992) Neural Computation , vol.4 , Issue.2 , pp. 234-242
    • Schmidhuber, J.1
  • 22
    • 0001601299
    • Induction of finite-state languages using second-order recurrent networks
    • R. L. Watrous and G. M. Kuhn, "Induction of finite-state languages using second-order recurrent networks", Neural Computation, 4:406-414 (1992).
    • (1992) Neural Computation , vol.4 , pp. 406-414
    • Watrous, R.L.1    Kuhn, G.M.2
  • 24
    • 0025254722
    • A time-delay neural network architecture for isolated word recognition
    • K. Lang, A. Waibel, and G. E. Hinton, "A time-delay neural network architecture for isolated word recognition", Neural Networks, 3:23-43 (1990).
    • (1990) Neural Networks , vol.3 , pp. 23-43
    • Lang, K.1    Waibel, A.2    Hinton, G.E.3
  • 25
    • 33646241633
    • Learning long-term dependencies in NARX recurrent neural networks
    • T. Lin, B. G. Horne, P. Tino, and C. L. Giles, "Learning long-term dependencies in NARX recurrent neural networks", IEEE Transactions on Neural Networks, 7(6):1329-1338 (1996).
    • (1996) IEEE Transactions on Neural Networks , vol.7 , Issue.6 , pp. 1329-1338
    • Lin, T.1    Horne, B.G.2    Tino, P.3    Giles, C.L.4
  • 26
    • 0009382953
    • Holographic recurrent networks
    • ed. J. D. Cowan et al. Morgan Kaufmann, San Mateo
    • T. A. Plate, "Holographic recurrent networks", in Advances in Neural Information Processing Systems 5, ed. J. D. Cowan et al. (Morgan Kaufmann, San Mateo, 1993), pages 34-41.
    • (1993) Advances in Neural Information Processing Systems 5 , pp. 34-41
    • Plate, T.A.1
  • 27
    • 0343449995
    • A theory for neural networks with time delays
    • ed. R. P. Lippmann et al. Morgan Kaufmann, San Mateo
    • B. de Vries and J. C. Principe, "A theory for neural networks with time delays", in Advances in Neural Information Processing Systems 3, ed. R. P. Lippmann et al. (Morgan Kaufmann, San Mateo, 1991), pages 162-168.
    • (1991) Advances in Neural Information Processing Systems 3 , pp. 162-168
    • De Vries, B.1    Principe, J.C.2
  • 28
    • 0000651310
    • Time warping invariant neural networks
    • ed. J. D. Cowan et al. Morgan Kaufmann, San Mateo
    • G. Sun, H. Chen, and Y. Lee, "Time warping invariant neural networks", in Advances in Neural Information Processing Systems 5, ed. J. D. Cowan et al. (Morgan Kaufmann, San Mateo, 1993), pages 180-187.
    • (1993) Advances in Neural Information Processing Systems 5 , pp. 180-187
    • Sun, G.1    Chen, H.2    Lee, Y.3
  • 29
    • 0001274675
    • Learning sequential structures with the real-time recurrent learning algorithm
    • A. W. Smith and D. Zipser, "Learning sequential structures with the real-time recurrent learning algorithm", International Journal of Neural Systems, 1(2):125-131 (1989).
    • (1989) International Journal of Neural Systems , vol.1 , Issue.2 , pp. 125-131
    • Smith, A.W.1    Zipser, D.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.