Volume 30, Issue 3-4, 2008, Pages 207-223

Biologically inspired scheme for continuous-time approximate dynamic programming

Author keywords

Actor Critic structures; approximate dynamic programming; linear quadratic regulation; policy iteration

Indexed keywords

BIOMIMETICS; CONTINUOUS TIME SYSTEMS; CONTROL SYSTEM ANALYSIS; CONTROLLERS; DYNAMIC PROGRAMMING; ITERATIVE METHODS; LEARNING SYSTEMS; LINEAR SYSTEMS; MAN MACHINE SYSTEMS; NONLINEAR SYSTEMS; REINFORCEMENT LEARNING; RICCATI EQUATIONS; ROBUST CONTROL; STATE FEEDBACK; STATE SPACE METHODS; SYSTEM THEORY;

EID: 47949083966     PISSN: 01423312     EISSN: None     Source Type: Journal
DOI: 10.1177/0142331207088188     Document Type: Article
Times cited: 13

References (29)
  • 2
    • 33846781129
    • Model-free Q-learning designs for discrete-time zero-sum games with application to H-Infinity control
    • Al-Tamimi, A., Abu-Khalaf, M. and Lewis, F.L. 2007b: Model-free Q-learning designs for discrete-time zero-sum games with application to H-Infinity control. Automatica 43, 473-82.
    • (2007) Automatica , vol.43 , pp. 473-482
    • Al-Tamimi, A.1    Abu-Khalaf, M.2    Lewis, F.L.3
  • 3
    • 0028733775
    • Reinforcement learning in continuous-time: advantage updating
    • Proceedings of the International Conference on Neural Networks, Orlando, FL, June
    • Baird, L. 1994: Reinforcement learning in continuous-time: advantage updating. Proceedings of the International Conference on Neural Networks, Orlando, FL, June.
    • (1994)
    • Baird, L.1
  • 6
    • 84980552700
    • Dynamic programming and suboptimal control: a survey from ADP to MPC
    • Proceedings of CDC'05.
    • Bertsekas, D.P. 2005: Dynamic programming and suboptimal control: a survey from ADP to MPC. Proceedings of CDC'05.
    • (2005)
    • Bertsekas, D.P.1
  • 7
    • 0028584964
    • Adaptive linear quadratic control using policy iteration
    • Proceedings of the American Control Conference, Baltimore, MD, June, 3475-76
    • Bradtke, S.J., Ydstie, B.E. and Barto, A.G. 1994: Adaptive linear quadratic control using policy iteration. Proceedings of the American Control Conference, Baltimore, MD, June, 3475-76.
    • (1994)
    • Bradtke, S.J.1    Ydstie, B.E.2    Barto, A.G.3
  • 8
  • 9
    • 85156231814
    • Temporal difference learning in continuous-time and space
    • Doya, K. 1996: Temporal difference learning in continuous-time and space. Advances in Neural Information Processing Systems 8, 1073-79.
    • (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1073-1079
    • Doya, K.1
  • 10
    • 0033629916
    • Reinforcement learning in continuous-time and space
    • Doya, K. 2000: Reinforcement learning in continuous-time and space. Neural Computation 12, 219-45.
    • (2000) Neural Computation , vol.12 , pp. 219-245
    • Doya, K.1
  • 11
    • 0036060633
    • An adaptive critic global controller
    • Proceedings of the American Control Conference, Anchorage, AK, 2665-70
    • Ferrari, S. and Stengel, R. 2002: An adaptive critic global controller. Proceedings of the American Control Conference, Anchorage, AK, 2665-70.
    • (2002)
    • Ferrari, S.1    Stengel, R.2
  • 14
    • 84914965022
    • On an iterative technique for Riccati equation computations
    • Kleinman, D. 1968: On an iterative technique for Riccati equation computations. IEEE Transactions on Automatic Control 13, 114-15.
    • (1968) IEEE Transactions on Automatic Control , vol.13 , pp. 114-115
    • Kleinman, D.1
  • 18
    • 0034849306
    • Adaptive critic based neuro-observer
    • Proceedings of the American Control Conference, Arlington, VA, 1616-21
    • Liu, X. and Balakrishnan, S.N. 2001: Adaptive critic based neuro-observer. Proceedings of the American Control Conference, Arlington, VA, 1616-21.
    • (2001)
    • Liu, X.1    Balakrishnan, S.N.2
  • 20
    • 0036588686
    • Adaptive dynamic programming
    • Murray, J., Cox, C., Lendaris, G. and Saeks, R. 2002: Adaptive dynamic programming. IEEE Transactions on Systems, Man, and Cybernetics 32, 140-53.
    • (2002) IEEE Transactions on Systems, Man, and Cybernetics , vol.32 , pp. 140-153
    • Murray, J.1    Cox, C.2    Lendaris, G.3    Saeks, R.4
  • 22
    • 0028969330
    • Visual feature integration and the temporal correlation hypothesis
    • Singer, W. and Gray, C.M. 1995: Visual feature integration and the temporal correlation hypothesis. Annual Review of Neuroscience 18, 555-86.
    • (1995) Annual Review of Neuroscience , vol.18 , pp. 555-586
    • Singer, W.1    Gray, C.M.2
  • 23
    • 47949105735
    • The interplay of intrinsic and synaptic membrane currents in delta, theta and 40-Hz oscillations
    • In Levine, D.S., Brown, V.R. & Shirey, V.T., editors. Oscillations in neural systems. Lawrence Erlbaum Associates.
    • Soltesz, I. 2000: The interplay of intrinsic and synaptic membrane currents in delta, theta and 40-Hz oscillations. In Levine, D.S., Brown, V.R. & Shirey, V.T., editors. Oscillations in neural systems. Lawrence Erlbaum Associates.
    • (2000) Oscillations in neural systems
    • Soltesz, I.1
  • 24
    • 33847202724
    • Learning to predict by the method of temporal differences
    • Sutton, R.S. 1988: Learning to predict by the method of temporal differences. Machine Learning 3, 9-44.
    • (1988) Machine Learning , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 26
    • 34548721141
    • Continuous-time ADP for linear systems with partially unknown dynamics
    • Proceedings of the IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, 247-53
    • Vrabie, D., Abu-Khalaf, M., Lewis, F.L. and Wang, Youyi. 2007: Continuous-time ADP for linear systems with partially unknown dynamics. Proceedings of the IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, 247-53.
    • (2007)
    • Vrabie, D.1    Abu-Khalaf, M.2    Lewis, F.L.3    Wang, Y.4
  • 28
    • 0002031779
    • Approximate dynamic programming for real-time control and neural modeling
    • In White, D.A. and Sofge, D.A., editors. Handbook of intelligent control. Van Nostrand.
    • Werbos, P.J. 1992: Approximate dynamic programming for real-time control and neural modeling. In White, D.A. and Sofge, D.A., editors. Handbook of intelligent control. Van Nostrand.
    • (1992) Handbook of intelligent control
    • Werbos, P.J.1
  • 29
    • 47949095751
    • Optimization: a foundation for understanding consciousness
    • In Levine, D.S. and Elsberry, W.R., editors, Optimality in biological and artificial networks? Lawrence Erlbaum Associates.
    • Werbos, P.J. 1997: Optimization: a foundation for understanding consciousness. In Levine, D.S. and Elsberry, W.R., editors, Optimality in biological and artificial networks? Lawrence Erlbaum Associates.
    • (1997) Optimality in biological and artificial networks?
    • Werbos, P.J.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.