2010, Pages 3060-3065

A model-free robust policy iteration algorithm for optimal control of nonlinear systems

Author keywords

[No Author keywords available]

Indexed keywords

CONTINUOUS TIME SYSTEMS; GRADIENT METHODS; NEURAL NETWORKS; NONLINEAR SYSTEMS; ONLINE SYSTEMS; OPTIMAL CONTROL SYSTEMS; OPTIMIZATION;

EID: 79953151751     PISSN: 07431546     EISSN: 25762370     Source Type: Conference Proceeding    
DOI: 10.1109/CDC.2010.5717295     Document Type: Conference Paper
Times cited: 23

References (34)
  • 1
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • R. Sutton, "Learning to predict by the methods of temporal differences," Mach. Learn., vol. 3, no. 1, pp. 9-44, 1988.
    • (1988) Mach. Learn. , vol.3 , Issue.1 , pp. 9-44
    • Sutton, R.1
  • 3
    • 0015667648
    • Punish/reward: Learning with a critic in adaptive threshold systems
    • B. Widrow, N. Gupta, and S. Maitra, "Punish/reward: Learning with a critic in adaptive threshold systems," IEEE Trans. Syst. Man Cybern., vol. 3, no. 5, pp. 455-465, 1973.
    • (1973) IEEE Trans. Syst. Man Cybern. , vol.3 , Issue.5 , pp. 455-465
    • Widrow, B.1    Gupta, N.2    Maitra, S.3
  • 4
    • 0020970738
    • Neuronlike adaptive elements that can solve difficult learning control problems
    • A. Barto, R. Sutton, and C. Anderson, "Neuronlike adaptive elements that can solve difficult learning control problems," IEEE Trans. Syst. Man Cybern., vol. 13, no. 5, pp. 834-846, 1983.
    • (1983) IEEE Trans. Syst. Man Cybern. , vol.13 , Issue.5 , pp. 834-846
    • Barto, A.1    Sutton, R.2    Anderson, C.3
  • 5
    • 0026852362
    • Reinforcement learning is direct adaptive optimal control
    • R. Sutton, A. Barto, and R. Williams, "Reinforcement learning is direct adaptive optimal control," IEEE Contr. Syst. Mag., vol. 12, no. 2, pp. 19-22, 1992.
    • (1992) IEEE Contr. Syst. Mag. , vol.12 , Issue.2 , pp. 19-22
    • Sutton, R.1    Barto, A.2    Williams, R.3
  • 7
    • 0002011091
    • A menu of designs for reinforcement learning over time
    • ser. MIT Press Series In Neural Network Modeling And Connectionism. Cambridge, MA, USA: MIT Press
    • P. Werbos, "A menu of designs for reinforcement learning over time," in Neural Networks for Control, ser. MIT Press Series In Neural Network Modeling And Connectionism. Cambridge, MA, USA: MIT Press, 1990, pp. 67-95.
    • (1990) Neural Networks for Control , pp. 67-95
    • Werbos, P.1
  • 9
    • 0031236002
    • Adaptive critic designs
    • Sep.
    • D. V. Prokhorov and D. C. Wunsch II, "Adaptive critic designs," IEEE Trans. Neural Networks, vol. 8, no. 5, pp. 997-1007, Sep. 1997.
    • (1997) IEEE Trans. Neural Networks , vol.8 , Issue.5 , pp. 997-1007
    • Prokhorov, D.V.1    Wunsch II, D.C.2
  • 11
    • 0030196717
    • Adaptive-critic-based neural networks for aircraft optimal control
    • S. Balakrishnan, "Adaptive-critic-based neural networks for aircraft optimal control," J. Guid. Contr. Dynam., vol. 19, no. 4, pp. 893-898, 1996.
    • (1996) J. Guid. Contr. Dynam. , vol.19 , Issue.4 , pp. 893-898
    • Balakrishnan, S.1
  • 12
    • 0033685661
    • Adaptive critic design for intelligent steering and speed control of a 2-axle vehicle
    • G. Lendaris, L. Schultz, and T. Shannon, "Adaptive critic design for intelligent steering and speed control of a 2-axle vehicle," in Int. Joint Conf. Neural Netw., 2000, pp. 73-78.
    • (2000) Int. Joint Conf. Neural Netw. , pp. 73-78
    • Lendaris, G.1    Schultz, L.2    Shannon, T.3
  • 14
    • 0036641793
    • State-constrained agile missile control with adaptive-critic-based neural networks
    • D. Han and S. Balakrishnan, "State-constrained agile missile control with adaptive-critic-based neural networks," IEEE Trans. Control Syst. Technol., vol. 10, no. 4, pp. 481-489, 2002.
    • (2002) IEEE Trans. Control Syst. Technol. , vol.10 , Issue.4 , pp. 481-489
    • Han, D.1    Balakrishnan, S.2
  • 15
    • 34047138362
    • Reinforcement learning neural-network-based controller for nonlinear discrete-time systems with input constraints
    • P. He and S. Jagannathan, "Reinforcement learning neural-network-based controller for nonlinear discrete-time systems with input constraints," IEEE Trans. Syst. Man Cybern. Part B Cybern., vol. 37, no. 2, pp. 425-436, 2007.
    • (2007) IEEE Trans. Syst. Man Cybern. Part B Cybern. , vol.37 , Issue.2 , pp. 425-436
    • He, P.1    Jagannathan, S.2
  • 16
    • 49049089962
    • Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof
    • Aug.
    • A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof," IEEE Trans. Syst. Man Cybern. Part B Cybern., vol. 38, no. 4, pp. 943-949, Aug. 2008.
    • (2008) IEEE Trans. Syst. Man Cybern. Part B Cybern. , vol.38 , Issue.4 , pp. 943-949
    • Al-Tamimi, A.1    Lewis, F.L.2    Abu-Khalaf, M.3
  • 17
    • 0004370245
    • Wright Lab, Wright-Patterson Air Force Base, OH, Tech. Rep.
    • L. Baird, "Advantage updating," Wright Lab, Wright-Patterson Air Force Base, OH, Tech. Rep., 1993.
    • (1993) Advantage Updating
    • Baird, L.1
  • 18
    • 0033629916
    • Reinforcement learning in continuous time and space
    • K. Doya, "Reinforcement learning in continuous time and space," Neural Comput., vol. 12, no. 1, pp. 219-245, 2000.
    • (2000) Neural Comput. , vol.12 , Issue.1 , pp. 219-245
    • Doya, K.1
  • 20
    • 0031332446
    • Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation
    • R. Beard, G. Saridis, and J. Wen, "Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation," Automatica, vol. 33, pp. 2159-2178, 1997.
    • (1997) Automatica , vol.33 , pp. 2159-2178
    • Beard, R.1    Saridis, G.2    Wen, J.3
  • 21
    • 14844340822
    • Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
    • M. Abu-Khalaf and F. Lewis, "Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach," Automatica, vol. 41, no. 5, pp. 779-791, 2005.
    • (2005) Automatica , vol.41 , Issue.5 , pp. 779-791
    • Abu-Khalaf, M.1    Lewis, F.2
  • 23
    • 67349145396
    • Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems
    • D. Vrabie and F. Lewis, "Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems," Neural Networks, vol. 22, no. 3, pp. 237-246, 2009.
    • (2009) Neural Networks , vol.22 , Issue.3 , pp. 237-246
    • Vrabie, D.1    Lewis, F.2
  • 24
  • 25
    • 0002031779
    • Approximate dynamic programming for real-time control and neural modeling
    • D. A. White and D. A. Sofge, Eds. New York: Van Nostrand Reinhold
    • P. Werbos, "Approximate dynamic programming for real-time control and neural modeling," in Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, D. A. White and D. A. Sofge, Eds. New York: Van Nostrand Reinhold, 1992.
    • (1992) Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches
    • Werbos, P.1
  • 26
    • 0004469897
    • Neurons with graded response have collective computational properties like those of two-state neurons
    • J. Hopfield, "Neurons with graded response have collective computational properties like those of two-state neurons," Proc. Nat. Acad. Sci. U.S.A., vol. 81, no. 10, p. 3088, 1984.
    • (1984) Proc. Nat. Acad. Sci. U.S.A. , vol.81 , Issue.10 , p. 3088
    • Hopfield, J.1
  • 28
    • 0003581164
    • Identification and control of nonlinear systems using neural network models: Design and stability analysis
    • University of Southern California
    • M. Polycarpou and P. Ioannou, "Identification and control of nonlinear systems using neural network models: Design and stability analysis," Systems Report 91-09-01, University of Southern California, 1991.
    • (1991) Systems Report 91-09-01
    • Polycarpou, M.1    Ioannou, P.2
  • 29
    • 0024861871
    • Approximation by superpositions of a sigmoidal function
    • G. Cybenko, "Approximation by superpositions of a sigmoidal function," Math. Control Signals Syst., vol. 2, pp. 303-314, 1989.
    • (1989) Math. Control Signals Syst. , vol.2 , pp. 303-314
    • Cybenko, G.1
  • 30
    • 0000466705
    • Nonlinear network structures for feedback control
    • F. L. Lewis, "Nonlinear network structures for feedback control," Asian J. Control, vol. 1, no. 4, pp. 205-228, 1999.
    • (1999) Asian J. Control , vol.1 , Issue.4 , pp. 205-228
    • Lewis, F.L.1
  • 34
    • 0004178386
    • 3rd ed. Upper Saddle River, NJ: Prentice Hall
    • H. K. Khalil, Nonlinear Systems, 3rd ed. Upper Saddle River, NJ: Prentice Hall, 2002.
    • (2002) Nonlinear Systems
    • Khalil, H.K.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.