



Volume 55, 2014, Pages 30-41

Discrete-time online learning control for a class of unknown nonaffine nonlinear systems using reinforcement learning

Author keywords

Adaptive critic design; Neural network; Nonaffine nonlinear system; Online learning; Reinforcement learning

Indexed keywords

NEURAL NETWORKS; NONLINEAR SYSTEMS; REINFORCEMENT LEARNING; TELECOMMUNICATION NETWORKS;

EID: 84897950099     PISSN: 08936080     EISSN: 18792782     Source Type: Journal    
DOI: 10.1016/j.neunet.2014.03.008     Document Type: Article
Times cited: 58

References (52)
  • 3
    • 85012688561
    • Princeton University Press, Princeton, New Jersey
    • Bellman R.E. Dynamic programming 1957, Princeton University Press, Princeton, New Jersey.
    • (1957) Dynamic programming
    • Bellman, R.E.1
  • 5
    • 84871319455
    • A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems
    • Bhasin S., Kamalapurkar R., Johnson M., Vamvoudakis K.G., Lewis F.L., Dixon W.E. A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems. Automatica 2013, 49(1):82-92.
    • (2013) Automatica , vol.49 , Issue.1 , pp. 82-92
    • Bhasin, S.1    Kamalapurkar, R.2    Johnson, M.3    Vamvoudakis, K.G.4    Lewis, F.L.5    Dixon, W.E.6
  • 6
    • 84870272327
    • Output feedback direct adaptive neural network control for uncertain SISO nonlinear systems using a fuzzy estimator of the control error
    • Chemachema M. Output feedback direct adaptive neural network control for uncertain SISO nonlinear systems using a fuzzy estimator of the control error. Neural Networks 2012, 36:25-34.
    • (2012) Neural Networks , vol.36 , pp. 25-34
    • Chemachema, M.1
  • 7
    • 0029308580
    • Adaptive control of a class of nonlinear discrete-time systems using neural networks
    • Chen F.C., Khalil H.K. Adaptive control of a class of nonlinear discrete-time systems using neural networks. IEEE Transactions on Automatic Control 1995, 40(5):791-801.
    • (1995) IEEE Transactions on Automatic Control , vol.40 , Issue.5 , pp. 791-801
    • Chen, F.C.1    Khalil, H.K.2
  • 8
    • 52149111148
    • Feedback-linearization-based neural adaptive control for unknown nonaffine nonlinear discrete-time systems
    • Deng H., Li H.X., Wu Y.H. Feedback-linearization-based neural adaptive control for unknown nonaffine nonlinear discrete-time systems. IEEE Transactions on Neural Networks 2008, 19(9):1615-1625.
    • (2008) IEEE Transactions on Neural Networks , vol.19 , Issue.9 , pp. 1615-1625
    • Deng, H.1    Li, H.X.2    Wu, Y.H.3
  • 13
    • 0036858905
    • Adaptive output feedback control of uncertain nonlinear systems using single-hidden-layer neural networks
    • Hovakimyan N., Nardi F., Calise A., Kim N. Adaptive output feedback control of uncertain nonlinear systems using single-hidden-layer neural networks. IEEE Transactions on Neural Networks 2002, 13(6):1420-1431.
    • (2002) IEEE Transactions on Neural Networks , vol.13 , Issue.6 , pp. 1420-1431
    • Hovakimyan, N.1    Nardi, F.2    Calise, A.3    Kim, N.4
  • 14
    • 0029403793
    • Stochastic choice of basis functions in adaptive function approximation and the function-link net
    • Igelnik B., Pao Y.H. Stochastic choice of basis functions in adaptive function approximation and the function-link net. IEEE Transactions on Neural Networks 1995, 6(6):1320-1329.
    • (1995) IEEE Transactions on Neural Networks , vol.6 , Issue.6 , pp. 1320-1329
    • Igelnik, B.1    Pao, Y.H.2
  • 17
    • 79551685808
    • Reinforcement learning for partially observable dynamic processes: adaptive dynamic programming using measured output data
    • Lewis F.L., Vamvoudakis K.G. Reinforcement learning for partially observable dynamic processes: adaptive dynamic programming using measured output data. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 2011, 41(1):14-25.
    • (2011) IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics , vol.41 , Issue.1 , pp. 14-25
    • Lewis, F.L.1    Vamvoudakis, K.G.2
  • 18
    • 84883537695
    • Reinforcement learning and feedback control: using natural decision methods to design optimal adaptive controllers
    • Lewis F.L., Vrabie D., Vamvoudakis K.G. Reinforcement learning and feedback control: using natural decision methods to design optimal adaptive controllers. IEEE Control Systems 2012, 32(6):76-105.
    • (2012) IEEE Control Systems , vol.32 , Issue.6 , pp. 76-105
    • Lewis, F.L.1    Vrabie, D.2    Vamvoudakis, K.G.3
  • 19
    • 0030108041
    • Multilayer neural-net robot controller with guaranteed tracking performance
    • Lewis F.L., Yesildirek A., Liu K. Multilayer neural-net robot controller with guaranteed tracking performance. IEEE Transactions on Neural Networks 1996, 7(2):388-399.
    • (1996) IEEE Transactions on Neural Networks , vol.7 , Issue.2 , pp. 388-399
    • Lewis, F.L.1    Yesildirek, A.2    Liu, K.3
  • 20
    • 0038355072
    • Design of an adaptive neural network based power system stabilizer
    • Liu W., Venayagamoorthy G.K., Wunsch D.C. Design of an adaptive neural network based power system stabilizer. Neural Networks 2003, 16(5-6):891-898.
    • (2003) Neural Networks , vol.16 , Issue.5-6 , pp. 891-898
    • Liu, W.1    Venayagamoorthy, G.K.2    Wunsch, D.C.3
  • 21
    • 84868467610
    • An iterative adaptive dynamic programming algorithm for optimal control of unknown discrete-time nonlinear systems with constrained inputs
    • Liu D., Wang D., Yang X. An iterative adaptive dynamic programming algorithm for optimal control of unknown discrete-time nonlinear systems with constrained inputs. Information Sciences 2013, 220(20):331-342.
    • (2013) Information Sciences , vol.220 , Issue.20 , pp. 331-342
    • Liu, D.1    Wang, D.2    Yang, X.3
  • 22
    • 84881555023
    • Finite-approximation-error-based optimal control approach for discrete-time nonlinear systems
    • Liu D., Wei Q. Finite-approximation-error-based optimal control approach for discrete-time nonlinear systems. IEEE Transactions on Cybernetics 2013, 43(2):779-789.
    • (2013) IEEE Transactions on Cybernetics , vol.43 , Issue.2 , pp. 779-789
    • Liu, D.1    Wei, Q.2
  • 23
    • 84887472008
    • Adaptive optimal control for a class of continuous-time affine nonlinear systems with unknown internal dynamics
    • Liu D., Yang X., Li H. Adaptive optimal control for a class of continuous-time affine nonlinear systems with unknown internal dynamics. Neural Computing and Applications 2013, 23(7-8):1843-1850.
    • (2013) Neural Computing and Applications , vol.23 , Issue.7-8 , pp. 1843-1850
    • Liu, D.1    Yang, X.2    Li, H.3
  • 24
    • 26844483839
    • A self-learning call admission control scheme for CDMA cellular networks
    • Liu D., Zhang Y., Zhang H. A self-learning call admission control scheme for CDMA cellular networks. IEEE Transactions on Neural Networks 2005, 16(5):1219-1228.
    • (2005) IEEE Transactions on Neural Networks , vol.16 , Issue.5 , pp. 1219-1228
    • Liu, D.1    Zhang, Y.2    Zhang, H.3
  • 26
    • 8444251345
    • Feedback error learning and nonlinear adaptive control
    • Nakanishi J., Schaal S. Feedback error learning and nonlinear adaptive control. Neural Networks 2004, 17(10):1453-1465.
    • (2004) Neural Networks , vol.17 , Issue.10 , pp. 1453-1465
    • Nakanishi, J.1    Schaal, S.2
  • 27
    • 0028137961
    • Adaptive control of nonlinear multivariable systems using neural networks
    • Narendra K.S., Mukhopadhyay S. Adaptive control of nonlinear multivariable systems using neural networks. Neural Networks 1994, 7(5):737-752.
    • (1994) Neural Networks , vol.7 , Issue.5 , pp. 737-752
    • Narendra, K.S.1    Mukhopadhyay, S.2
  • 28
    • 0031672251
    • A direct adaptive neural-network control for unknown nonlinear systems and its applications
    • Noriega J.R., Wang H. A direct adaptive neural-network control for unknown nonlinear systems and its applications. IEEE Transactions on Neural Networks 1998, 9(1):27-34.
    • (1998) IEEE Transactions on Neural Networks , vol.9 , Issue.1 , pp. 27-34
    • Noriega, J.R.1    Wang, H.2
  • 29
    • 33751238181
    • A single network adaptive critic (SNAC) architecture for optimal control synthesis for a class of nonlinear systems
    • Padhi R., Unnikrishnan N., Wang X., Balakrishnan S.N. A single network adaptive critic (SNAC) architecture for optimal control synthesis for a class of nonlinear systems. Neural Networks 2006, 19(10):1648-1660.
    • (2006) Neural Networks , vol.19 , Issue.10 , pp. 1648-1660
    • Padhi, R.1    Unnikrishnan, N.2    Wang, X.3    Balakrishnan, S.N.4
  • 30
    • 15344350446
    • Direct adaptive controller for nonaffine nonlinear systems using self-structuring neural networks
    • Park J.H., Huh S.H., Kim S.H., Seo S.J., Park G.T. Direct adaptive controller for nonaffine nonlinear systems using self-structuring neural networks. IEEE Transactions on Neural Networks 2005, 16(2):414-422.
    • (2005) IEEE Transactions on Neural Networks , vol.16 , Issue.2 , pp. 414-422
    • Park, J.H.1    Huh, S.H.2    Kim, S.H.3    Seo, S.J.4    Park, G.T.5
  • 32
    • 0004057553
    • McGraw-Hill, Inc., Singapore
    • Rudin W. Functional analysis 1991, McGraw-Hill, Inc., Singapore. 2nd ed.
    • (1991) Functional analysis
    • Rudin, W.1
  • 33
    • 0035273403
    • On-line learning control by association and reinforcement
    • Si J., Wang Y.T. On-line learning control by association and reinforcement. IEEE Transactions on Neural Networks 2001, 12(2):264-276.
    • (2001) IEEE Transactions on Neural Networks , vol.12 , Issue.2 , pp. 264-276
    • Si, J.1    Wang, Y.T.2
  • 35
    • 77950630017
    • Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
    • Vamvoudakis K.G., Lewis F.L. Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica 2010, 46(5):878-888.
    • (2010) Automatica , vol.46 , Issue.5 , pp. 878-888
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 36
    • 79960897012
    • Multi-player non-zero-sum games: online adaptive learning solution of coupled Hamilton-Jacobi equations
    • Vamvoudakis K.G., Lewis F.L. Multi-player non-zero-sum games: online adaptive learning solution of coupled Hamilton-Jacobi equations. Automatica 2011, 47(8):1556-1569.
    • (2011) Automatica , vol.47 , Issue.8 , pp. 1556-1569
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 37
    • 82755160758
    • Finite-horizon neuro-optimal tracking control for a class of discrete-time nonlinear systems using adaptive dynamic programming approach
    • Wang D., Liu D., Wei Q. Finite-horizon neuro-optimal tracking control for a class of discrete-time nonlinear systems using adaptive dynamic programming approach. Neurocomputing 2012, 78(1):14-22.
    • (2012) Neurocomputing , vol.78 , Issue.1 , pp. 14-22
    • Wang, D.1    Liu, D.2    Wei, Q.3
  • 38
    • 84864489666
    • Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming
    • Wang D., Liu D., Wei Q., Zhao D., Jin N. Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming. Automatica 2012, 48(8):1825-1832.
    • (2012) Automatica , vol.48 , Issue.8 , pp. 1825-1832
    • Wang, D.1    Liu, D.2    Wei, Q.3    Zhao, D.4    Jin, N.5
  • 40
    • 84862811062
    • An iterative ε-optimal control scheme for a class of discrete-time nonlinear systems with unfixed initial state
    • Wei Q., Liu D. An iterative ε-optimal control scheme for a class of discrete-time nonlinear systems with unfixed initial state. Neural Networks 2012, 32:236-244.
    • (2012) Neural Networks , vol.32 , pp. 236-244
    • Wei, Q.1    Liu, D.2
  • 41
    • 0002011091
    • A menu of designs for reinforcement learning over time
    • MIT Press, Cambridge, MA, W.T. Miller, R.S. Sutton, P.J. Werbos (Eds.)
    • Werbos P.J. A menu of designs for reinforcement learning over time. Neural networks for control 1991, 67-95. MIT Press, Cambridge, MA. W.T. Miller, R.S. Sutton, P.J. Werbos (Eds.).
    • (1991) Neural networks for control , pp. 67-95
    • Werbos, P.J.1
  • 42
    • 0002031779
    • Approximate dynamic programming for real-time control and neural modeling
    • Van Nostrand Reinhold, New York, D.A. White, D.A. Sofge (Eds.)
    • Werbos P.J. Approximate dynamic programming for real-time control and neural modeling. Handbook of intelligent control 1992, 493-525. Van Nostrand Reinhold, New York. D.A. White, D.A. Sofge (Eds.).
    • (1992) Handbook of intelligent control , pp. 493-525
    • Werbos, P.J.1
  • 43
    • 34548766755
    • Using ADP to understand and replicate brain intelligence: the next level design
    • In Proceedings of the IEEE symposium on approximate dynamic programming and reinforcement learning. Honolulu, HI, April.
    • Werbos, P.J. (2007). Using ADP to understand and replicate brain intelligence: the next level design. In Proceedings of the IEEE symposium on approximate dynamic programming and reinforcement learning (pp. 209-216). Honolulu, HI, April.
    • (2007) , pp. 209-216
    • Werbos, P.J.1
  • 44
    • 49049091767
    • ADP: the key direction for future research in intelligent control and understanding brain intelligence
    • Werbos P.J. ADP: the key direction for future research in intelligent control and understanding brain intelligence. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 2008, 38(4):898-900.
    • (2008) IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics , vol.38 , Issue.4 , pp. 898-900
    • Werbos, P.J.1
  • 45
  • 46
    • 84887035183
    • Neural-network-based online optimal control for uncertain non-linear continuous-time systems with control constraints
    • Yang X., Liu D., Huang Y. Neural-network-based online optimal control for uncertain non-linear continuous-time systems with control constraints. IET Control Theory & Applications 2013, 7(17):2037-2047.
    • (2013) IET Control Theory & Applications , vol.7 , Issue.17 , pp. 2037-2047
    • Yang, X.1    Liu, D.2    Huang, Y.3
  • 48
    • 49049091364
    • Control of nonaffine nonlinear discrete-time systems using reinforcement-learning-based linearly parameterized neural networks
    • Yang Q., Vance J.B., Jagannathan S. Control of nonaffine nonlinear discrete-time systems using reinforcement-learning-based linearly parameterized neural networks. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 2008, 38(4):994-1001.
    • (2008) IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics , vol.38 , Issue.4 , pp. 994-1001
    • Yang, Q.1    Vance, J.B.2    Jagannathan, S.3
  • 51
    • 0036060726
    • Direct RBF neural network control of a class of discrete-time nonaffine nonlinear systems
    • In Proceedings of the American control conference. Anchorage, AK, May.
    • Zhang, J., Ge, S.S., & Lee, T.H. (2002). Direct RBF neural network control of a class of discrete-time nonaffine nonlinear systems. In Proceedings of the American control conference (pp. 424-429). Anchorage, AK, May.
    • (2002) , pp. 424-429
    • Zhang, J.1    Ge, S.S.2    Lee, T.H.3
  • 52
    • 78650805234
    • An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games
    • Zhang H., Wei Q., Liu D. An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games. Automatica 2011, 47(1):207-214.
    • (2011) Automatica , vol.47 , Issue.1 , pp. 207-214
    • Zhang, H.1    Wei, Q.2    Liu, D.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.