



Volume 20, Issue 2, 2016, Pages 697-706

Neuro-optimal tracking control for a class of discrete-time nonlinear systems via generalized value iteration adaptive dynamic programming approach

Author keywords

Adaptive critic designs; Adaptive dynamic programming; Approximate dynamic programming; Neural networks; Nonlinear systems; Optimal control; Reinforcement learning

Indexed keywords

ADAPTIVE CONTROL SYSTEMS; ALGORITHMS; DISCRETE TIME CONTROL SYSTEMS; ITERATIVE METHODS; NAVIGATION; NEURAL NETWORKS; NONLINEAR SYSTEMS; REINFORCEMENT LEARNING;

EID: 84955703536     PISSN: 14327643     EISSN: 14337479     Source Type: Journal    
DOI: 10.1007/s00500-014-1533-0     Document Type: Article
Times cited: (32)

References (53)
  • 1
    • 14844340822
    • Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
    • Abu-Khalaf M, Lewis FL (2005) Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica 41(5):779–791
    • (2005) Automatica , vol.41 , Issue.5 , pp. 779-791
    • Abu-Khalaf, M.1    Lewis, F.L.2
  • 2
    • 33847648898
    • Adaptive critic designs for discrete-time zero-sum games with application to $H_\infty$ control
    • Al-Tamimi A, Abu-Khalaf M, Lewis FL (2007) Adaptive critic designs for discrete-time zero-sum games with application to $H_\infty$ control. IEEE Trans Syst Man Cybern Part B: Cybern 37(1):240–247
    • (2007) IEEE Trans Syst Man Cybern Part B: Cybern , vol.37 , Issue.1 , pp. 240-247
    • Al-Tamimi, A.1    Abu-Khalaf, M.2    Lewis, F.L.3
  • 3
    • 49049089962
    • Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof
    • Al-Tamimi A, Lewis FL, Abu-Khalaf M (2008) Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof. IEEE Trans Syst Man Cybern Part B: Cybern 38(4):943–949
    • (2008) IEEE Trans Syst Man Cybern Part B: Cybern , vol.38 , Issue.4 , pp. 943-949
    • Al-Tamimi, A.1    Lewis, F.L.2    Abu-Khalaf, M.3
  • 4
    • 84871319455
    • A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems
    • Bhasin S, Kamalapurkar R, Johnson M, Vamvoudakis KG, Lewis FL, Dixon WE (2013) A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems. Automatica 49(1):82–92
    • (2013) Automatica , vol.49 , Issue.1 , pp. 82-92
    • Bhasin, S.1    Kamalapurkar, R.2    Johnson, M.3    Vamvoudakis, K.G.4    Lewis, F.L.5    Dixon, W.E.6
  • 5
    • 85012688561
    • Princeton University Press, Princeton
    • Bellman RE (1957) Dynamic programming. Princeton University Press, Princeton
    • (1957) Dynamic programming
    • Bellman, R.E.1
  • 8
    • 84901054552
    • Utilizing time-linkage property in DOPs: an information sharing based artificial bee colony algorithm for tracking multiple optima in uncertain environments
    • Biswas S, Das S, Kundu S, Patra GR (2014) Utilizing time-linkage property in DOPs: an information sharing based artificial bee colony algorithm for tracking multiple optima in uncertain environments. Soft Comput 18(6):1199–1212
    • (2014) Soft Comput , vol.18 , Issue.6 , pp. 1199-1212
    • Biswas, S.1    Das, S.2    Kundu, S.3    Patra, G.R.4
  • 9
    • 84871294033
    • On functional equations for $K$th best policies in Markov decision processes
    • Chang HS (2013) On functional equations for $K$th best policies in Markov decision processes. Automatica 49(1):297–300
    • (2013) Automatica , vol.49 , Issue.1 , pp. 297-300
    • Chang, H.S.1
  • 10
    • 0043026775
    • Helicopter trimming and tracking control using direct neural dynamic programming
    • Enns R, Si J (2003) Helicopter trimming and tracking control using direct neural dynamic programming. IEEE Trans Neural Netw 14(8):929–939
    • (2003) IEEE Trans Neural Netw , vol.14 , Issue.8 , pp. 929-939
    • Enns, R.1    Si, J.2
  • 11
    • 84925291138
    • Fortier N, Sheppard J, Strasser S (2014) Abductive inference in Bayesian networks using distributed overlapping swarm intelligence. Soft Comput (in press). doi:10.1007/s00500-014-1310-0
  • 12
    • 84880065287
    • Finite-horizon control-constrained nonlinear optimal control using single network adaptive critics
    • Heydari A, Balakrishnan SN (2013) Finite-horizon control-constrained nonlinear optimal control using single network adaptive critics. IEEE Trans Neural Netw Learn Syst 24(1):145–157
    • (2013) IEEE Trans Neural Netw Learn Syst , vol.24 , Issue.1 , pp. 145-157
    • Heydari, A.1    Balakrishnan, S.N.2
  • 13
    • 84872032832
    • An algorithm for robust explicit/multi-parametric model predictive control
    • Kouramas KI, Panos C, Faisca NP, Pistikopoulos EN (2013) An algorithm for robust explicit/multi-parametric model predictive control. Automatica 49(2):381–389
    • (2013) Automatica , vol.49 , Issue.2 , pp. 381-389
    • Kouramas, K.I.1    Panos, C.2    Faisca, N.P.3    Pistikopoulos, E.N.4
  • 14
    • 84922835782
    • Kundu S, Das S, Vasilakos AV, Biswas S (2014) A modified differential evolution-based combined routing and sleep scheduling scheme for lifetime maximization of wireless sensor networks. Soft Comput (in press). doi:10.1007/s00500-014-1286-9
  • 15
    • 84883537695
    • Reinforcement learning and feedback control: using natural decision methods to design optimal adaptive controllers
    • Lewis FL, Vrabie D, Vamvoudakis KG (2012) Reinforcement learning and feedback control: using natural decision methods to design optimal adaptive controllers. IEEE Control Syst 32(6):76–105
    • (2012) IEEE Control Syst , vol.32 , Issue.6 , pp. 76-105
    • Lewis, F.L.1    Vrabie, D.2    Vamvoudakis, K.G.3
  • 16
  • 18
    • 84881555023
    • Finite-approximation-error-based optimal control approach for discrete-time nonlinear systems
    • Liu D, Wei Q (2013) Finite-approximation-error-based optimal control approach for discrete-time nonlinear systems. IEEE Trans Cybern 43(2):779–789
    • (2013) IEEE Trans Cybern , vol.43 , Issue.2 , pp. 779-789
    • Liu, D.1    Wei, Q.2
  • 19
    • 84899122972
    • Multi-person zero-sum differential games for a class of uncertain nonlinear systems
    • Liu D, Wei Q (2014a) Multi-person zero-sum differential games for a class of uncertain nonlinear systems. Int J Adaptive Control Signal Process 28(3–5):205–231
    • (2014) Int J Adaptive Control Signal Process , vol.28 , Issue.3-5 , pp. 205-231
    • Liu, D.1    Wei, Q.2
  • 20
    • 84897594646
    • Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems
    • Liu D, Wei Q (2014b) Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems. IEEE Trans Neural Netw Learn Syst 25(3):621–634
    • (2014) IEEE Trans Neural Netw Learn Syst , vol.25 , Issue.3 , pp. 621-634
    • Liu, D.1    Wei, Q.2
  • 21
    • 26844483839
    • A self-learning call admission control scheme for CDMA cellular networks
    • Liu D, Zhang Y, Zhang H (2005) A self-learning call admission control scheme for CDMA cellular networks. IEEE Trans Neural Netw 16(5):1219–1228
    • (2005) IEEE Trans Neural Netw , vol.16 , Issue.5 , pp. 1219-1228
    • Liu, D.1    Zhang, Y.2    Zhang, H.3
  • 22
    • 0019625194
    • Optimal control of a class of nonlinear stochastic systems
    • Mohler RR, Kolodziej WJ (1981) Optimal control of a class of nonlinear stochastic systems. IEEE Trans Autom Control 26(5):1048–1054
    • (1981) IEEE Trans Autom Control , vol.26 , Issue.5 , pp. 1048-1054
    • Mohler, R.R.1    Kolodziej, W.J.2
  • 24
    • 84885936244
    • Heuristic dynamic programming with internal goal representation
    • Ni Z, He H (2013) Heuristic dynamic programming with internal goal representation. Soft Comput 17(11):2101–2108
    • (2013) Soft Comput , vol.17 , Issue.11 , pp. 2101-2108
    • Ni, Z.1    He, H.2
  • 27
    • 84947093016
    • Rubio JDJ (2014) Adaptive least square control in discrete time of robotic arms. Soft Comput (in press). doi:10.1007/s00500-014-1300-2
  • 28
    • 0015039815
    • System equivalence in a class of nonlinear optimal control problems
    • Rugh WJ (1971) System equivalence in a class of nonlinear optimal control problems. IEEE Trans Autom Control 16(2):189–194
    • (1971) IEEE Trans Autom Control , vol.16 , Issue.2 , pp. 189-194
    • Rugh, W.J.1
  • 29
    • 0035273403
    • On-line learning control by association and reinforcement
    • Si J, Wang YT (2001) On-line learning control by association and reinforcement. IEEE Trans Neural Netw 12(2):264–276
    • (2001) IEEE Trans Neural Netw , vol.12 , Issue.2 , pp. 264-276
    • Si, J.1    Wang, Y.T.2
  • 30
    • 84885923700
    • Multi-objective optimal control for a class of nonlinear time-delay systems via adaptive dynamic programming
    • Song R, Xiao W, Wei Q (2013) Multi-objective optimal control for a class of nonlinear time-delay systems via adaptive dynamic programming. Soft Comput 17(11):2109–2115
    • (2013) Soft Comput , vol.17 , Issue.11 , pp. 2109-2115
    • Song, R.1    Xiao, W.2    Wei, Q.3
  • 31
    • 84905102927
    • Neural-network-based approach to finite-time optimal control for a class of unknown nonlinear systems
    • Song R, Xiao W, Wei Q, Sun C (2014) Neural-network-based approach to finite-time optimal control for a class of unknown nonlinear systems. Soft Comput 18(8):1645–1653
    • (2014) Soft Comput , vol.18 , Issue.8 , pp. 1645-1653
    • Song, R.1    Xiao, W.2    Wei, Q.3    Sun, C.4
  • 33
    • 66449130966
    • Adaptive dynamic programming: an introduction
    • Wang F, Zhang H, Liu D (2009) Adaptive dynamic programming: an introduction. IEEE Comput Intell Mag 4(2):39–47
    • (2009) IEEE Comput Intell Mag , vol.4 , Issue.2 , pp. 39-47
    • Wang, F.1    Zhang, H.2    Liu, D.3
  • 34
    • 78651311269
    • Adaptive dynamic programming for finite-horizon optimal control of discrete-time nonlinear systems with $\epsilon$-error bound
    • Wang F, Jin N, Liu D, Wei Q (2011) Adaptive dynamic programming for finite-horizon optimal control of discrete-time nonlinear systems with $\epsilon$-error bound. IEEE Trans Neural Netw 22(1):24–36
    • (2011) IEEE Trans Neural Netw , vol.22 , Issue.1 , pp. 24-36
    • Wang, F.1    Jin, N.2    Liu, D.3    Wei, Q.4
  • 35
    • 84862811062
    • An iterative $\epsilon$-optimal control scheme for a class of discrete-time nonlinear systems with unfixed initial state
    • Wei Q, Liu D (2012) An iterative $\epsilon$-optimal control scheme for a class of discrete-time nonlinear systems with unfixed initial state. Neural Netw 32:236–244
    • (2012) Neural Netw , vol.32 , pp. 236-244
    • Wei, Q.1    Liu, D.2
  • 36
    • 84883327795
    • Numerical adaptive learning control scheme for discrete-time nonlinear systems
    • Wei Q, Liu D (2013) Numerical adaptive learning control scheme for discrete-time nonlinear systems. IET Control Theory Appl 7(11):1472–1486
    • (2013) IET Control Theory Appl , vol.7 , Issue.11 , pp. 1472-1486
    • Wei, Q.1    Liu, D.2
  • 37
    • 84887490966
    • Dual iterative adaptive dynamic programming for a class of discrete-time nonlinear systems with time-delays
    • Wei Q, Wang D, Zhang D (2013) Dual iterative adaptive dynamic programming for a class of discrete-time nonlinear systems with time-delays. Neural Comput Appl 23(7–8):1851–1863
    • (2013) Neural Comput Appl , vol.23 , Issue.7-8 , pp. 1851-1863
    • Wei, Q.1    Wang, D.2    Zhang, D.3
  • 38
    • 84906778934
    • Adaptive dynamic programming for optimal tracking control of unknown nonlinear systems with application to coal gasification
    • Wei Q, Liu D (2014a) Adaptive dynamic programming for optimal tracking control of unknown nonlinear systems with application to coal gasification. IEEE Trans Autom Sci Eng 11(4):1020–1036
    • (2014) IEEE Trans Autom Sci Eng , vol.11 , Issue.4 , pp. 1020-1036
    • Wei, Q.1    Liu, D.2
  • 39
    • 84908658175
    • A novel iterative $\theta$-adaptive dynamic programming for discrete-time nonlinear systems
    • Wei Q, Liu D (2014b) A novel iterative $\theta$-adaptive dynamic programming for discrete-time nonlinear systems. IEEE Trans Autom Sci Eng 11(4):1176–1190
    • (2014) IEEE Trans Autom Sci Eng , vol.11 , Issue.4 , pp. 1176-1190
    • Wei, Q.1    Liu, D.2
  • 40
    • 84902352795
    • Data-driven neuro-optimal temperature control of water gas shift reaction using stable iterative adaptive dynamic programming
    • Wei Q, Liu D (2014c) Data-driven neuro-optimal temperature control of water gas shift reaction using stable iterative adaptive dynamic programming. IEEE Trans Ind Electron 61(11):6399–6408
    • (2014) IEEE Trans Ind Electron , vol.61 , Issue.11 , pp. 6399-6408
    • Wei, Q.1    Liu, D.2
  • 41
    • 84898013913
    • Stable iterative adaptive dynamic programming algorithm with approximation errors for discrete-time nonlinear systems
    • Wei Q, Liu D (2014d) Stable iterative adaptive dynamic programming algorithm with approximation errors for discrete-time nonlinear systems. Neural Comput Appl 24(6):1355–1367
    • (2014) Neural Comput Appl , vol.24 , Issue.6 , pp. 1355-1367
    • Wei, Q.1    Liu, D.2
  • 42
    • 84924872284
    • Wei Q, Liu D, Shi G (2014) A novel dual iterative Q-learning method for optimal battery management in smart residential environments. IEEE Trans Ind Electron (in press). doi:10.1109/TIE.2014.2361485
  • 43
    • 84912122528
    • Wei Q, Wang F, Liu D, Yang X (2014) Finite-approximation-error based discrete-time iterative adaptive dynamic programming. IEEE Trans Cybern (in press). doi:10.1109/TCYB.2014.2354377
  • 44
    • 61849184281
    • Model-free multiobjective approximate dynamic programming for discrete-time nonlinear systems with general performance index functions
    • Wei Q, Zhang H, Dai J (2009) Model-free multiobjective approximate dynamic programming for discrete-time nonlinear systems with general performance index functions. Neurocomputing 72(7–9):1839–1848
    • (2009) Neurocomputing , vol.72 , Issue.7-9 , pp. 1839-1848
    • Wei, Q.1    Zhang, H.2    Dai, J.3
  • 45
    • 0002557583
    • Advanced forecasting methods for global crisis warning and models of intelligence
    • Werbos PJ (1977) Advanced forecasting methods for global crisis warning and models of intelligence. General Syst Yearb 22:25–38
    • (1977) General Syst Yearb , vol.22 , pp. 25-38
    • Werbos, P.J.1
  • 46
    • 0002011091
    • A menu of designs for reinforcement learning over time
    • Miller WT, Sutton RS, Werbos PJ, (eds), MIT Press, Cambridge
    • Werbos PJ (1991) A menu of designs for reinforcement learning over time. In: Miller WT, Sutton RS, Werbos PJ (eds) Neural Netw Control. MIT Press, Cambridge
    • (1991) Neural Netw Control
    • Werbos, P.J.1
  • 47
    • 0002031779
    • Approximate dynamic programming for real-time control and neural modeling
    • White DA, Sofge DA, (eds), Van Nostrand Reinhold, New York
    • Werbos PJ (1992) Approximate dynamic programming for real-time control and neural modeling. In: White DA, Sofge DA (eds) Handbook of intelligent control: neural, fuzzy, and adaptive approaches. Van Nostrand Reinhold, New York
    • (1992) Handbook of intelligent control: neural, fuzzy, and adaptive approaches
    • Werbos, P.J.1
  • 48
    • 84884958993
    • Stochastic optimal controller design for uncertain nonlinear networked control system via neuro dynamic programming
    • Xu H, Jagannathan S (2013) Stochastic optimal controller design for uncertain nonlinear networked control system via neuro dynamic programming. IEEE Trans Neural Netw Learn Syst 24(3):471–484
    • (2013) IEEE Trans Neural Netw Learn Syst , vol.24 , Issue.3 , pp. 471-484
    • Xu, H.1    Jagannathan, S.2
  • 49
    • 84885835001
    • Near-optimal control for nonzero-sum differential games of continuous-time nonlinear systems using single-network ADP
    • Zhang H, Cui L, Luo Y (2013) Near-optimal control for nonzero-sum differential games of continuous-time nonlinear systems using single-network ADP. IEEE Trans Cybern 43(1):206–216
    • (2013) IEEE Trans Cybern , vol.43 , Issue.1 , pp. 206-216
    • Zhang, H.1    Cui, L.2    Luo, Y.3
  • 50
    • 84892670912
    • Approximate optimal solution of the DTHJB equation for a class of nonlinear affine systems with unknown dead-zone constraints
    • Zhang D, Liu D, Wang D (2014) Approximate optimal solution of the DTHJB equation for a class of nonlinear affine systems with unknown dead-zone constraints. Soft Comput 18(2):349–357
    • (2014) Soft Comput , vol.18 , Issue.2 , pp. 349-357
    • Zhang, D.1    Liu, D.2    Wang, D.3
  • 51
    • 70349253929
    • The RBF neural network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraint
    • Zhang H, Luo Y, Liu D (2009) The RBF neural network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraint. IEEE Trans Neural Netw 20(9):1490–1503
    • (2009) IEEE Trans Neural Netw , vol.20 , Issue.9 , pp. 1490-1503
    • Zhang, H.1    Luo, Y.2    Liu, D.3
  • 52
    • 49049119493
    • A novel infinite-time optimal tracking control scheme for a class of discrete-time nonlinear systems via the greedy HDP iteration algorithm
    • Zhang H, Wei Q, Luo Y (2008) A novel infinite-time optimal tracking control scheme for a class of discrete-time nonlinear systems via the greedy HDP iteration algorithm. IEEE Trans Syst Man Cybern Part B Cybern 38(4):937–942
    • (2008) IEEE Trans Syst Man Cybern Part B Cybern , vol.38 , Issue.4 , pp. 937-942
    • Zhang, H.1    Wei, Q.2    Luo, Y.3
  • 53
    • 78650805234
    • An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games
    • Zhang H, Wei Q, Liu D (2011) An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games. Automatica 47(1):207–214
    • (2011) Automatica , vol.47 , Issue.1 , pp. 207-214
    • Zhang, H.1    Wei, Q.2    Liu, D.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.