Volume 26, Issue 6, 2015, Pages 1323-1334

Error bounds of adaptive dynamic programming algorithms for solving undiscounted optimal control problems

Author keywords

Adaptive critic designs; Adaptive dynamic programming (ADP); Approximate dynamic programming; Neural networks; Neurodynamic programming; Nonlinear systems; Optimal control

Indexed keywords

ADAPTIVE CONTROL SYSTEMS; ALGORITHMS; DISCRETE TIME CONTROL SYSTEMS; ERROR ANALYSIS; ERRORS; ITERATIVE METHODS; NEURAL NETWORKS; NONLINEAR SYSTEMS; OPTIMAL CONTROL SYSTEMS;

EID: 84930194751     PISSN: 2162237X     EISSN: 21622388     Source Type: Journal    
DOI: 10.1109/TNNLS.2015.2402203     Document Type: Article
Times cited: 62

References (62)
  • 2
    • 85012688561
    • Princeton, NJ, USA: Princeton Univ. Press
    • R. E. Bellman, Dynamic Programming. Princeton, NJ, USA: Princeton Univ. Press, 1957.
    • (1957) Dynamic Programming
    • Bellman, R.E.1
  • 3
    • 66449130966
    • Adaptive dynamic programming: An introduction
    • May
    • F.-Y. Wang, H. Zhang, and D. Liu, "Adaptive dynamic programming: An introduction," IEEE Comput. Intell. Mag., vol. 4, no. 2, pp. 39-47, May 2009.
    • (2009) IEEE Comput. Intell. Mag , vol.4 , Issue.2 , pp. 39-47
    • Wang, F.-Y.1    Zhang, H.2    Liu, D.3
  • 4
    • 0002031779
    • Approximate dynamic programming for real-time control and neural modeling
    • D. A. White and D. A. Sofge, Eds. New York, NY, USA: Van Nostrand, ch. 13
    • P. J. Werbos, "Approximate dynamic programming for real-time control and neural modeling," in Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, D. A. White and D. A. Sofge, Eds. New York, NY, USA: Van Nostrand, 1992, ch. 13.
    • (1992) Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches
    • Werbos, P.J.1
  • 8
    • 26844483839
    • A self-learning call admission control scheme for CDMA cellular networks
    • Sep
    • D. Liu, Y. Zhang, and H. Zhang, "A self-learning call admission control scheme for CDMA cellular networks," IEEE Trans. Neural Netw., vol. 16, no. 5, pp. 1219-1228, Sep. 2005.
    • (2005) IEEE Trans. Neural Netw , vol.16 , Issue.5 , pp. 1219-1228
    • Liu, D.1    Zhang, Y.2    Zhang, H.3
  • 9
    • 49049108697
    • Adaptive critic learning techniques for engine torque and air-fuel ratio control
    • Aug
    • D. Liu, H. Javaherian, O. Kovalenko, and T. Huang, "Adaptive critic learning techniques for engine torque and air-fuel ratio control," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 38, no. 4, pp. 988-993, Aug. 2008.
    • (2008) IEEE Trans. Syst., Man, Cybern. B, Cybern , vol.38 , Issue.4 , pp. 988-993
    • Liu, D.1    Javaherian, H.2    Kovalenko, O.3    Huang, T.4
  • 10
    • 84872594962
    • A self-learning scheme for residential energy system control and management
    • Feb
    • T. Huang and D. Liu, "A self-learning scheme for residential energy system control and management," Neural Comput. Appl., vol. 22, no. 2, pp. 259-269, Feb. 2013.
    • (2013) Neural Comput. Appl , vol.22 , Issue.2 , pp. 259-269
    • Huang, T.1    Liu, D.2
  • 11
    • 84902352795
    • Data-driven neuro-optimal temperature control of water-gas shift reaction using stable iterative adaptive dynamic programming
    • Nov
    • Q. Wei and D. Liu, "Data-driven neuro-optimal temperature control of water-gas shift reaction using stable iterative adaptive dynamic programming," IEEE Trans. Ind. Electron., vol. 61, no. 11, pp. 6399-6408, Nov. 2014.
    • (2014) IEEE Trans. Ind. Electron , vol.61 , Issue.11 , pp. 6399-6408
    • Wei, Q.1    Liu, D.2
  • 12
    • 84876066909
    • Neural-network-based zero-sum game for discrete-time nonlinear systems via iterative adaptive dynamic programming algorithm
    • Jun
    • D. Liu, H. Li, and D. Wang, "Neural-network-based zero-sum game for discrete-time nonlinear systems via iterative adaptive dynamic programming algorithm," Neurocomputing, vol. 110, pp. 92-100, Jun. 2013.
    • (2013) Neurocomputing , vol.110 , pp. 92-100
    • Liu, D.1    Li, H.2    Wang, D.3
  • 13
    • 84904398037
    • Integral reinforcement learning for linear continuous-time zero-sum games with completely unknown dynamics
    • Jul
    • H. Li, D. Liu, and D. Wang, "Integral reinforcement learning for linear continuous-time zero-sum games with completely unknown dynamics," IEEE Trans. Autom. Sci. Eng., vol. 11, no. 3, pp. 706-714, Jul. 2014.
    • (2014) IEEE Trans. Autom. Sci. Eng , vol.11 , Issue.3 , pp. 706-714
    • Li, H.1    Liu, D.2    Wang, D.3
  • 14
    • 84904706555
    • Online synchronous approximate optimal learning algorithm for multi-player non-zero-sum games with unknown dynamics
    • Aug
    • D. Liu, H. Li, and D. Wang, "Online synchronous approximate optimal learning algorithm for multi-player non-zero-sum games with unknown dynamics," IEEE Trans. Syst., Man, Cybern., Syst., vol. 44, no. 8, pp. 1015-1027, Aug. 2014.
    • (2014) IEEE Trans. Syst., Man, Cybern., Syst , vol.44 , Issue.8 , pp. 1015-1027
    • Liu, D.1    Li, H.2    Wang, D.3
  • 15
    • 84893640946
    • Decentralized stabilization for a class of continuous-time nonlinear interconnected systems using online learning optimal control approach
    • Feb
    • D. Liu, D. Wang, and H. Li, "Decentralized stabilization for a class of continuous-time nonlinear interconnected systems using online learning optimal control approach," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 2, pp. 418-428, Feb. 2014.
    • (2014) IEEE Trans. Neural Netw. Learn. Syst , vol.25 , Issue.2 , pp. 418-428
    • Liu, D.1    Wang, D.2    Li, H.3
  • 16
    • 84961378056
    • Leader-based optimal coordination control for the consensus problem of multiagent differential games via fuzzy adaptive dynamic programming
    • Feb
    • H. Zhang, J. Zhang, G.-H. Yang, and Y. Luo, "Leader-based optimal coordination control for the consensus problem of multiagent differential games via fuzzy adaptive dynamic programming," IEEE Trans. Fuzzy Syst., vol. 23, no. 1, pp. 152-163, Feb. 2015.
    • (2015) IEEE Trans. Fuzzy Syst , vol.23 , Issue.1 , pp. 152-163
    • Zhang, H.1    Zhang, J.2    Yang, G.-H.3    Luo, Y.4
  • 17
    • 70349116541
    • Reinforcement learning and adaptive dynamic programming for feedback control
    • Jul
    • F. L. Lewis and D. Vrabie, "Reinforcement learning and adaptive dynamic programming for feedback control," IEEE Circuits Syst. Mag., vol. 9, no. 3, pp. 32-50, Jul. 2009.
    • (2009) IEEE Circuits Syst. Mag , vol.9 , Issue.3 , pp. 32-50
    • Lewis, F.L.1    Vrabie, D.2
  • 18
    • 84883537695
    • Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers
    • Dec
    • F. L. Lewis, D. Vrabie, and K. G. Vamvoudakis, "Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers," IEEE Control Syst. Mag., vol. 32, no. 6, pp. 76-105, Dec. 2012.
    • (2012) IEEE Control Syst. Mag , vol.32 , Issue.6 , pp. 76-105
    • Lewis, F.L.1    Vrabie, D.2    Vamvoudakis, K.G.3
  • 20
    • 0037581251
    • Modified policy iteration algorithms for discounted Markov decision problems
    • M. L. Puterman and M. C. Shin, "Modified policy iteration algorithms for discounted Markov decision problems," Manage. Sci., vol. 24, no. 11, pp. 1127-1137, 1978.
    • (1978) Manage. Sci , vol.24 , Issue.11 , pp. 1127-1137
    • Puterman, M.L.1    Shin, M.C.2
  • 21
    • 49049089962
    • Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof
    • Aug
    • A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 38, no. 4, pp. 943-949, Aug. 2008.
    • (2008) IEEE Trans. Syst., Man, Cybern. B, Cybern , vol.38 , Issue.4 , pp. 943-949
    • Al-Tamimi, A.1    Lewis, F.L.2    Abu-Khalaf, M.3
  • 22
    • 68149180889
    • Optimal control of unknown affine nonlinear discrete-time systems using offline-trained neural networks with proof of convergence
    • Jul./Aug
    • T. Dierks, B. T. Thumati, and S. Jagannathan, "Optimal control of unknown affine nonlinear discrete-time systems using offline-trained neural networks with proof of convergence," Neural Netw., vol. 22, nos. 5-6, pp. 851-860, Jul./Aug. 2009.
    • (2009) Neural Netw , vol.22 , Issue.5-6 , pp. 851-860
    • Dierks, T.1    Thumati, B.T.2    Jagannathan, S.3
  • 23
    • 70349253929
    • Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints
    • Sep
    • H. Zhang, Y. Luo, and D. Liu, "Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints," IEEE Trans. Neural Netw., vol. 20, no. 9, pp. 1490-1503, Sep. 2009.
    • (2009) IEEE Trans. Neural Netw , vol.20 , Issue.9 , pp. 1490-1503
    • Zhang, H.1    Luo, Y.2    Liu, D.3
  • 24
    • 84868467610
    • An iterative adaptive dynamic programming algorithm for optimal control of unknown discrete-time nonlinear systems with constrained inputs
    • Jan
    • D. Liu, D. Wang, and X. Yang, "An iterative adaptive dynamic programming algorithm for optimal control of unknown discrete-time nonlinear systems with constrained inputs," Inf. Sci., vol. 220, pp. 331-342, Jan. 2013.
    • (2013) Inf. Sci , vol.220 , pp. 331-342
    • Liu, D.1    Wang, D.2    Yang, X.3
  • 25
    • 78651311269
    • Adaptive dynamic programming for finite-horizon optimal control of discrete-time nonlinear systems with ε-error bound
    • Jan
    • F.-Y. Wang, N. Jin, D. Liu, and Q. Wei, "Adaptive dynamic programming for finite-horizon optimal control of discrete-time nonlinear systems with ε-error bound," IEEE Trans. Neural Netw., vol. 22, no. 1, pp. 24-36, Jan. 2011.
    • (2011) IEEE Trans. Neural Netw , vol.22 , Issue.1 , pp. 24-36
    • Wang, F.-Y.1    Jin, N.2    Liu, D.3    Wei, Q.4
  • 26
    • 84880065287
    • Finite-horizon control-constrained nonlinear optimal control using single network adaptive critics
    • Jan
    • A. Heydari and S. N. Balakrishnan, "Finite-horizon control-constrained nonlinear optimal control using single network adaptive critics," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 1, pp. 145-157, Jan. 2013.
    • (2013) IEEE Trans. Neural Netw. Learn. Syst , vol.24 , Issue.1 , pp. 145-157
    • Heydari, A.1    Balakrishnan, S.N.2
  • 27
    • 49049119493
    • A novel infinite-time optimal tracking control scheme for a class of discrete-time nonlinear systems via the greedy HDP iteration algorithm
    • Aug
    • H. Zhang, Q. Wei, and Y. Luo, "A novel infinite-time optimal tracking control scheme for a class of discrete-time nonlinear systems via the greedy HDP iteration algorithm," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 38, no. 4, pp. 937-942, Aug. 2008.
    • (2008) IEEE Trans. Syst., Man, Cybern. B, Cybern , vol.38 , Issue.4 , pp. 937-942
    • Zhang, H.1    Wei, Q.2    Luo, Y.3
  • 28
    • 82755160758
    • Finite-horizon neuro-optimal tracking control for a class of discrete-time nonlinear systems using adaptive dynamic programming approach
    • Feb
    • D. Wang, D. Liu, and Q. Wei, "Finite-horizon neuro-optimal tracking control for a class of discrete-time nonlinear systems using adaptive dynamic programming approach," Neurocomputing, vol. 78, no. 1, pp. 14-22, Feb. 2012.
    • (2012) Neurocomputing , vol.78 , Issue.1 , pp. 14-22
    • Wang, D.1    Liu, D.2    Wei, Q.3
  • 29
    • 83855165164
    • Optimal tracking control for a class of nonlinear discrete-time systems with time delays based on heuristic dynamic programming
    • Dec
    • H. Zhang, R. Song, Q. Wei, and T. Zhang, "Optimal tracking control for a class of nonlinear discrete-time systems with time delays based on heuristic dynamic programming," IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 1851-1862, Dec. 2011.
    • (2011) IEEE Trans. Neural Netw , vol.22 , Issue.12 , pp. 1851-1862
    • Zhang, H.1    Song, R.2    Wei, Q.3    Zhang, T.4
  • 30
    • 84863467146
    • Neural-network-based optimal control for a class of unknown discrete-time nonlinear systems using globalized dual heuristic programming
    • Jul
    • D. Liu, D. Wang, D. Zhao, Q. Wei, and N. Jin, "Neural-network-based optimal control for a class of unknown discrete-time nonlinear systems using globalized dual heuristic programming," IEEE Trans. Autom. Sci. Eng., vol. 9, no. 3, pp. 628-634, Jul. 2012.
    • (2012) IEEE Trans. Autom. Sci. Eng , vol.9 , Issue.3 , pp. 628-634
    • Liu, D.1    Wang, D.2    Zhao, D.3    Wei, Q.4    Jin, N.5
  • 31
    • 84864489666
    • Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming
    • Aug
    • D. Wang, D. Liu, Q. Wei, D. Zhao, and N. Jin, "Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming," Automatica, vol. 48, no. 8, pp. 1825-1832, Aug. 2012.
    • (2012) Automatica , vol.48 , Issue.8 , pp. 1825-1832
    • Wang, D.1    Liu, D.2    Wei, Q.3    Zhao, D.4    Jin, N.5
  • 34
    • 14844340822
    • Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
    • May
    • M. Abu-Khalaf and F. L. Lewis, "Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach," Automatica, vol. 41, no. 5, pp. 779-791, May 2005.
    • (2005) Automatica , vol.41 , Issue.5 , pp. 779-791
    • Abu-Khalaf, M.1    Lewis, F.L.2
  • 35
    • 84912073419
    • Neural-network-based online HJB solution for optimal robust guaranteed cost control of continuous-time uncertain nonlinear systems
    • Dec
    • D. Liu, D. Wang, F.-Y. Wang, H. Li, and X. Yang, "Neural-network-based online HJB solution for optimal robust guaranteed cost control of continuous-time uncertain nonlinear systems," IEEE Trans. Cybern., vol. 44, no. 12, pp. 2834-2847, Dec. 2014.
    • (2014) IEEE Trans. Cybern , vol.44 , Issue.12 , pp. 2834-2847
    • Liu, D.1    Wang, D.2    Wang, F.-Y.3    Li, H.4    Yang, X.5
  • 36
    • 39549085591
    • Generalized Hamilton-Jacobi-Bellman formulation-based neural network control of affine nonlinear discrete-time systems
    • Jan
    • Z. Chen and S. Jagannathan, "Generalized Hamilton-Jacobi-Bellman formulation-based neural network control of affine nonlinear discrete-time systems," IEEE Trans. Neural Netw., vol. 19, no. 1, pp. 90-106, Jan. 2008.
    • (2008) IEEE Trans. Neural Netw , vol.19 , Issue.1 , pp. 90-106
    • Chen, Z.1    Jagannathan, S.2
  • 37
    • 84897594646
    • Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems
    • Mar
    • D. Liu and Q. Wei, "Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 3, pp. 621-634, Mar. 2014.
    • (2014) IEEE Trans. Neural Netw. Learn. Syst , vol.25 , Issue.3 , pp. 621-634
    • Liu, D.1    Wei, Q.2
  • 38
    • 0042466434
    • On the convergence of optimistic policy iteration
    • Jul
    • J. N. Tsitsiklis, "On the convergence of optimistic policy iteration," J. Mach. Learn. Res., vol. 3, pp. 59-72, Jul. 2002.
    • (2002) J. Mach. Learn. Res , vol.3 , pp. 59-72
    • Tsitsiklis, J.N.1
  • 39
    • 84930218305
    • Weighted sup-norm contractions in dynamic programming: A review and some new applications
    • Cambridge, MA, USA, Tech. Rep. LIDS-P-2884 May
    • D. P. Bertsekas, "Weighted sup-norm contractions in dynamic programming: A review and some new applications," Dept. Elect. Eng. Comput. Sci., Massachusetts Inst. Technol., Cambridge, MA, USA, Tech. Rep. LIDS-P-2884, May 2012.
    • (2012) Dept. Elect. Eng. Comput. Sci., Massachusetts Inst. Technol
    • Bertsekas, D.P.1
  • 40
    • 79960439729
    • Approximate policy iteration: A survey and some new methods
    • D. P. Bertsekas, "Approximate policy iteration: A survey and some new methods," J. Control Theory Appl., vol. 9, no. 3, pp. 310-335, 2011.
    • (2011) J. Control Theory Appl , vol.9 , Issue.3 , pp. 310-335
    • Bertsekas, D.P.1
  • 42
    • 33748418040
    • Performance loss bounds for approximate value iteration with state aggregation
    • B. V. Roy, "Performance loss bounds for approximate value iteration with state aggregation," Math. Oper. Res., vol. 31, no. 2, pp. 234-244, 2006.
    • (2006) Math. Oper. Res , vol.31 , Issue.2 , pp. 234-244
    • Roy, B.V.1
  • 43
    • 29344453913
    • Error bounds for approximate value iteration
    • Pittsburgh, PA, USA, Jul
    • R. Munos, "Error bounds for approximate value iteration," in Proc. Nat. Conf. Artif. Intell., Pittsburgh, PA, USA, Jul. 2005, pp. 1006-1011.
    • (2005) Proc. Nat. Conf. Artif. Intell , pp. 1006-1011
    • Munos, R.1
  • 44
    • 40949107944
    • Performance bounds in L_p-norm for approximate value iteration
    • R. Munos, "Performance bounds in L_p-norm for approximate value iteration," SIAM J. Control Optim., vol. 46, no. 2, pp. 541-561, 2007.
    • (2007) SIAM J. Control Optim , vol.46 , Issue.2 , pp. 541-561
    • Munos, R.1
  • 45
    • 1942516880
    • Error bounds for approximate policy iteration
    • Washington, DC, USA, Aug
    • R. Munos, "Error bounds for approximate policy iteration," in Proc. 20th Int. Conf. Mach. Learn., Washington, DC, USA, Aug. 2003, pp. 560-567.
    • (2003) Proc. 20th Int. Conf. Mach. Learn , pp. 560-567
    • Munos, R.1
  • 47
    • 77955513754
    • Approximate robust policy iteration using multilayer perceptron neural networks for discounted infinite-horizon Markov decision processes with uncertain correlated transition matrices
    • Aug
    • B. Li and J. Si, "Approximate robust policy iteration using multilayer perceptron neural networks for discounted infinite-horizon Markov decision processes with uncertain correlated transition matrices," IEEE Trans. Neural Netw., vol. 21, no. 8, pp. 1270-1280, Aug. 2010.
    • (2010) IEEE Trans. Neural Netw , vol.21 , Issue.8 , pp. 1270-1280
    • Li, B.1    Si, J.2
  • 50
    • 0002526302
    • Construction of suboptimal control sequences
    • R. J. Leake and R.-W. Liu, "Construction of suboptimal control sequences," SIAM J. Control, vol. 5, no. 1, pp. 54-63, 1967.
    • (1967) SIAM J. Control , vol.5 , Issue.1 , pp. 54-63
    • Leake, R.J.1    Liu, R.-W.2
  • 51
    • 33749860519
    • Relaxed dynamic programming in switching systems
    • Sep
    • A. Rantzer, "Relaxed dynamic programming in switching systems," IEE Proc. Control Theory Appl., vol. 153, no. 5, pp. 567-574, Sep. 2006.
    • (2006) IEE Proc. Control Theory Appl , vol.153 , Issue.5 , pp. 567-574
    • Rantzer, A.1
  • 52
    • 33747862706
    • Relaxing dynamic programming
    • Aug
    • B. Lincoln and A. Rantzer, "Relaxing dynamic programming," IEEE Trans. Autom. Control, vol. 51, no. 8, pp. 1249-1260, Aug. 2006.
    • (2006) IEEE Trans. Autom. Control , vol.51 , Issue.8 , pp. 1249-1260
    • Lincoln, B.1    Rantzer, A.2
  • 53
    • 54349120326
    • On the infinite horizon performance of receding horizon controllers
    • Oct
    • L. Grune and A. Rantzer, "On the infinite horizon performance of receding horizon controllers," IEEE Trans. Autom. Control, vol. 53, no. 9, pp. 2100-2111, Oct. 2008.
    • (2008) IEEE Trans. Autom. Control , vol.53 , Issue.9 , pp. 2100-2111
    • Grune, L.1    Rantzer, A.2
  • 54
    • 84881555023
    • Finite-approximation-error based optimal control approach for discrete-time nonlinear systems
    • Apr
    • D. Liu and Q. Wei, "Finite-approximation-error based optimal control approach for discrete-time nonlinear systems," IEEE Trans. Cybern., vol. 43, no. 2, pp. 779-789, Apr. 2013.
    • (2013) IEEE Trans. Cybern , vol.43 , Issue.2 , pp. 779-789
    • Liu, D.1    Wei, Q.2
  • 55
    • 84912122528
    • Finite-approximation-error-based discrete-time iterative adaptive dynamic programming
    • Dec
    • Q. Wei, F.-Y. Wang, D. Liu, and X. Yang, "Finite-approximation-error-based discrete-time iterative adaptive dynamic programming," IEEE Trans. Cybern., vol. 44, no. 12, pp. 2820-2833, Dec. 2014.
    • (2014) IEEE Trans. Cybern , vol.44 , Issue.12 , pp. 2820-2833
    • Wei, Q.1    Wang, F.-Y.2    Liu, D.3    Yang, X.4
  • 57
    • 84870443947
    • Optimal approximation schedules for a class of iterative algorithms, with an application to multigrid value iteration
    • Dec
    • A. Almudevar and E. F. de Arruda, "Optimal approximation schedules for a class of iterative algorithms, with an application to multigrid value iteration," IEEE Trans. Autom. Control, vol. 57, no. 12, pp. 3132-3146, Dec. 2012.
    • (2012) IEEE Trans. Autom. Control , vol.57 , Issue.12 , pp. 3132-3146
    • Almudevar, A.1    De Arruda, E.F.2
  • 58
    • 34547098844
    • Kernel-based least squares policy iteration for reinforcement learning
    • Jul
    • X. Xu, D. Hu, and X. Lu, "Kernel-based least squares policy iteration for reinforcement learning," IEEE Trans. Neural Netw., vol. 18, no. 4, pp. 973-992, Jul. 2007.
    • (2007) IEEE Trans. Neural Netw , vol.18 , Issue.4 , pp. 973-992
    • Xu, X.1    Hu, D.2    Lu, X.3
  • 59
    • 35748957806
    • Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes
    • S. Mahadevan and M. Maggioni, "Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes," J. Mach. Learn. Res., vol. 8, no. 10, pp. 2169-2231, 2007.
    • (2007) J. Mach. Learn. Res , vol.8 , Issue.10 , pp. 2169-2231
    • Mahadevan, S.1    Maggioni, M.2
  • 60
    • 84912071084
    • A clustering-based graph Laplacian framework for value function approximation in reinforcement learning
    • Dec
    • X. Xu, Z. Huang, D. Graves, and W. Pedrycz, "A clustering-based graph Laplacian framework for value function approximation in reinforcement learning," IEEE Trans. Cybern., vol. 44, no. 12, pp. 2613-2625, Dec. 2014.
    • (2014) IEEE Trans. Cybern , vol.44 , Issue.12 , pp. 2613-2625
    • Xu, X.1    Huang, Z.2    Graves, D.3    Pedrycz, W.4
  • 61
    • 84906666833
    • Reinforcement learning with automatic basis construction based on isometric feature mapping
    • Dec
    • Z. Huang, X. Xu, and L. Zuo, "Reinforcement learning with automatic basis construction based on isometric feature mapping," Inf. Sci., vol. 286, pp. 209-227, Dec. 2014.
    • (2014) Inf. Sci , vol.286 , pp. 209-227
    • Huang, Z.1    Xu, X.2    Zuo, L.3
  • 62
    • 84878421441
    • Optimal control for discrete-time affine non-linear systems using general value iteration
    • Dec
    • H. Li and D. Liu, "Optimal control for discrete-time affine non-linear systems using general value iteration," IET Control Theory Appl., vol. 6, no. 18, pp. 2725-2736, Dec. 2012.
    • (2012) IET Control Theory Appl , vol.6 , Issue.18 , pp. 2725-2736
    • Li, H.1    Liu, D.2


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.