



Volume 2, 2009, Pages 396-399

A survey of approximate dynamic programming

Author keywords

Approximate dynamic programming; Dynamic programming; Markov decision processes; Reinforcement learning

Indexed keywords

APPROXIMATE DYNAMIC PROGRAMMING; COMPUTATIONAL REQUIREMENTS; DECISION PROBLEMS; FUNCTION APPROXIMATION; IN-PROCESS; MARKOV DECISION PROCESSES; MATHEMATICAL FORMULATION; MULTI-STAGE; REAL APPLICATIONS; RESEARCH DIRECTIONS; STANDARD METHOD; STATE SPACE

EID: 73649096185     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/IHMSC.2009.222     Document Type: Conference Paper
Times cited: 13

References (37)
  • 1
  • 2
    • 84923005963
    • Approximate Dynamic Programming for High-Dimensional Resource Allocation Problems
    • IEEE Press, John Wiley & Sons, Inc.
    • W. B. Powell and B. Van Roy, "Approximate Dynamic Programming for High-Dimensional Resource Allocation Problems," Handbook of Learning and Approximate Dynamic Programming, IEEE Press, John Wiley & Sons, Inc., 2004, pp. 261-284.
    • (2004) Handbook of Learning and Approximate Dynamic Programming , pp. 261-284
    • Powell, W.B.1    Van Roy, B.2
  • 3
    • 0003950434
    • Stable Adaptive Control Using New Critic Designs
    • arXiv.org: adap-org/9810001
    • P. Werbos, "Stable Adaptive Control Using New Critic Designs," 1998 (arXiv.org: adap-org/9810001).
    • (1998)
    • Werbos, P.1
  • 4
    • 0002557583
    • Advanced forecasting for global crisis warning and models of intelligence
    • P. Werbos, "Advanced forecasting for global crisis warning and models of intelligence," General Systems Yearbook, 1977.
    • (1977) General Systems Yearbook
    • Werbos, P.1
  • 6
    • 0015667648
    • Punish/reward: Learning with a Critic in adaptive threshold systems
    • B. Widrow, N. Gupta and S. Maitra, "Punish/reward: learning with a Critic in adaptive threshold systems," IEEE Trans. SMC, vol. 5, 1973, pp.455-465.
    • (1973) IEEE Trans. SMC , vol.5 , pp. 455-465
    • Widrow, B.1    Gupta, N.2    Maitra, S.3
  • 7
    • 0020970738
    • Neuronlike adaptive elements that can solve difficult learning control problems
    • A. Barto, R. Sutton and C. Anderson, "Neuronlike adaptive elements that can solve difficult learning control problems," IEEE Trans. SMC, vol. 13, 1983, pp.834-846.
    • (1983) IEEE Trans. SMC , vol.13 , pp. 834-846
    • Barto, A.1    Sutton, R.2    Anderson, C.3
  • 8
    • 73649144578
    • Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC
    • D. P. Bertsekas, "Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC," 2005, Report LIDS 2632.
    • (2005) Report LIDS 2632
    • Bertsekas, D.P.1
  • 9
    • 0041345290
    • Efficient Reinforcement Learning Using Recursive Least-Squares Methods
    • Xin Xu, Han-gen He and Dewen Hu, "Efficient Reinforcement Learning Using Recursive Least-Squares Methods," Journal of Artificial Intelligence Research , Vol.16 , 2002, pp.259-292.
    • (2002) Journal of Artificial Intelligence Research , vol.16 , pp. 259-292
    • Xu, X.1    He, H.-G.2    Hu, D.3
  • 10
    • 0000430514
    • The Convergence of TD(λ) for General λ
    • P. D. Dayan, "The Convergence of TD(λ) for General λ," Machine Learning, vol. 8, 1992, pp.341-362.
    • (1992) Machine Learning , vol.8 , pp. 341-362
    • Dayan, P.D.1
  • 11
    • 73649094483
    • An Analysis of Temporal-Difference Learning with Function Approximation
    • John N. Tsitsiklis and Benjamin Van Roy, "An Analysis of Temporal-Difference Learning with Function Approximation" Van Roy's homepages , 1997.
    • (1997) Van Roy's homepages
    • Tsitsiklis, J.N.1    Van Roy, B.2
  • 12
    • 0141704189
    • Accelerating Critic Learning in Approximate Dynamic Programming Via Value Templates and Perceptual Learning
    • T. T. Shannon, R. A. Santiago and G. Lendaris, "Accelerating Critic Learning in Approximate Dynamic Programming Via Value Templates and Perceptual Learning," IEEE 0-7803-7898-9/03, 2003, pp.2922-2927.
    • (2003) IEEE 0-7803-7898-9/03 , pp. 2922-2927
    • Shannon, T.T.1    Santiago, R.A.2    Lendaris, G.3
  • 13
    • 33847661590
    • Adaptive Critic Design Based Neuro-Fuzzy Controller for a Static Compensator in a Multimachine Power System
    • S. Mohagheghi and Ganesh K. Venayagamoorthy, "Adaptive Critic Design Based Neuro-Fuzzy Controller for a Static Compensator in a Multimachine Power System," IEEE Transactions on Power Systems, vol. 21, no. 4, 2006, pp.1744-1755.
    • (2006) IEEE Transactions on Power Systems , vol.21 , Issue.4 , pp. 1744-1755
    • Mohagheghi, S.1    Venayagamoorthy, G.K.2
  • 24
    • 15744363553
    • Reinforcement Learning and Aggregation
    • Ju Jiang, M. Kamel and Lei Chen, "Reinforcement Learning and Aggregation," Proceedings of IEEE International Conference on Systems, Man, and Cybernetics 04, 2004, pp.1303-1308.
  • 26
    • 0004102479
    • A Bradford Book, The MIT Press, Cambridge, Massachusetts, London, England, ISBN 0-262-19398-1
    • R. S. Sutton and A. G. Barto, "Reinforcement Learning, An Introduction." A Bradford Book, The MIT Press, Cambridge, Massachusetts, London, England, ISBN 0-262-19398-1, 1998.
    • (1998) Reinforcement Learning, An Introduction
    • Sutton, R.S.1    Barto, A.G.2
  • 28
    • 34547365679
    • An Extension of Genetic Network Programming with Reinforcement Learning Using Actor-Critic
    • H. Hatakeyama and S. Mabu, "An Extension of Genetic Network Programming with Reinforcement Learning Using Actor-Critic," 2006 IEEE Congress on Evolutionary Computation, 2006.
    • (2006) IEEE Congress on Evolutionary Computation
    • Hatakeyama, H.1    Mabu, S.2
  • 32
    • 33745951445
    • Unified NDP method based on TD(0) learning for both average and discounted Markov decision processes
    • TANG Hao, ZHOU Lei and YUAN Ji-bin, "Unified NDP method based on TD(0) learning for both average and discounted Markov decision processes," Control Theory & Application, vol. 23, no. 2, 2006, pp.292-297.
  • 33
    • 23444449149
    • Performance Potential-based Neuro-dynamic Programming for SMDPs
    • TANG Hao, YUAN Ji-Bin, LU Yang, and CHENG Wen-Juan, "Performance Potential-based Neuro-dynamic Programming for SMDPs," ACTA AUTOMATICA SINICA, vol. 31, no. 4, 2005, pp.642-646.
  • 34
    • 2942718962
    • A Simulation Optimization Algorithm for CTMDPs Based on Randomized Stationary Policies
    • TANG Hao, XI Hong-Sheng and YIN Bo-Qun, "A Simulation Optimization Algorithm for CTMDPs Based on Randomized Stationary Policies," ACTA AUTOMATICA SINICA, vol. 30, No.2, 2004, pp.229-235.
    • (2004) ACTA AUTOMATICA SINICA , vol.30 , Issue.2 , pp. 229-235
    • Tang, H.1    Xi, H.-S.2    Yin, B.-Q.3
  • 35
    • 33747872589
    • Approximate dynamic programming based approach to process control and scheduling
    • J. H. Lee and J. M. Lee, "Approximate dynamic programming based approach to process control and scheduling," Computers and Chemical Engineering, vol. 30, 2006, pp.1603-1618.
    • (2006) Computers and Chemical Engineering , vol.30 , pp. 1603-1618
    • Lee, J.H.1    Lee, J.M.2
  • 36
    • 18444379381
    • Approximate dynamic programming based approaches for input-output data-driven control of nonlinear processes
    • J. M. Lee and J. H. Lee, "Approximate dynamic programming based approaches for input-output data-driven control of nonlinear processes," Automatica, vol. 41, no. 7, 2005, pp.1281-1288.
    • (2005) Automatica , vol.41 , Issue.7 , pp. 1281-1288
    • Lee, J.M.1    Lee, J.H.2
  • 37
    • 27144544987
    • Choice of approximator and design of penalty function for an approximate dynamic programming based control approach
    • J. M. Lee, N. S. Kaisare and J. H. Lee, "Choice of approximator and design of penalty function for an approximate dynamic programming based control approach," Journal of Process Control, vol.16, no. 2, 2006, pp.135-156.
    • (2006) Journal of Process Control , vol.16 , Issue.2 , pp. 135-156
    • Lee, J.M.1    Kaisare, N.S.2    Lee, J.H.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.