



Volume 22, Issue 1-3, 1996, Pages 59-94

Feature-based methods for large scale dynamic programming

Author keywords

Compact representation; Curse of dimensionality; Dynamic programming; Features; Function approximation; Neuro dynamic programming; Reinforcement learning

Indexed keywords

APPROXIMATION THEORY; CONVERGENCE OF NUMERICAL METHODS; DYNAMIC PROGRAMMING; ERRORS; FEATURE EXTRACTION; LARGE SCALE SYSTEMS; LEARNING ALGORITHMS; PROBLEM SOLVING; STOCHASTIC CONTROL SYSTEMS;

EID: 0029752470     PISSN: 08856125     EISSN: None     Source Type: Journal    
DOI: 10.1007/BF00114724     Document Type: Article
Times cited: 422

References (22)
  • 1
    • 0027147212
    • Wave-Net: A Multiresolution, Hierarchical Neural Network with Localized Learning
    • Bakshi, B. R. & Stephanopoulos, G., (1993). "Wave-Net: A Multiresolution, Hierarchical Neural Network with Localized Learning," AIChE Journal, vol. 39, no. 1, pp. 57-81.
    • (1993) AIChE Journal , vol.39 , Issue.1 , pp. 57-81
    • Bakshi, B.R.1    Stephanopoulos, G.2
  • 2
    • 0029210635
    • Real time Learning and Control Using Asynchronous Dynamic Programming
    • Barto, A. G., Bradtke, S. J., & Singh, S. P., (1995). "Real time Learning and Control Using Asynchronous Dynamic Programming," Artificial Intelligence, vol. 72, pp. 81-138.
    • (1995) Artificial Intelligence , vol.72 , pp. 81-138
    • Barto, A.G.1    Bradtke, S.J.2    Singh, S.P.3
  • 3
    • 84968519017
    • Functional Approximation and Dynamic Programming
    • Bellman, R. E. & Dreyfus, S. E., (1959) "Functional Approximation and Dynamic Programming," Math. Tables and Other Aids Comp., Vol. 13, pp. 247-251.
    • (1959) Math. Tables and Other Aids Comp. , vol.13 , pp. 247-251
    • Bellman, R.E.1    Dreyfus, S.E.2
  • 5
    • 0000268954
    • A Counter-Example to Temporal Differences Learning
    • Bertsekas, D. P., (1994) "A Counter-Example to Temporal Differences Learning," Neural Computation, vol. 7, pp. 270-279.
    • (1994) Neural Computation , vol.7 , pp. 270-279
    • Bertsekas, D.P.1
  • 6
    • 0024680419
    • Adaptive Aggregation for Infinite Horizon Dynamic Programming
    • Bertsekas, D. P. & Castañon, D. A., (1989). "Adaptive Aggregation for Infinite Horizon Dynamic Programming," IEEE Transactions on Automatic Control, vol. 34, no. 6, pp. 589-598.
    • (1989) IEEE Transactions on Automatic Control , vol.34 , Issue.6 , pp. 589-598
    • Bertsekas, D.P.1    Castañon, D.A.2
  • 8
    • 0000430514
    • The Convergence of TD(λ) for General λ
    • Dayan, P. D., (1992) "The Convergence of TD(λ) for General λ," Machine Learning, vol. 8, pp. 341-362.
    • (1992) Machine Learning , vol.8 , pp. 341-362
    • Dayan, P.D.1
  • 9
    • 0038595393
    • Stable Function Approximation in Dynamic Programming
    • Carnegie Mellon University
    • Gordon, G. J., (1995). "Stable Function Approximation in Dynamic Programming," Technical Report CMU-CS-9-103, Carnegie Mellon University.
    • (1995) Technical Report CMU-CS-9-103
    • Gordon, G.J.1
  • 10
    • 0000439891
    • On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
    • Jaakola, T., Jordan, M. I., & Singh, S. P., (1994). "On the Convergence of Stochastic Iterative Dynamic Programming Algorithms," Neural Computation, vol. 6, no. 6.
    • (1994) Neural Computation , vol.6 , Issue.6
    • Jaakola, T.1    Jordan, M.I.2    Singh, S.P.3
  • 11
    • 0000624333
    • Reinforcement Algorithms for Partially Observable Markovian Decision Processes
    • J. D. Cowan, G. Tesauro, and D. Touretzky, editors, Morgan Kaufmann
    • Jaakola, T., Singh, S. P., & Jordan, M. I., (1995). "Reinforcement Algorithms for Partially Observable Markovian Decision Processes," in Advances in Neural Information Processing Systems, 7, J. D. Cowan, G. Tesauro, and D. Touretzky, editors, Morgan Kaufmann.
    • (1995) Advances in Neural Information Processing Systems , vol.7
    • Jaakola, T.1    Singh, S.P.2    Jordan, M.I.3
  • 12
    • 0023421864
    • Planning as Search. A Quantitative Approach
    • Korf, R. E., (1987). "Planning as Search. A Quantitative Approach," Artificial Intelligence, vol. 33, pp. 65-88.
    • (1987) Artificial Intelligence , vol.33 , pp. 65-88
    • Korf, R.E.1
  • 13
    • 0012046853
    • LNKnet: Neural Network, Machine Learning, and Statistical Software for Pattern Classification
    • Lippman, R. E., Kukolich, L. & Singer, E., (1993). "LNKnet: Neural Network, Machine Learning, and Statistical Software for Pattern Classification," The Lincoln Laboratory Journal, vol. 6, no. 2, pp. 249-268.
    • (1993) The Lincoln Laboratory Journal , vol.6 , Issue.2
    • Lippman, R.E.1    Kukolich, L.2    Singer, E.3
  • 14
    • 0345184460
    • Computational Advances in Dynamic Programming
    • edited by Puterman, M.L.
    • Morin, T. J., (1987) "Computational Advances in Dynamic Programming," in Dynamic Programming and Its Applications, edited by Puterman, M.L., pp. 53-90.
    • (1987) Dynamic Programming and Its Applications , pp. 53-90
    • Morin, T.J.1
  • 15
    • 0025490985
    • Networks for Approximation and Learning
    • Poggio, T. & Girosi, F., (1990) "Networks for Approximation and Learning," Proceedings of the IEEE, vol. 78, no. 9, pp. 1481-1497.
    • (1990) Proceedings of the IEEE , vol.78 , Issue.9 , pp. 1481-1497
    • Poggio, T.1    Girosi, F.2
  • 18
    • 33847202724
    • Learning to Predict by the Method of Temporal Differences
    • Sutton, R. S., (1988) "Learning to Predict by the Method of Temporal Differences," Machine Learning, vol. 3, pp. 9-44.
    • (1988) Machine Learning , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 19
    • 0001046225
    • Practical Issues in Temporal Difference Learning
    • Tesauro, G., (1992) "Practical Issues in Temporal Difference Learning," Machine Learning, vol. 8, pp. 257-277.
    • (1992) Machine Learning , vol.8 , pp. 257-277
    • Tesauro, G.1
  • 20
    • 0028497630
    • Asynchronous Stochastic Approximation and Q-Learning
    • Tsitsiklis, J. N., (1994) "Asynchronous Stochastic Approximation and Q-Learning," Machine Learning, vol. 16, pp. 185-202.
    • (1994) Machine Learning , vol.16 , pp. 185-202
    • Tsitsiklis, J.N.1
  • 22
    • 0017997986
    • Approximations of Dynamic Programs I
    • Whitt, W., (1978). "Approximations of Dynamic Programs I," Mathematics of Operations Research, vol. 3, pp. 231-243.
    • (1978) Mathematics of Operations Research , vol.3 , pp. 231-243
    • Whitt, W.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.