Volume , Issue , 2012, Pages

A comparison of learning speed and ability to cope without exploration between DHP and TD(0)

Author keywords

Adaptive Dynamic Programming; DHP; Dual Heuristic Dynamic Programming; Reinforcement Learning

Indexed keywords

ADAPTIVE DYNAMIC PROGRAMMING; CONTINUOUS STATE SPACE; DHP; DIFFERENTIABILITY; DUAL HEURISTIC DYNAMIC PROGRAMMING; LEARNING METHODS; LEARNING SPEED; MODEL FUNCTIONS;

EID: 84865077338     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/IJCNN.2012.6252569     Document Type: Conference Paper
Times cited: 7

References (21)
  • 3
    • 85012688561
    • Princeton, NJ, USA: Princeton University Press
    • R. E. Bellman, Dynamic Programming. Princeton, NJ, USA: Princeton University Press, 1957.
    • (1957) Dynamic Programming
    • Bellman, R.E.1
  • 4
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • R. S. Sutton, "Learning to predict by the methods of temporal differences," Machine Learning, vol. 3, pp. 9-44, 1988.
    • (1988) Machine Learning , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 5
    • 0003636089
    • On-line Q-learning using connectionist systems
    • Cambridge University Engineering Department
    • G. Rummery and M. Niranjan, "On-line Q-learning using connectionist systems," Tech. Rep. CUED/F-INFENG/TR 166, Cambridge University Engineering Department, 1994.
    • (1994) Tech. Rep. CUED/F-INFENG/TR 166
    • Rummery, G.1    Niranjan, M.2
  • 7
    • 0002031779
    • Approximating dynamic programming for real-time control and neural modeling
    • editors White and Sofge, Chapter 13
    • P. J. Werbos, "Approximating dynamic programming for real-time control and neural modeling," in Handbook of Intelligent Control, editors White and Sofge, Chapter 13, pp. 493-525, 1992.
    • (1992) Handbook of Intelligent Control , pp. 493-525
    • Werbos, P.J.1
  • 12
    • 0000255539
    • Fast exact multiplication by the Hessian
    • B. A. Pearlmutter, "Fast exact multiplication by the Hessian," Neural Computation, vol. 6, no. 1, pp. 147-160, 1994.
    • (1994) Neural Computation , vol.6 , Issue.1 , pp. 147-160
    • Pearlmutter, B.A.1
  • 13
    • 0008011457
    • Neural networks, system identification, and control in the chemical process industries
    • Chapter 10
    • P. J. Werbos, "Neural networks, system identification, and control in the chemical process industries," in Handbook of Intelligent Control, editors White and Sofge, Chapter 10, pp. 283-356, 1992.
    • (1992) Handbook of Intelligent Control, Editors White and Sofge , pp. 283-356
    • Werbos, P.J.1
  • 15
    • 33646384929
    • Policy gradient in continuous time
    • R. Munos, "Policy gradient in continuous time," Journal of Machine Learning Research, vol. 7, pp. 413-427, 2006.
    • (2006) Journal of Machine Learning Research , vol.7 , pp. 413-427
    • Munos, R.1
  • 16
    • 0033629916
    • Reinforcement learning in continuous time and space
    • K. Doya, "Reinforcement learning in continuous time and space," Neural Computation, vol. 12, no. 1, pp. 219-245, 2000.
    • (2000) Neural Computation , vol.12 , Issue.1 , pp. 219-245
    • Doya, K.1
  • 19
    • 0037561866
    • Dual heuristic programming excitation neurocontrol for generators in a multimachine power system
    • G. K. Venayagamoorthy and D. C. Wunsch, "Dual heuristic programming excitation neurocontrol for generators in a multimachine power system," IEEE Transactions on Industry Applications, vol. 39, pp. 382-394, 2003.
    • (2003) IEEE Transactions on Industry Applications , vol.39 , pp. 382-394
    • Venayagamoorthy, G.K.1    Wunsch, D.C.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.