
Volume 27, Issue 2, 2002, Pages 294-311

Q-learning for risk-sensitive control

Author keywords

Dynamic programming; Markov decision processes; Q learning; Reinforcement learning; Risk sensitive control; Stochastic approximation

Indexed keywords

BOUNDARY CONDITIONS; COMPUTER SIMULATION; CONVERGENCE OF NUMERICAL METHODS; DECISION THEORY; DYNAMIC PROGRAMMING; LEARNING ALGORITHMS; LEARNING SYSTEMS; MATRIX ALGEBRA; ORDINARY DIFFERENTIAL EQUATIONS; RISK ASSESSMENT; THEOREM PROVING;

EID: 0036577013     PISSN: 0364765X     EISSN: None     Source Type: Journal    
DOI: 10.1287/moor.27.2.294.324     Document Type: Article
Times cited : (156)

References (35)
  • 31. Van Roy, B. (1998). Learning and value function approximation in complex decision processes. LIDS-TH 2420, Ph.D. thesis, Lab. for Information and Decision Systems, M.I.T., Cambridge, MA.
  • 32. Van Roy, B. (2000). Neuro-dynamic programming: Overview and recent trends. In A. Shwartz, E. A. Feinberg, eds., Markov Decision Processes. Kluwer Academic Publishers, Boston. Forthcoming.
  • 33. Watkins, C. (1989). Learning from delayed rewards. Ph.D. thesis, Cambridge University, Cambridge, U.K.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.