Volume 40, Issue 3, 2002, Pages 681-698

Learning algorithms for Markov decision processes with average cost

Author keywords

Average cost control; Controlled Markov chains; Dynamic programming; Q learning; Simulation based algorithms; Stochastic approximation

Indexed keywords

COMPUTER SIMULATION; COSTS; DECISION THEORY; DYNAMIC PROGRAMMING; LEARNING ALGORITHMS; OPTIMAL CONTROL SYSTEMS;

EID: 0036287773     PISSN: 03630129     EISSN: None     Source Type: Journal    
DOI: 10.1137/S0363012999361974     Document Type: Article
Times cited: 190

References (29)
  • 20
    • 0029752592
    • Average reward reinforcement learning: Foundations, algorithms and empirical results
    • (1996) Machine Learning , vol.22 , pp. 1-38
    • Mahadevan, S.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.